Note: I've just migrated to a different physical server to run Spivey's Corner,
with a new architecture, a new operating system, a new version of PHP, and an updated version of MediaWiki.
Please let me know if anything needs adjustment! – Mike

Lecture 11 – The interrupt mechanism (Digital Systems)

Copyright © 2024 J. M. Spivey
Jump to navigation Jump to search

Saving and restoring context

Interrupts behave (from the programmer's point of view) as subroutine calls that are inserted between the instructions of a program, in response to hardware events. So that these subroutine calls do not disrupt the normal functioning of the program, it's important that the entire state of the CPU is preserved by the call. This means not only the data registers, but also the address stored in the link register, and the state of the flags. For example, if a leaf routine is interrupted and has its return address stored in the link register, then that return must be preserved for use when the interrupted routine returns. And if an interrupt comes between the two instructions in a typical compare-and-branch sequence:

cmp r0, r1
beq label

then the flags set by the comparison must be preserved by the interrupt so they can be used to determine whether the branch is taken.

On the Cortex-M0, this is done in such a way that the interrupt handler can be an ordinary subroutine, compiled with the same conventions as any other. That means the compiler that translates each subroutine need not know which ones will be installed as interrupt handlers, and also that no assembly language coding is needed to install an interrupt handler. When an interrupt arrives:

  • the processor completes (or, rarely, abandons) the instruction it is executing.
  • some of the processor state is saved on the stack. The saved state consists of the program counter, the processor status register (containing the flags), the link register, and registers r0, r1, r2, r3 and r12.
  • The link register lr is set to a magic value (not a valid address) that will be recognised later.
  • The program counter pc is loaded with the address of the interrupt handler. Each device that can interrupt (up to 48 different ones) has a number, and that number is used as an index into a table of interrupt vectors that (on the Nordic chip) is located in ROM at address 0.

The state is now as shown in the middle diagram below, which shows what happens when a subroutine P receives an interrupt with handler H.

Stack states during interrupt entry

The interrupt handler must obey the normal calling conventions of the machine: if it calls other subroutines (as it may) or uses registers r4-r7 (or, less likely, r8--r11) then it must save lr with its magic value and these registers in its stack frame. So, working together, the hardware and the procedure's prologue save all of r0-r12, plus the pc and lr, and things are arranged so that when the handler returns, it will use the magic value as its return address.

The interrupt handler is now running in a context that is equivalent to the one it would see if it had been called as an ordinary subroutine. It can make free use of registers r0--r3, and can use registers r4--r7 if it has saved them. It can call other subroutines that obey the same calling conventions, and their stack frames will occupy space below its frame and the exception frame on the stack.

When the handler returns, it will restore the values of r4--r11 to the values they had before the interrupt, then branch to the magic value. This signals the processor, instead of loading this value into the pc, to restore the values that were saved by the interrupt mecahnism, and the processor returns to the interrupted subroutine. Global variables may have been changed by the interrupt handler – for example, a character may have been received by the UART and put in a buffer – but all the local state is just as it was before the interrupt handler was invoked.

In each program, the table of interrupt handlers is put together by the linker, guided by the linker script NRF51822.ld, which specifies they should go at address 0. The code in startup.c makes each vector point to a default handler that flashes the Seven Stars of Death, unless a subroutine with an appropriate name such as uart_handler is defined somewhere in the program.

In the Cortex-M0, the management of interrupts is delegated to a semi-detached functional unit called the Nested Vectored Interrupt Controller (NVIC). We need to know about this, because to use interrupts, three separate units must be set up properly: the device, so that it requests an interrupt when certain events happen; the NVIC, so that it enables and gives an appropriate priority to interrupts from the device, and the CPU itself, so that it responds to interrupt requests. The name of the NVIC gives a clue to some aspects of its operation:

  • It supports nested interrupts – that is, each device is given a configurable priority, and during the handling of an interrupt from a low-priority device, interrupts from higher-priority devices may happen. We will not use this, but will soon move to an operating system where most interrupt handlers are very short and just convert the interrupt into a message: this removes the worry that handling one interrupt will block another one for too long. Note that, in any case, a device cannot have higher priority than itself, so the interrupt handler for a device completes before another interrupt from the device is accepted.
  • This (delaying interrupts until the CPU is ready for them) works because the NVIC keeps track of which devices have pending interrupts, and will send requests to the CPU when it can handle them. There is just one pending interrupt for each device, so if we want (for example) to count interrupts as clock ticks, then we must be sure to service each interrupt before the next one arrives.
  • Interrupts are vectored – that is, each device can have a separate interrupt handler, rather than (say) having one handler for all interrupts that must then work out which device(s) sent the request. Note, however, that a device can generate interrupts for multiple reasons, so the handler must then work out why the interrupt happened, and respond appropriately. For example, the UART can interrupt when it is ready to output a fresh character, but also when it has received a character that it is ready to pass the the CPU.
  • Interrupts can be disabled in several ways: all interrupts can be disabled and enabled in the CPU by using the functions intr_disable() and intr_enable() that are aliases for the instructions cpsie i and cpsid i. We can also disable interrupts from a specific device by using the functions disable_irq(irq) and enable_irq(irq). Interrupts for specific events can also be enabled and disabled by using the device registers for a specific device.
  • There are a couple of system-level interrupts (also called exceptions) that are not associated with any particular device. One of these is the HardFault exception that is triggered when a program attempts an undefined action, such as an unaligned load or store. In our software, the handler for the HardFault exception shows the seven stars of death.[1]
  • Another kind of system event is triggered when the processor executes an svc (supervisor call) instruction. When we introduce an operating system, this will mean that there is a uniform way of entering the operating system, whether it is entered because of a hardware event or a software request. On machines (the Nordic chip isn't one of them) where application programs run with less privileges than the operating system, the svc instruction is also the means for crossing the boundary between unprivileged and privileged programs.

Context

The interrupt mechanism of the Cortex-M0 is unusual in obeying its own calling conventions: that is to say, the actions on interrupt call and return exactly match the conventions assumed by compilers for the machine. This makes it possible for interrupt handlers to be subroutines written in a high-level language and compiled in the ordinary way. It's more normal for machines to save only a minimal amount of state when an interrupt happens: perhaps just the program counter and processor status word are pushed on a stack, and the program counter is loaded with the handler address. When the interrupt handler wants to return, it must use a special instruction rti that restores the program counter and status word from the stack, rather than the usual instruction for returning from a subroutine. If we want to write interrupt handlers in a high-level language, then a little adapter written in assembly language is needed. It saves the rest of the state including the register contents (or at least those registers that are not preserved according to the calling convention), then calls the subroutine; and when the subroutine returns, it uses an rti instruction to return from the interrupt. Of course, it is possible for the machine-language shim that saves and restores the processor state to be generated by a compiler, and on some machines the compiler lets you mark a C function specially, so that it becomes an interrupt handler rather than an ordinary function. The simpler kind of interrupt mechanism has some advantages: simpler hardware, for one, and the possibility of hand-crafting an interrupt handler in assembly language that runs more quickly by saving and restoring only part of the state.

Heart revisited

Heart versions: 0: Delay loops. 1: Timer-based delay. 2: Interrupt driven. 3: Operating system process.

Our first heart program used a timing loop. We can improve it in two ways:

  1. Use one of the chip's timers to generate the delay. We can arrange for the timer to generate an interrupt at regular intervals, then count the interrupts and implement delay by pausing in a loop, waiting for the count to reach a desired value.
  2. Make the display update itself part of the interrupt handler for the timer. This would allow the main program to be dedicated to another task.

The timer hardware is typical of what is provided in microcontrollers: there's a prescaler that we can use to divide the 16MHz system clock by 24, then a counter that counts up to a predefined value and then generates an interrrupt. On the Nordic chip, we can configure a 'shortcut' to reset the counter when the interrupt happens: this is a good idea, because it removes interrupt latency from the timer period. We can set the limit to 1000 for one interrupt each millisecond.

Block diagram for counter/timer

This interrupt handler increments the global variable millis every millisecond.

unsigned volatile millis = 0;

void timer1_handler(void) {
    if (TIMER1_COMPARE[0]) {
        millis++;
        TIMER1_COMPARE[0] = 0;
    }
}

Now we can re-implement delay like this.

void delay(unsigned usec) {
    unsigned goal = millis + usec/1000;
    while (millis < goal) {
        pause();
    }
}

Most of the time, the processor will be halted in the pause() routine, and it will wake only 5 times for a 5 millisecond delay. (The 32-bit counter millis will overflow after 232 milliseconds or 49.7 days, but Valentine's Day will be over by then.)

This implements idea number 1, but leaves us with a main program that contains a loop for updating the display.

Idea number 2 is to do all the work in the interrupt handler, so that the main program does nothing:

while (1) pause();

or better still, can get on with some background task like finding primes. To do this, we must arrange that updating the display is a subroutine that can be called periodically. If we accept a static (but still multiplexed) display, this is easy enough. We can make the current row into a global variable,

static int row = 0;

Then we'll adjust the timer code so that it generates one interrupt every 5 msec (by setting the limit register to 5000), and arrange for the interrupt handler to call a function advance(), defined like this:

void advance(void) {
    row++;
    if (row == 3) row = 0;
    GPIO_OUT = heart[row];
}

It's also possible to generate an animated display with similar technques (see the Problem Sheet), but this style of programming obviously has its limitations. The big disadvantage remains – there is only one main program, so only one task can have internal control structure, and all the others must be expressed as subroutines that are called at appropriate times, with their entire state stored in variables between one invocation and the next.

Questions

Can an interrupt handler itself be interrupted?

The interrupt controller (NVIC) provides a priority scheme, so that the handler for a low-priority interrupt can be interrupted if a request for a higher-priority interrupt comes in: these 'nested' interrupts are the N in NVIC. But we shall not use this scheme, instead giving all interrupts the same priority. In micro:bian, interrupt handlers will be very short, and just send a message to an attached driver process, so there is no need for nested interrupts. But whether nesting is allowed or not, it's clear that an interrupt handler cannot be interrupted by a second request for the same interrupt, because no interrupt can have a priority that is higher than itself.

How can the processor work out which interrupt is active when interrupts are nested, or one interrupt follows another?

The first answer is that the interrupt handler is called via the appropriate vector, and each interrupt can have a different handler. The linker script and startup code collaborate so that a routine in our program named, e.g., uart_handler will be installed in the correct slot in the vector table, and when it is invoked we can be sure that the UART has requested an interrupt: then we need only determine which reason caused the interrupt among the reasons that are possible for the UART.

If there is no specific interrupt handler for a device, then the startup code fills the vector with the routine defualt_handler. That routine is implemented in micro:bian so that it finds the device driver process (if any) associated with the interrupt and sends it a message, or otherwise shows the Seven Stars of Death. To find out which interrupt has happened, it uses active_irq(), a macro defined in hardware.h, to access the appropriate field in the Interrupt Control and Status Register, part of the System Control Block.

/* active_irq -- find active interrupt: returns -16 to 31 */
#define active_irq()  (GET_FIELD(SCB_ICSR, SCB_ICSR_VECTACTIVE) - 16)

It looks complicated, but just amounts to the fact that there's a location in the address space of the machine that always contains the number of the active interrupt. The hardware at this point numbers internal exceptions from 0 to 15, and external interrupts from 16 up, so it's convenient the subtract 16 to get an interrupt number that starts from 0; the micro:bian interrupt routine then uses this as an index into its table of device driver processes. The constants used in this macro are defined in hardware.h with numeric values gleaned from the hardware manual. It's guaranteed that this field in the ICSR is accurate even if interrupts are nested, and even if one interrupt follows close on the heels of another.


Lecture 12


  1. We should mention somewhere that the proper response to an exception such as HardFault in an embedded system is to enter a mode where the system tries, with the mimimum of assumptions, to bring things to a safe state.