Lecture 11 – The interrupt mechanism (Digital Systems)

From Spivey's Corner

Interrupts behave (from the programmer's point of view) as subroutine calls that are inserted between the instructions of a program, in response to hardware events. So that these subroutine calls do not disrupt the normal functioning of the program, it's important that the entire state of the CPU is preserved by the call. This means not only the data registers, but also the address stored in the link register, and the state of the flags. For example, if a leaf routine is interrupted and has its return address stored in the link register, then that return address must be preserved for use when the interrupted routine returns. And if an interrupt comes between the two instructions in a typical compare-and-branch sequence:

cmp r0, r1
beq label

then the flags set by the comparison must be preserved by the interrupt so they can be used to determine whether the branch is taken.

On the Cortex-M0, this is done in such a way that the interrupt handler can be an ordinary subroutine, compiled with the same conventions as any other. That means the compiler that translates each subroutine need not know which ones will be installed as interrupt handlers, and also that no assembly language coding is needed to install an interrupt handler. When an interrupt arrives:

  • The processor completes (or, rarely, abandons) the instruction it is executing.
  • Some of the processor state is saved on the stack. The saved state consists of the program counter, the processor status register (containing the flags), the link register, and registers r0, r1, r2, r3 and r12.
  • The link register lr is set to a magic value (not a valid address) that will be recognised later.
  • The program counter pc is loaded with the address of the interrupt handler. Each device that can interrupt has a number (the Cortex-M0 allows up to 32 devices, so with the 16 entries reserved for system exceptions the table has up to 48 slots), and that number is used as an index into a table of interrupt vectors that (on the Nordic chip) is located in ROM at address 0.

The state is now as shown in the middle diagram below, which shows what happens when a subroutine P receives an interrupt with handler H.

Stack states during interrupt entry
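The frame pushed by the hardware can be pictured as a C structure. This is only a sketch: the field order is the one defined by the ARMv6-M architecture (lowest address first), while the names exc_frame and frame_resume_address are invented here for illustration and do not appear in the course software.

```c
#include <stddef.h>
#include <stdint.h>

/* The eight words pushed by the hardware on interrupt entry, lowest
   address first.  On entry to the handler, before its prologue runs,
   the stack pointer holds the address of the r0 field. */
struct exc_frame {
    uint32_t r0, r1, r2, r3;  /* argument and scratch registers */
    uint32_t r12;             /* the remaining caller-saved register */
    uint32_t lr;              /* link register of the interrupted code */
    uint32_t pc;              /* address at which execution will resume */
    uint32_t psr;             /* status register, including the flags */
};

/* Hypothetical helper: find where the interrupted code will resume. */
uint32_t frame_resume_address(const struct exc_frame *frame) {
    return frame->pc;
}
```

These are exactly the registers that an ordinary subroutine is allowed to overwrite, plus the pc and the flags, which is why the handler's own prologue need only deal with r4--r11 and its own lr.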

The interrupt handler must obey the normal calling conventions of the machine: if it calls other subroutines (as it may) or uses registers r4--r7 (or, less likely, r8--r11) then it must save lr with its magic value and these registers in its stack frame. So, working together, the hardware and the procedure's prologue save all of r0--r12, plus the pc and lr, and things are arranged so that when the handler returns, it will use the magic value as its return address.

The interrupt handler is now running in a context that is equivalent to the one it would see if it had been called as an ordinary subroutine. It can make free use of registers r0--r3, and can use registers r4--r7 if it has saved them. It can call other subroutines that obey the same calling conventions, and their stack frames will occupy space below its frame and the exception frame on the stack.

When the handler returns, it will restore the values of r4--r11 to the values they had before the interrupt, then branch to the magic value. This signals the processor, instead of loading this value into the pc, to restore the values that were saved by the interrupt mechanism, and the processor returns to the interrupted subroutine. Global variables may have been changed by the interrupt handler – for example, a character may have been received by the UART and put in a buffer – but all the local state is just as it was before the interrupt handler was invoked.
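To make the magic value concrete, here is a small model of the test the processor applies when a handler branches to its return address. The three constants are the EXC_RETURN values defined for ARMv6-M; the macro names and the function is_exc_return are our own, chosen for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* The EXC_RETURN values defined for the Cortex-M0 (macro names are ours). */
#define EXC_RETURN_HANDLER    0xFFFFFFF1u  /* back to an interrupted handler */
#define EXC_RETURN_THREAD_MSP 0xFFFFFFF9u  /* back to ordinary code, main stack */
#define EXC_RETURN_THREAD_PSP 0xFFFFFFFDu  /* back to ordinary code, process stack */

/* Model of the check made when a handler branches to `target`: a value
   whose top four bits are all ones is never a valid code address, so it
   is treated as a request to unstack the saved state instead. */
bool is_exc_return(uint32_t target) {
    return (target >> 28) == 0xFu;
}
```

In a program with no operating system, everything runs on the main stack, so the value placed in lr when ordinary code is interrupted would be 0xFFFFFFF9.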

In each program, the table of interrupt handlers is put together by the linker, guided by the linker script NRF51822.ld, which specifies they should go at address 0. The code in startup.c makes each vector point to a default handler that flashes the Seven Stars of Death, unless a subroutine with an appropriate name such as uart_handler is defined somewhere in the program.

In the Cortex-M0, the management of interrupts is delegated to a semi-detached functional unit called the Nested Vectored Interrupt Controller (NVIC). We need to know about this, because to use interrupts, three separate units must be set up properly: the device, so that it requests an interrupt when certain events happen; the NVIC, so that it enables and gives an appropriate priority to interrupts from the device; and the CPU itself, so that it responds to interrupt requests. The name of the NVIC gives a clue to some aspects of its operation:

  • It supports nested interrupts – that is, each device is given a configurable priority, and during the handling of an interrupt from a low-priority device, interrupts from higher-priority devices may happen. We will not use this, but will soon move to an operating system where most interrupt handlers are very short and just convert the interrupt into a message: this removes the worry that handling one interrupt will block another one for too long. Note that, in any case, a device cannot have higher priority than itself, so the interrupt handler for a device completes before another interrupt from the device is accepted.
  • This (delaying interrupts until the CPU is ready for them) works because the NVIC keeps track of which devices have pending interrupts, and will send requests to the CPU when it can handle them. There is just one pending interrupt for each device, so if we want (for example) to count interrupts as clock ticks, then we must be sure to service each interrupt before the next one arrives.
  • Interrupts are vectored – that is, each device can have a separate interrupt handler, rather than (say) having one handler for all interrupts that must then work out which device(s) sent the request. Note, however, that a device can generate interrupts for multiple reasons, so the handler must then work out why the interrupt happened, and respond appropriately. For example, the UART can interrupt when it is ready to output a fresh character, but also when it has received a character that it is ready to pass to the CPU.
  • Interrupts can be disabled in several ways: all interrupts can be disabled and enabled in the CPU by using the functions intr_disable() and intr_enable() that are aliases for the instructions cpsid i and cpsie i respectively. We can also disable interrupts from a specific device by using the functions disable_irq(irq) and enable_irq(irq). Interrupts for specific events can also be enabled and disabled by using the device registers for a specific device.
  • There are a couple of system-level interrupts (also called exceptions) that are not associated with any particular device. One of these is the HardFault exception that is triggered when a program attempts an undefined action, such as an unaligned load or store. In our software, the handler for the HardFault exception shows the seven stars of death.[1]
  • Another kind of system event is triggered when the processor executes an svc (supervisor call) instruction. When we introduce an operating system, this will mean that there is a uniform way of entering the operating system, whether it is entered because of a hardware event or a software request. On machines (the Nordic chip isn't one of them) where application programs run with fewer privileges than the operating system, the svc instruction is also the means for crossing the boundary between unprivileged and privileged programs.
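The bookkeeping described in the points above (one pending bit and one enable bit per device, with priorities deciding the order of service) can be modelled in a few lines of C. This is a simplified sketch that runs on any machine, not the real NVIC interface: the names model_request, model_enable_irq, model_disable_irq and model_next_irq are invented here, and the model ignores nesting by assuming the CPU is idle whenever it asks for the next interrupt.

```c
#include <stdint.h>

#define N_IRQ 32                 /* the Cortex-M0 supports up to 32 devices */

static uint32_t pending = 0;     /* one pending bit per device */
static uint32_t enabled = 0;     /* one enable bit per device */
static uint8_t priority[N_IRQ];  /* lower number means higher priority */

void model_enable_irq(int irq)  { enabled |= UINT32_C(1) << irq; }
void model_disable_irq(int irq) { enabled &= ~(UINT32_C(1) << irq); }

/* A device asks for service.  There is only one bit per device, so a
   second request that arrives before the first is serviced is absorbed. */
void model_request(int irq) { pending |= UINT32_C(1) << irq; }

/* Choose the interrupt to deliver next: the enabled, pending device with
   the numerically lowest priority value, ties going to the lowest
   device number.  Returns -1 if there is nothing to deliver. */
int model_next_irq(void) {
    int best = -1;
    for (int irq = 0; irq < N_IRQ; irq++) {
        uint32_t bit = UINT32_C(1) << irq;
        if ((pending & enabled & bit) == 0) continue;
        if (best < 0 || priority[irq] < priority[best]) best = irq;
    }
    if (best >= 0) pending &= ~(UINT32_C(1) << best);  /* now being serviced */
    return best;
}
```

Two requests from the same device before it is serviced yield only one delivery, which is why a handler that counts clock ticks must finish before the next tick arrives. A request from a disabled device is not lost: it stays pending and is delivered once the device is enabled.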

Context

The interrupt mechanism of the Cortex-M0 is unusual in obeying its own calling conventions: that is to say, the actions on interrupt call and return exactly match the conventions assumed by compilers for the machine. This makes it possible for interrupt handlers to be subroutines written in a high-level language and compiled in the ordinary way. It's more normal for machines to save only a minimal amount of state when an interrupt happens: perhaps just the program counter and processor status word are pushed on a stack, and the program counter is loaded with the handler address. When the interrupt handler wants to return, it must use a special instruction rti that restores the program counter and status word from the stack, rather than the usual instruction for returning from a subroutine. If we want to write interrupt handlers in a high-level language, then a little adapter written in assembly language is needed. It saves the rest of the state including the register contents (or at least those registers that are not preserved according to the calling convention), then calls the subroutine; and when the subroutine returns, it uses an rti instruction to return from the interrupt. Of course, it is possible for the machine-language shim that saves and restores the processor state to be generated by a compiler, and on some machines the compiler lets you mark a C function specially, so that it becomes an interrupt handler rather than an ordinary function. The simpler kind of interrupt mechanism has some advantages: simpler hardware, for one, and the possibility of hand-crafting an interrupt handler in assembly language that runs more quickly by saving and restoring only part of the state.

Heart revisited

Heart versions: 0: Delay loops. 1: Timer-based delay. 2: Interrupt driven. 3: Operating system process.

Our first heart program used a timing loop. We can improve it in two ways:

  1. Use one of the chip's timers to generate the delay. We can arrange for the timer to generate an interrupt at regular intervals, then count the interrupts and implement delay by pausing in a loop, waiting for the count to reach a desired value.
  2. Make the display update itself part of the interrupt handler for the timer. This would allow the main program to be dedicated to another task.

The timer hardware is typical of what is provided in microcontrollers: there's a prescaler that we can use to divide the 16MHz system clock by 2^4 = 16, giving a count rate of 1MHz, then a counter that counts up to a predefined value and then generates an interrupt. On the Nordic chip, we can configure a 'shortcut' to reset the counter when the interrupt happens: this is a good idea, because it removes interrupt latency from the timer period. We can set the limit to 1000 for one interrupt each millisecond.

Block diagram for counter/timer
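The arithmetic behind these settings is easy to check. In the sketch below, the prescaler setting is the exponent n in the division by 2^n, so a setting of 4 divides the 16MHz clock by 16; the function names are our own and are not part of the course software.

```c
#include <stdint.h>

#define SYSTEM_CLOCK_HZ 16000000u

/* Rate at which the counter advances once the system clock has been
   divided by 2^prescaler. */
uint32_t timer_tick_hz(unsigned prescaler) {
    return SYSTEM_CLOCK_HZ >> prescaler;
}

/* Interval between interrupts in microseconds, for a counter that is
   reset to zero each time it reaches `limit`. */
uint32_t timer_period_usec(unsigned prescaler, uint32_t limit) {
    return (uint32_t) ((uint64_t) limit * 1000000u / timer_tick_hz(prescaler));
}
```

With a prescaler setting of 4 the counter advances once per microsecond, so a limit of 1000 gives an interrupt every 1000 microseconds.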

This interrupt handler increments the global variable millis every millisecond.

unsigned volatile millis = 0;

void timer1_handler(void) {
    if (TIMER1_COMPARE[0]) {
        millis++;
        TIMER1_COMPARE[0] = 0;
    }
}

Now we can re-implement delay like this.

void delay(unsigned usec) {
    unsigned goal = millis + usec/1000;
    while (millis < goal) {
        pause();
    }
}

Most of the time, the processor will be halted in the pause() routine, and it will wake only 5 times for a 5 millisecond delay. (The 32-bit counter millis will overflow after 2^32 milliseconds or 49.7 days, but Valentine's Day will be over by then.)
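If the program did have to survive the overflow, the usual remedy is to compare elapsed time rather than absolute time. The trouble with computing a goal is that millis + n can wrap round to a small number, so that the test millis < goal fails at once and the delay returns immediately. The following sketch, with a function name of our own choosing, relies on unsigned subtraction being carried out modulo 2^32.

```c
#include <stdbool.h>
#include <stdint.h>

/* True once at least `wait` milliseconds have passed between the counter
   readings `start` and `now`.  The subtraction is done modulo 2^32, so
   the answer is right even if the counter wrapped round in between,
   provided that less than 2^32 milliseconds have really elapsed. */
bool delay_elapsed(uint32_t now, uint32_t start, uint32_t wait) {
    return (uint32_t) (now - start) >= wait;
}
```

A delay routine written this way would record start = millis on entry and then call pause() repeatedly until delay_elapsed(millis, start, usec/1000) becomes true.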

This implements idea number 1, but leaves us with a main program that contains a loop for updating the display.

Idea number 2 is to do all the work in the interrupt handler, so that the main program does nothing:

while (1) pause();

or better still, can get on with some background task like finding primes. To do this, we must arrange that updating the display is a subroutine that can be called periodically. If we accept a static (but still multiplexed) display, this is easy enough. We can make the current row into a global variable,

static int row = 0;

Then we'll adjust the timer code so that it generates one interrupt every 5 msec (by setting the limit register to 5000), and arrange for the interrupt handler to call a function advance(), defined like this:

void advance(void) {
    row++;
    if (row == 3) row = 0;
    GPIO_OUT = heart[row];
}
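Putting the pieces together, the timer handler might look like the sketch below. So that it compiles on an ordinary computer, TIMER1_COMPARE and GPIO_OUT are declared here as plain variables standing in for the device registers, and the contents of heart are placeholder bit patterns rather than the real image data, which depends on how the LEDs are wired.

```c
/* Stand-ins for the device registers, so that the sketch is
   self-contained; in the real program they come from the hardware
   header file. */
static volatile unsigned TIMER1_COMPARE[4];
static volatile unsigned GPIO_OUT;

/* Placeholder patterns, one for each row of the multiplexed display. */
static const unsigned heart[3] = { 0x1u, 0x2u, 0x4u };

static int row = 0;

/* Move on to the next row of the display, wrapping round after the last. */
void advance(void) {
    row++;
    if (row == 3) row = 0;
    GPIO_OUT = heart[row];
}

/* Called (by the hardware, in the real program) on each timer interrupt. */
void timer1_handler(void) {
    if (TIMER1_COMPARE[0]) {
        advance();
        TIMER1_COMPARE[0] = 0;   /* acknowledge the event */
    }
}
```

The handler has the same shape as the one that counted milliseconds: it checks which event caused the interrupt, does a small amount of work, and clears the event flag so that the same event is not handled again.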

It's also possible to generate an animated display with similar techniques (see the Problem Sheet), but this style of programming obviously has its limitations. The big disadvantage remains – there is only one main program, so only one task can have internal control structure, and all the others must be expressed as subroutines that are called at appropriate times, with their entire state stored in variables between one invocation and the next.
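As a small illustration of what that costs, consider a hypothetical task that turns a light on for two ticks and off for three, over and over. As a main program it would be a loop containing two delays; as a subroutine called on every tick, its position in that loop has to be kept explicitly in variables. All the names below are invented for this sketch.

```c
enum { ON_TICKS = 2, OFF_TICKS = 3 };

static int led = 0;        /* stand-in for an output pin: 1 = lit */
static int remaining = 0;  /* ticks left before the next change */

/* Called once per timer tick.  The variables led and remaining carry
   all of the task's state from one call to the next. */
void blink_tick(void) {
    if (remaining == 0) {
        led = !led;
        remaining = led ? ON_TICKS : OFF_TICKS;
    }
    remaining--;
}
```

Even in this tiny case the control structure of the original loop has dissolved into tests on state variables; a task with nested loops or several phases quickly becomes hard to read when written this way.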

Lecture 12


  1. We should mention somewhere that the proper response to an exception such as HardFault in an embedded system is to enter a mode where the system tries, with the minimum of assumptions, to bring things to a safe state.

Glossary

link register. On ARM processors, a register (r14) in which the program counter value is saved by the instructions bl and blx that call a subroutine. The subroutine can return by branching to this address with the instruction bx lr, or can save the value on the stack (with push {..., lr}) and later return by restoring the same value back into the program counter (with pop {..., pc}).

leaf routine. A subroutine that uses only a few registers and calls no others. On the Cortex-M0, such a routine can use only registers r0 to r3. The code for a leaf routine need not establish a stack frame or save its return address in memory, but can leave it where it arrives (in register lr) and return directly (using the instruction bx lr). This is an important optimisation, particularly for programs that contain many small subroutines for the sake of data abstraction.

assembly language. A symbolic representation of the machine code for a program.

program counter. A register that contains the address of the next instruction to be executed. Because of pipelining, on ARM Cortex-M machines, reading the program counter yields a value that is 4 bytes greater than the address of the current instruction.

ROM (Read-Only Memory). A form of storage whose contents are non-volatile (are not lost when the power is off) but cannot be changed under program control. Modern ROM is usually EEPROM – Electrically Erasable Programmable Read Only Memory, and can be changed electrically, and even under control of a program running on the microcontroller, but using special peripheral registers and not the normal store instructions. Flash memory is a modern, super-compact implementation of EEPROM, but for our purposes it does exactly the same job. We will modify the contents of the micro:bit's flash memory by downloading programs, but we will probably not be writing programs that change the contents of the flash memory.

UART (Universal Asynchronous Receiver/Transmitter). A peripheral interface that is able to send and receive characters serially, commonly used in the past for communication between a computer and an attached terminal. It is commonly used in duplex mode, with the transmitter of one device connected to the receiver of the other with one wire, and the receiver of the one connected to the transmitter of the other with a different wire. The asynchronous part of the name refers to the fact that the transmitter and receiver on each wire do not share a common clock, but rely instead on the signalling protocol and precise timing to achieve synchronisation.

linker script. A text, written in a specialised language, that describes the layout in memory of a program. Compilers typically divide their output into four named sections: text for the program code and embedded constants, data for statically allocated data that has a specified initial value other than zero, rodata for initialised data that is constant, and bss for data that is statically allocated but can initially be filled with zeroes. The linker script may add another section for the program's stack. For microcontrollers, a linker script is needed that puts the text and rodata in Flash, and lays out the RAM so that stack and bss are in separate areas, with the data copied into its own part of RAM from an image held in Flash.

NVIC (Nested Vectored Interrupt Controller). An ARM processor component that is able to assign priorities to external interrupts (as opposed to those generated internally by the processor) and control the servicing of interrupts. As the name indicates, it is able to cope with nested interrupts, where servicing of one interrupt is itself interrupted by another with higher priority. Note that, according to ARM conventions, higher priorities are indicated by lower numbers.

calling convention (a near-synonym for ABI). The convention that determines where arguments for a subroutine are to be found, and where the result is returned.

GPIO (General-Purpose Input/Output). A peripheral interface that provides direct access to pins of the microcontroller chip. Pins may be configured as inputs or outputs, and interrupts may be associated with state changes on certain input pins. On the micro:bit, the LEDs and pushbuttons are connected to GPIO pins.