
Lecture 14 – Context switching (Digital Systems)

Copyright © 2024 J. M. Spivey

To implement an operating system like micro:bian, a vital ingredient is a mechanism that switches the processor from running one client process to running another. Such a context switch might happen, for example, when one process sends a message to another. The operating system might decide to suspend the sending process and start to run the receiving process in its place. Or a context switch might happen in response to an interrupt: the processor might switch from running some background process to running the device driver to which it has delivered an INTERRUPT message. This association with interrupts gives a clue to how context switching can be implemented in general.

With the help of the "supervisor call" instruction svc, we can arrange that every entry to the operating system is via an interrupt; the svc instruction forces a special kind of interrupt to occur between it and the next instruction. When an interrupt happens, whether caused by a hardware device or by svc, some of the processor state is saved on the stack; we will extend this with a small fragment of assembly language that saves the remaining state: registers r4 to r7 and (in case anyone uses them) r8 to r11. Just knowing the sp value at this point is enough information to restart the process, first with a little assembly language fragment that restores registers r4 to r11, then by means of the normal interrupt return mechanism that restores the remaining state and runs the process again.

The new ingredient compared with an ordinary interrupt is that between entry to the operating system and exit back to client processes, we will change the stack pointer. The operating system will keep a table (details later) that contains much information about each process, but in particular contains the saved stack pointer of each process that is not running. When a context switch happens, the operating system saves the sp value for the process that is suspending, and retrieves the sp value of the process that is resuming. The interrupt return mechanism then activates the new process.

Implementing this is a bit fiddly – the kind of thing that causes great rejoicing when it actually works – but the task is helped by the fact that the processor has a second stack pointer, so that (after a little configuration) there's one stack pointer psp for use by a running process, and another msp for use by the operating system. That means (a) that the operating system has its own stack, and doesn't need to steal stack space from either process; and (b) that we can use subroutines in the context switch code without worrying about messing up the process stacks.

Context switch in pictures

In detail, the steps are as follows.

1. The operating system is invoked via an svc instruction or ordinary interrupt.

2. The hardware saves some of its state on the process stack as before; but it is configured to switch to using the main stack pointer msp before entering the handler for svc or the interrupt. In what follows, we concentrate on a system call mediated by svc.

Context switch – part 1

3. The handler for the svc interrupt saves the remaining processor state on the stack belonging to the process.

4. The operating system is entered at system_call, using the main stack in place of the process stack.

Context switch – part 2

5. The operating system may have internal subroutines whose stack frames occupy space on the main stack.

Context switch – part 3

6. When the operating system returns, it provides a value for the process stack pointer that may be different from the previous value, so that it is a different process that resumes.

Context switch – part 4

7. The additional saved state of the new process is restored, and the system call handler triggers a return-from-interrupt event.

Context switch – part 5

8. The hardware restores the parts of the state it saved earlier, but with values that belong to the new process.

9. The new process starts to run.

Context switch – part 6

Context switch in code

Now let's tell the same story again, but with fragments of code. The story begins with a system call stub, a function that can be called from the body of a process and implements a kernel call such as yield or send by gathering the arguments and issuing a supervisor call instruction svc, with an integer argument that indicates which system call it is. The macro syscall turns into a single svc instruction.

void NOINLINE yield(void)
{
    syscall(SYS_YIELD);
}

void NOINLINE send(int dest, int type, message *msg)
{
    syscall(SYS_SEND);
}

These routines are marked NOINLINE to disable the compiler optimisation that substitutes the body of a (non-recursive) subroutine for the text of its call. That way, we can be sure that the arguments of send appear in registers r0 to r2 when the svc instruction is executed, and the operating system can then discover them by reaching into the stack of the process, where the hardware has saved those registers.

Here is the code that implements the interrupt handler for the svc interrupt (from mpx-m0.s):

svc_handler:
    isave                   @ Complete saving of state; r0 = stack pointer
    bl system_call          @ Perform system call; r0 = system_call(r0)
    irestore                @ Restore saved state from r0

I've implemented isave and irestore as assembly-language macros whose expansion will be inserted textually wherever they appear; they could be made into subroutines to save code space, but that would be slower, and these would be the only subroutines in the course that did not obey the ARM calling conventions.

The helper macros isave and irestore implement the saving and restoring of that part of the state that is not saved by the hardware on interrupt. In the interests of full disclosure, here is code for isave:

@@@ isave -- save context for system call
    .macro isave
    mrs r0, psp                 @ Get thread stack pointer
    subs r0, #36
    movs r1, r0
    mov r3, lr                  @ Preserve magic value 0xfffffffd
    stm r1!, {r3-r7}            @ Save low regs on thread stack
    mov r4, r8                  @ Copy from high to low
    mov r5, r9
    mov r6, r10
    mov r7, r11
    stm r1!, {r4-r7}            @ Save high regs in thread stack
    .endm                       @ Return new thread sp

In this code, the first instruction fetches the process stack pointer into r0: the mrs instruction (Move to Register from Special) can be used to read various special registers inside the processor, and has a counterpart called msr that we'll use later. Also new here is the stm instruction that (like push) stores multiple registers in consecutive words of memory. Unlike push, this can use an arbitrary register for the address where the values are stored, and it modifies that register (that's what the ! means) by the size of the data saved: 20 bytes for the first stm, which stores five registers, and 16 bytes for the second. Annoyingly, however, unlike push it increments the address rather than decrementing it, and that's the reason for subtracting 36 from r0 at the start, to make room for all nine words, and for using a copy in r1 as the address that stm advances. Also a bit irritating is the fact that stm cannot store from the high registers, so we must laboriously move r8 to r11 into r4 to r7 before saving them. We are really earning our money here.

You can guess the implementation of irestore: it is a bit easier because the ldm instruction that's dual to stm does go in the direction we want, and it ends with msr psp, r0 to set the process stack pointer.
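The effect of the two macros on the process stack can be modelled in ordinary C. The following is only a sketch for experimenting on a desktop machine, not the micro:bian code: the names model_isave, model_irestore and struct regs are invented for the illustration, memory is an array of words, and stack pointers are word indices rather than addresses. The model ignores the order in which the real irestore would have to reload the high and low registers.

```c
#include <assert.h>
#include <stdint.h>

#define MAGIC 0xfffffffdu       /* Value of lr on entry to the handler */

/* Hypothetical model of the registers that software must save */
struct regs {
    uint32_t lr;                /* Holds the magic value in the handler */
    uint32_t r[12];             /* Only r[4] to r[11] matter here */
};

/* model_isave -- store lr and r4 to r11 below psp; return the new sp.
   Stack pointers are modelled as word indices into mem[]. */
unsigned model_isave(uint32_t *mem, unsigned psp, const struct regs *cpu)
{
    unsigned sp = psp - 9;      /* 36 bytes = 9 words */
    unsigned p = sp;            /* Working copy, like r1 in the macro */
    mem[p++] = cpu->lr;         /* First stm: lr (via r3), then r4 to r7 */
    for (int i = 4; i <= 7; i++) mem[p++] = cpu->r[i];
    for (int i = 8; i <= 11; i++) mem[p++] = cpu->r[i];  /* Second stm */
    return sp;
}

/* model_irestore -- reload lr and r4 to r11 from sp; return the new psp */
unsigned model_irestore(const uint32_t *mem, unsigned sp, struct regs *cpu)
{
    unsigned p = sp;
    cpu->lr = mem[p++];
    for (int i = 4; i <= 11; i++) cpu->r[i] = mem[p++];
    return p;                   /* Points at the hardware-saved frame */
}
```

Saving with model_isave and then restoring with model_irestore returns the stack pointer to where it started, which is the property the interrupt return mechanism relies on.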

And here is the layout of the frame that is pushed by hardware and software onto the stack of the suspended process:

--------------------------------------
16  PSR  Status register
15  PC   Program counter
14  LR   Link register
13  R12
12  R3
11  R2           (Saved by hardware)
10  R1
 9  R0   Process argument
--------------------------------------
 8  R11  
 7  R10
 6  R9
 5  R8           (Saved manually)
 4  R7 
 3  R6 
 2  R5
 1  R4
 0  MAGIC
--------------------------------------

It's right to save this information on the process stack, because the register values are specific to the process, and will be needed again precisely when the process is resumed. The address at which the magic value has been saved, the lowest address in the frame, is the one that isave leaves in r0, and it is recorded as the stack pointer of the suspended process.

The function system_call is written in C, and is the entry point of the operating system for system calls written with svc. Its heading is

unsigned *system_call(unsigned *sp),

showing that it is passed as a parameter the process stack pointer of the process that is being suspended, and returns as a result the stack pointer for the process to be resumed. As we'll see later, the system_call function can decode the state of the suspended process to find out what system call (such as yield(), send(), receive()) was requested, and what its parameters were.

Note: Update needed for changed micro:bian API.

Here's an outline of the implementation of system_call.

#define arg(i, t) ((t) psp[R0_SAVE+(i)])

/* system_call -- entry from system call traps */
unsigned *system_call(unsigned *psp)
{
    short *pc = (short *) psp[PC_SAVE]; /* Program counter */
    int op = pc[-1] & 0xff;      /* Syscall number from SVC instruction */

    /* Save sp of the current process */
    os_current->sp = psp;

    switch (op) {
    case SYS_YIELD:
        make_ready(os_current);
        choose_proc();
        break;

    case SYS_SEND:
        mini_send(arg(0, int), arg(1, int), arg(2, message *));
        break;

        ...
    }

    /* Return sp for next process to run */
    return os_current->sp;
}

This implementation finds the program counter value for the suspended process by reaching into the saved state on its stack; later it will find the system call arguments in a similar way. Then it looks at the svc instruction that invoked the operating system (at offset −2 bytes from the PC) to find out what system call this is. The stack pointer for the process just suspended is saved in the process table entry os_current. Then a switch statement selects an action appropriate to the system call: switching to a different process, sending a message, and so on. As part of this action, the function choose_proc() may be used to select a different process as the value of os_current: this happens directly in the case of the yield() system call, and may happen indirectly with send() or receive(), particularly if the current process becomes blocked. The system_call() function returns the SP value of this new process to svc_handler in r0, and it's this SP value that is used to resume the process that is now selected as current.
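The decoding step can be tried out on any machine, because the Thumb encoding of svc n is the 16-bit value 0xdf00 plus the 8-bit operand n. In the sketch below, the names SVC, svc_number and example_decode are invented for the illustration, and the array code stands in for a scrap of program memory.

```c
#include <assert.h>
#include <stdint.h>

/* Thumb encoding of the instruction svc n */
#define SVC(n) ((uint16_t) (0xdf00 | ((n) & 0xff)))

/* svc_number -- recover the operand of the svc instruction that
   immediately precedes the saved return address */
int svc_number(const uint16_t *return_addr)
{
    return return_addr[-1] & 0xff;
}

/* example_decode -- build a scrap of "code" and decode it */
int example_decode(void)
{
    /* Hypothetical code memory: movs r0, #0; svc 7; bx lr */
    uint16_t code[] = { 0x2000, SVC(7), 0x4770 };

    /* The hardware saves the address of the instruction after svc */
    const uint16_t *saved_pc = &code[2];

    return svc_number(saved_pc);
}
```

Because the operand occupies the low byte of a halfword, masking with 0xff gives the same answer whether the instruction is read as a signed or an unsigned 16-bit value.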

It's up to the operating system to decide (as a matter of policy) which process should get to run next. What we're concentrating on for the moment is the mechanism by which that policy is put into effect. If the operating system wants the currently executing process to carry on, it can simply return the same stack pointer it received, and the context switching mechanism will then return to the same process that it suspended when the call was made.
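The separation between mechanism and policy can be seen in a much-reduced model. This is a sketch under stated assumptions, not the micro:bian scheduler: struct proc is cut down to a single field, and the variables model_current and model_next are invented stand-ins for the result of whatever policy decision has been made elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced process descriptor */
struct proc {
    unsigned *sp;               /* Saved stack pointer while not running */
};

struct proc *model_current;     /* Process that is running now */
struct proc *model_next;        /* Process the policy wants next, or NULL */

/* model_system_call -- save the old sp, apply the decision, return new sp */
unsigned *model_system_call(unsigned *psp)
{
    model_current->sp = psp;    /* Record where the old process stopped */

    if (model_next != NULL) {   /* The policy asked for a switch */
        model_current = model_next;
        model_next = NULL;
    }

    return model_current->sp;   /* Equal to psp if nothing changed */
}
```

When no switch is requested, the function hands back exactly the pointer it was given, so the caller resumes the process it suspended; otherwise it hands back the pointer saved earlier for the other process.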

The explanation above captures well enough what happens once a program is running, but any explanation will be unsatisfying if it doesn't reveal what happens when the program is starting up. There are two aspects to this: how each process starts, and how the entire system starts.

The operating system resumes each process by the return-from-interrupt mechanism, and that applies also to the time when the process first starts up. To provide for this, when the process is set up it is given a fake exception frame on its stack. This is done in the function start in microbian.c, which depends on the frame layout shown above. It sets up the fake frame so that

  • r0 contains the integer argument that will be passed to the process.
  • pc contains the address of the function body. The LSB should not be set, or the result is UNPREDICTABLE (ARM manual, page B1-201), though in practice this causes no problems.
  • The value of psr doesn't matter much, but it should have the bit set that indicates Thumb mode. (There's a great sense of relief when you finally get such details right.)
  • The value of lr determines what happens if the process body should ever return. By setting it to the address of exit, we arrange for the process to terminate cleanly in this case.
  • Other registers can have arbitrary values, so it's safe to leave them as zero.

These values, saved on the initial stack for the process, ensure that when the process is first activated by the return-from-interrupt mechanism, it starts to run the process body with the supplied argument.
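The construction of the fake frame might be sketched as follows. This is not the code of start in microbian.c: the function fake_frame is invented for the illustration, addresses are passed as plain 32-bit numbers so that the sketch runs on any machine, and the offset names other than R0_SAVE and PC_SAVE, together with all the numeric values, are assumptions read off the frame layout shown earlier. Stack alignment is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets within the saved frame, following the layout shown
   earlier.  The names and values are assumptions for this sketch. */
enum { MAGIC_SAVE = 0, R4_SAVE = 1, R0_SAVE = 9, LR_SAVE = 14,
       PC_SAVE = 15, PSR_SAVE = 16, FRAME_WORDS = 17 };

#define INIT_PSR 0x01000000u    /* Thumb bit (bit 24) set */
#define MAGIC    0xfffffffdu    /* Return to thread mode, using psp */

/* fake_frame -- build an initial frame below stack_top; return its sp */
uint32_t *fake_frame(uint32_t *stack_top, uint32_t body,
                     uint32_t exit_addr, uint32_t arg)
{
    uint32_t *sp = stack_top - FRAME_WORDS;

    for (int i = 0; i < FRAME_WORDS; i++)
        sp[i] = 0;                  /* Other registers start as zero */

    sp[MAGIC_SAVE] = MAGIC;
    sp[R0_SAVE] = arg;              /* Argument for the process body */
    sp[LR_SAVE] = exit_addr;        /* Where to go if the body returns */
    sp[PC_SAVE] = body & ~1u;       /* Entry point, with the LSB cleared */
    sp[PSR_SAVE] = INIT_PSR;
    return sp;
}
```

The pointer returned is what would be stored as the saved stack pointer of the new process, so that the first activation looks to the context switch code exactly like any later one.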

We also want to know how the whole system starts. The first process to run is a special process IDLE, belonging to the operating system itself, that will later become the process that runs when there is no other process ready to run. It contains an infinite loop with the only wfe instruction in the whole system. The idle process is created by a function __start() that is called very early in the start-up sequence.

/* __start -- start the operating system */
void __start(void)
{
    /* Create idle task as process 0 */
    idle_proc = create_proc("IDLE", IDLE_STACK);
    idle_proc->state = IDLING;
    idle_proc->priority = P_IDLE;

    /* Call the application's setup function */
    init();

    /* The main program morphs into the idle process. */
    os_current = idle_proc;
    setstack(os_current->sp);
    yield();

    idle_task();
}

void idle_task(void)
{
    /* This process only runs again when there's nothing to do. */
    while (1) pause();
}

The function __start() first creates the idle process, then calls the user-supplied function init() to create the other processes in the system. After this, the processor switches (via setstack()) into the mode where there are separate stacks for the operating system and for processes. A call to yield() then lets the operating system choose other, genuine, processes to run, and after that the idle process runs only when there are no other processes ready. When that happens, it calls idle_task(), the body of the idle process, and falls into an infinite loop that repeatedly calls pause() to wait for the next interrupt.

The helper routine setstack() uses a couple of special instructions to do its job: msr allows values to be moved into special registers in the processor like psp. Setting the control register to 2 enables the use of psp as the stack pointer, and the isb instruction is there to ensure that no instructions in the pipeline use the old stack pointer.

setstack:
    msr psp, r0             @ Set up the stack
    movs r0, #2             @ Use psp for stack pointer
    msr control, r0
    isb                     @ Drain the pipeline
    bx lr
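The rule that decides which of the two stack pointers is in use can be written down as a small C model. The sketch assumes the usual Cortex-M behaviour, in which bit 1 of the control register selects psp for ordinary code and interrupt handlers always use msp; the names struct cpu_model, active_sp and model_setstack are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_SPSEL 2u        /* Bit 1 of the control register */

/* Hypothetical model of the stack-related processor state */
struct cpu_model {
    uint32_t msp, psp;          /* The two stack pointers */
    uint32_t control;           /* The control register */
    bool handler_mode;          /* True while an interrupt handler runs */
};

/* active_sp -- the stack pointer that the name sp refers to just now */
uint32_t active_sp(const struct cpu_model *cpu)
{
    if (!cpu->handler_mode && (cpu->control & CONTROL_SPSEL))
        return cpu->psp;
    return cpu->msp;            /* Handlers always use the main stack */
}

/* model_setstack -- the effect of setstack, minus the pipeline barrier */
void model_setstack(struct cpu_model *cpu, uint32_t sp)
{
    cpu->psp = sp;
    cpu->control = CONTROL_SPSEL;
}
```

Before model_setstack is called, everything runs on the main stack; afterwards, ordinary code uses the process stack, while a handler such as the one for svc still finds itself on the main stack.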

Summary: By saving the whole state of a process on its stack, we can implement context switches from a process to the OS kernel and then to a different process. There's always some system-dependent detail to this, and a bit of assembly-language cleverness, but in outline it is always similar.

This lecture covers the mechanism that supports multiple, interleaved processes; we have not yet touched on the separate policy that determines which process should run at each time.

Context

More than any other code shown in this course, the context switch code shown here is architecture-dependent. Every machine capable of supporting multitasking makes it possible to save the machine state in this way, and restore it in order to revive a suspended process. Using the interrupt mechanism uniformly to enter the operating system is a very common approach. Context switch for a simple machine like a microcontroller is itself quite simple.

But for a more complex machine with an MMU, it is a more far-reaching operation. Each process has its own mapping from a virtual address space to physical addresses in the RAM that makes it appear to the process that it alone occupies the memory of the machine, and prevents any process from interfering with the memory occupied by others. When control is transferred from one process to another, or from a process to the operating system, this mapping must also be updated. Simpler machines may require the memory belonging to each process to occupy a contiguous region in the RAM, but more sophisticated machines divide the memory into pages that may be arbitrarily arranged to form the memory belonging to a process, and some of these pages may be stored not in RAM but on disk, to be retrieved by the operating system when the process needs access to them. All this adds up to quite a lot of state that must be saved and restored at a context switch.

A refinement

The above account describes a feasible scheme for implementing interrupts and context switch, but as it is currently implemented, micro:bian does something a little more refined. System calls do indeed cause a full context switch as described, with hardware and software collaborating to save the whole process state. On the other hand, interrupts do only a partial context switch, using the hardware-supported mechanism to save some state and call an interrupt handler subroutine. If that subroutine, which is part of the operating system, decides that a full context switch is needed, then it uses the machine's mechanism for a "pending supervisor call" to request one. When this PendSV feature is activated, a special kind of interrupt is generated, and the operating system provides a handler for this interrupt that saves the complete state and performs the desired context switch.

The hardware optimises things so that this two-stage process for handling an interrupt that generates a message is not much slower overall than the simpler scheme of saving all the state immediately. The chief advantage of this more refined scheme is that it is possible to install a specially-written interrupt handler for a particularly time-critical interrupt and have it get control more quickly than under the simpler scheme. In the case where no message need be sent, the whole process of handling an interrupt can be made much faster in this way.
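The two-stage scheme can be pictured with a toy model in which a flag stands in for the hardware's pending bit. This is only a sketch of the control flow, not of the micro:bian handlers or of the hardware registers involved; the names pendsv_pending, full_switches, model_irq and model_irq_exit are all invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

bool pendsv_pending;            /* Stands in for the hardware's pending bit */
int full_switches;              /* Counts complete save-and-restore cycles */

/* model_irq -- quick handler; asks for a full switch only if needed */
void model_irq(bool message_needed)
{
    /* ... deal with the device using only hardware-saved state ... */
    if (message_needed)
        pendsv_pending = true;
}

/* model_irq_exit -- what happens as the quick handler returns */
void model_irq_exit(void)
{
    if (pendsv_pending) {
        pendsv_pending = false;
        full_switches++;        /* Deferred handler saves everything */
    }
}
```

An interrupt that needs no message costs only the quick handler, and the full save and restore is paid for only when the handler has asked for it.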
