Scheduling – an experiment: Difference between revisions

Copyright © 2017–2023 J. M. Spivey
Jump to navigation Jump to search
No edit summary
No edit summary
Line 1: Line 1:
Problem 4.4 (or problem 5a.1 this year) is about a program containing two user processes, plus the serial driver.
<i>This page reports an experiment done with {{Phos}}, the predecessor of {{microbian}}; the conventions differ a little (for example, when starting a process, we choose the PID, rather than having @start()@ decide and return it.  More importantly, in {{microbian}} the serial device driver has an improved interface that allows @printf()@ to buffer characters internally, then pass them to the serial driver all at once, reducing the number of context switches.  This makes the results different numerically, and also changes the sequence of events.</i>
 
Problem 4.4 is about a program containing two user processes, plus the serial driver.
<pre>
<pre>
#include "phos.h"
#include "phos.h"
Line 39: Line 41:
To answer these questions with confidence, I set up an experiment where I instrumented the operating system itself to flash different LEDs as it selected different processes to run, and then I connected an oscilloscope to monitor scheduling decisions as they were taken.  A different approach might have been to have the scheduler record its decisions in a memory buffer for printing at the end of the program.  It would certainly not have worked to have the operating system print its decisions as it went along, since (as we'll see) the printing mechanism is very much part of the phenomenon we're observing.  Although an oscilloscope may be dispensable, it is a satisfyingly phyisical way of visualising what is happening.
To answer these questions with confidence, I set up an experiment where I instrumented the operating system itself to flash different LEDs as it selected different processes to run, and then I connected an oscilloscope to monitor scheduling decisions as they were taken.  A different approach might have been to have the scheduler record its decisions in a memory buffer for printing at the end of the program.  It would certainly not have worked to have the operating system print its decisions as it went along, since (as we'll see) the printing mechanism is very much part of the phenomenon we're observing.  Although an oscilloscope may be dispensable, it is a satisfyingly phyisical way of visualising what is happening.


As well as adding the instrumentation code, I also adjusted the program in a couple of other ways.  First, I corrected what is arguably a flaw in the operating system, in that after the transfer of a message from a sending to a receiving process, ''both'' processes should be put on the queue as ready to run, and the operating system's scheduler is then run to pick the process that runs next.  This provides a uniform mechanism by which we can ensure that a process with higher priority will always run when possible; this does not much affect the results of this experiment, but it simpifies a bit the explanation that appears below.  The other change I made was to expand the buffer of the serial driver process to 256 bytes, more than enough to contain the entire output of the program at once, so that we can be sure that the program does not block waiting for space in the buffer to be free.
As well as adding the instrumentation code, I also adjusted the program by expanding the buffer of the serial driver process to 256 bytes, more than enough to contain the entire output of the program at once, so that we can be sure that the program does not block waiting for space in the buffer to be free.


An initial scope trace of the program running shows that the run has three phases.  The horizontal scale here is 20ms per division.
An initial scope trace of the program running shows that the run has three phases.  The horizontal scale here is 20ms per division.
Line 79: Line 81:


When phase 3 is over, UART becomes idle, the serial driver blocks waiting for a message that will never arrive, processes 1 and 2 have both exited, and the machine goes to sleep, never to wake up until we push reset to start another run.
When phase 3 is over, UART becomes idle, the serial driver blocks waiting for a message that will never arrive, processes 1 and 2 have both exited, and the machine goes to sleep, never to wake up until we push reset to start another run.
What happens if we reverse the change I made to the operating system, so that after sending a message, a process continues to run rather than joining the queue to run again?  In that case, the values that are printed increase by about 8,000 at a time instead of 16,000, and we should be able to work out why.  Consider the following sequence of events, beginning as before with process 2 running and process 1 sitting in the ready queue.
* A UART interrupt arrives, and the serial driver starts to run.  I sends a fresh character to the UART, and then blocks at @receive@.
* Process 1 is at the head of the run queue and starts to run.  When it reaches a @send@ to the serial driver, the driver process becomes ready, but process 1 continues to run.  The process blocks again when it wants to send a second message to the serial driver.
* Now the serial driver is ready again: it has received a message from process 1, and it stores the character in its buffer and calls @receive@ again.  When it does so, it finds that process 1 is already waiting to send a message, so it receives the message, making process 1 ready again, and stores a second character in its buffer.
* When the serial driver reaches @receive@ again, it blocks, and process 2 runs.  Process 1 is left on the queue, ready to be scheduled after the next interrupt.
The upshot of this sequence of events is that ''two'' characters are transferred from process 1 to the serial buffer for each interrupt.  Consequently, only half the number of interrupts must come before process 1 takes another observation of the value of @r@, and process 2 has time to increment it only half as much as in the other setup.

Revision as of 20:10, 8 March 2022

This page reports an experiment done with Phōs, the predecessor of micro:bian; the conventions differ a little (for example, when starting a process, we choose the PID, rather than having start() decide and return it. More importantly, in micro:bian the serial device driver has an improved interface that allows printf() to buffer characters internally, then pass them to the serial driver all at once, reducing the number of context switches. This makes the results different numerically, and also changes the sequence of events.

Problem 4.4 is about a program containing two user processes, plus the serial driver.

#include "phos.h"

static volatile int r = 0;

void proc1(int n) {
    for (int i = 0; i < 10; i++)
        serial_printf("r = %d\n", r);
}

void proc2(int n) {
    while (r < 1000000) r++;
}

void init(void) {
    serial_init();
    start(USER+0, "Proc1", proc1, 0, STACK);
    start(USER+1, "Proc2", proc2, 0, STACK);
}

Running this program with Process 2 scheduled first gives an easily understood result: the process increments r till it reaches 100,000 before Process 1 gets to run at all, and all the values that are printed are equal to 100,000.

If (as in the program shown) Process 1 gets to run first, the results are more interesting:

r = 0                                                                           
r = 9718                                                                        
r = 23547                                                                       
r = 38736                                                                       
r = 53920                                                                       
r = 69099                                                                       
r = 84278                                                                       
r = 99457                                                                       
r = 100000                                                                      
r = 100000                                                                      

Various questions arise: why do the printed values go up in fairly regular jumps? And why are the increments not exactly the same? It's obvious that Process 2 gets to run and increase the value of r, but once it starts, why doesn't it run to completion?

To answer these questions with confidence, I set up an experiment where I instrumented the operating system itself to flash different LEDs as it selected different processes to run, and then I connected an oscilloscope to monitor scheduling decisions as they were taken. A different approach might have been to have the scheduler record its decisions in a memory buffer for printing at the end of the program. It would certainly not have worked to have the operating system print its decisions as it went along, since (as we'll see) the printing mechanism is very much part of the phenomenon we're observing. Although an oscilloscope may be dispensable, it is a satisfyingly phyisical way of visualising what is happening.

As well as adding the instrumentation code, I also adjusted the program by expanding the buffer of the serial driver process to 256 bytes, more than enough to contain the entire output of the program at once, so that we can be sure that the program does not block waiting for space in the buffer to be free.

An initial scope trace of the program running shows that the run has three phases. The horizontal scale here is 20ms per division.

Overview of a run
Overview of a run

In this screenshot, the top trace (yellow) shows the serial driver process, the middle two traces show process 1 (green) that is capturing and printing values of r and process 2 (blue) that is incrementing r. The bottom trace (red) shows activity of the UART interrupt. The three phases are as follows:

  • Initially, all three processes and the interrupt are all active.
  • After about 75ms, process 2 has succeeded in increasing r to 100,000, and it terminates. There is then a brief but intense flurry of activity involving process 1 and the serial driver.
  • After process 1 has terminated, the serial driver and the interrupt continue to be active for a further 20ms or so.

Let's begin by zooming in on the first phase, which is where the counting happens. From the initial scope trace, it's plain that there are more than 10 and less that 100,000 relevant events. Here's an interval containing two bursts of activity and the interval between them, with a horizontal scale of 200μs per division.

Interval between bursts
Interval between bursts

Each trace is low when the corresponding process is active; the interrupt trace is high when nothing is happening, and low from the time an interrupt is received until the connected process (in this case, the serial driver process) starts to run. Interrupts arrive at intervals of slightly more than 1ms, and that is right for a serial line running at 9600 baud. As you can see, in the interval between interrupts, process 2 runs constantly; it gets to run about 90% of the time, and so it makes sense that the first phase of the program takes about 75ms. If the loop to increment r takes 10 cycles per iteration, then the whole process will take about 1,000,000 cycles or 63ms, consistent with the elapsed time. That also means that, between interrupts, process 2 has time to increase r by about 1,600.

The running of process 2 is suspended when an interrupt arrives. Let's zoom in further on the activity that then results, involving the device driver (yellow) and process 1 (green). The horizontal scale is 10μs per division.

A burst of activity
A burst of activity

Here we see that the interrupt causes the connected process, the serial driver, to run. It was blocked at a call of receive, and it now runs, handles the interrupt by starting to transmit the next character, and blocks as receive again. The operating system now chooses another process to run. As it happens, process 1 was ready before the interrupt arrived (why this is so will be explained in a moment), and process 2 has joined the queue behind it – so process 1 is chosen. It runs very briefly until it reaches a call of send to pass a character to the serial driver. Since the serial driver is waiting, the message is tranferred immediately, and both processes are then ready to run. With its higher priority, the serial driver is chosen, and process 1 joins the ordinary queue behind process 2. The serial driver stores the character in its buffer, then blocks at receive. Notice at this point that process 1 will need to run again before it is ready to send the next character, and the serial driver cannot continue until it does so. Next, process 2 is chosen to run, and it continues to run until the next interrupt, with process 1 ready and in the queue, thus establishing the conditions for the same behaviour to occur at the next interrupt.

Process 2 increases r by about 1600 between one interrupt and the next, and it takes about 10 interrupts for process 1 to print a line of output and make its next observation of r, so it makes sense that r increases by about 16,000 between one observation and the next. There may be many explanatory factors behind the irregularity in the increments, but one of them is that the printed lines vary in length, and while one line is being printed, process 2 is incrementing r to the value that will be shown in the next line.

Though I enlarged the buffer in the serial driver process to ensure it would not become full, the analysis just given shows that I needn't have bothered: process 1 is able to put only one character into the buffer for each character taken out, so the buffer will, during this phase of the program, never contain more than one or two characters.

In phase 2, process 1 is printing the final few lines (with r = 100,000). Here is a magnified view of this part of the run, at a horizontal scale of 200μs per division.

Phase 2 of the run
Phase 2 of the run

With process 2 no longer soaking up processor time, process 1 is able to send its characters to the serial driver more rapidly: about 50μs per character. You can see clearly the pause when the process is converting a number to decimal (using multiple slow divisions) before printing it. Toward the right of the picture, you can see an event where an interrupt arrives while the connected process is already running: the interrupt is saved and delivered as soon as the process is ready for it, and the process runs for about twice the normal time as it deals with the interrupt.

In phase 3, process 1 has exited, and the processor is idle between interrupts, as the serial buffer drains. Horizontal scale of 200μs per division again

Phase 3 of the run
Phase 3 of the run

The time between the operating system noticing an interrupt and its selecting the connected process to run is noticeably shorter here, because the OS doesn't need to put a currently running process into the queue. This time doesn't reflect the true interrupt latency of the hardware and operating system, because is doesn't include the time take to save context for the interrupt, nor the time taken to restore the context of the connected process. Measuring such things requires a different experiment.

When phase 3 is over, UART becomes idle, the serial driver blocks waiting for a message that will never arrive, processes 1 and 2 have both exited, and the machine goes to sleep, never to wake up until we push reset to start another run.