Scheduling – an experiment: Difference between revisions

Revision as of 13:35, 2 January 2023

Problem 4.4 is about a program containing two user processes, plus the serial driver.

#include "microbian.h"
#include "hardware.h"
#include "lib.h"

static volatile int r = 0;

void proc1(int n) {
    for (int i = 0; i < 10; i++)
        printf("r = %d\n", r);
}

void proc2(int n) {
    while (r < 100000) r++;
}

void init(void) {
    serial_init();
    start("Proc1", proc1, 0, STACK);
    start("Proc2", proc2, 0, STACK);
}

Running this program with Process 2 scheduled first gives an easily understood result: the process increments r till it reaches 100,000 before Process 1 gets to run at all, and all the values that are printed are equal to 100,000.

If (as in the program shown) Process 1 gets to run first, the results are more interesting:

Various questions arise: why do the printed values go up in fairly regular jumps? And why are the increments not exactly the same? It's obvious that Process 2 gets to run and increase the value of r, but once it starts, why doesn't it run to completion?

To answer these questions with confidence, I set up an experiment where I instrumented the operating system itself to flash different LEDs as it selected different processes to run, and then I connected an oscilloscope to monitor scheduling decisions as they were taken. A different approach might have been to have the scheduler record its decisions in a memory buffer for printing at the end of the program. It would certainly not have worked to have the operating system print its decisions as it went along, since (as we'll see) the printing mechanism is very much part of the phenomenon we're observing. Although an oscilloscope may be dispensable, it is a satisfyingly phyisical way of visualising what is happening.

An initial scope trace of the program running shows that the run has three phases. The horizontal scale here is 20ms per division.

Overview of a run

In this screenshot, the top trace (yellow) shows the serial driver process, the middle two traces show process 1 (green) that is capturing and printing values of r and process 2 (blue) that is incrementing r. The bottom trace (red) shows activity of the UART interrupt. Each trace is high when idle, and low when the corresponding process is running; the trace for the interrupt has a brief negatve-going pulse when the interrupt is triggered. The three phases are as follows:

Initially, all three processes and the interrupt are all active.
After about 10ms, process 1 has succeeded in taking its 10 snapshots of the value of r, and it exits.
About 75ms from the start, process 2 finishes incrementing r to 100,000, and it also exits.
After that, the serial driver process continues to respond to interrupts by sending characters from its buffer to the UART for a further 30ms or so.

Let's concentrate in on the first phase, which is where the counting happens. From the initial scope trace, it's plain that there are closer to 10 than 100,000 events in this phase. Here's the start of the run, showing four bursts of activity in the program, with a horizontal scale of 500μs per division.

Initial activity

Each trace is low when the corresponding process is active; the interrupt trace is high when nothing is happening, and briefly low each time an interrupt is received. Interrupts arrive at intervals of slightly more than 1ms, and that is right for a serial line running at 9600 baud. As you can see, in the interval between interrupts, process 2 runs constantly; it gets to run about 90% of the time. If it is constantly incrementing r in a loop that takes about 10 cycles per iteration, we expect that to take about a million cycles or 63ms to reach 10,000, and that is consistent with an elapsed time of 75ms to complete the process. It also means that between interrupts, process 2 has time to increase r by about 1,500.

The running of process 2 is suspended when an interrupt arrives. Let's zoom in further on the activity that then results, involving the device driver (yellow) and process 1 (green). The horizontal scale is 50μs per division.

A burst of activity

Here we see that the interrupt causes the connected process, the serial driver, to run. It was blocked at a call of receive, and it now runs, handles the interrupt by starting to transmit the next character, and blocks as receive again. The operating system now chooses another process to run. As it happens, process 1 was ready before the interrupt arrived (why this is so will be explained in a moment), and process 2 has joined the queue behind it – so process 1 is chosen. It captures a value of r and uses printf to format it as a string, which printf then passes to the serial driver all at once using a PUTBUF message. The serial driver then runs again, copies the characters into its own buffer, and sends a REPLY message that makes the process 1 ready to run again. However, process 1 now joins the queue behind process 2, and it is process 2 that gets to run when the serial driver reaches receive again. Process 2 now continues to increment r until the next interrupt, with process 1 sitting ready in the queue, thus establishing the conditions for the same behaviour to occur at the next interrupt.

Before long, process 1 has captured its 10 observations of r and converted them to output text, which the serial driver is now storing in its buffer, which has plenty of space for the entire output of the program. In the next phase, process 2 runs constantly, except for brief spells when the serial driver is responding to an interrupt by sending a character from its buffer to the UART. This scope trace shows the transition from the previous phase, with process 1 capturing its last value, then briefly running again just to tidy up and exit, with a horizontal scale of 500μs per divisions.

Phase 2 of the run

In the last phase, process 2 has finished incrementing r and exited, and the processor is idle between interrupts while the serial buffer drains. This scope trace shows the transition from phase 2 to phase 3, with process 2 exiting midway between one UART interrupt and the next. Horizontal scale of 500μs per division again

Phase 3 of the run

When phase 3 is over, UART becomes idle, the serial driver blocks waiting for a message that will never arrive, processes 1 and 2 have both exited, and the machine goes to sleep, never to wake up until we push reset to start another run.

@@ Line 1: / Line 1: @@
-<i>This page reports an experiment done with {{Phos}}, the predecessor of {{microbian}}; the conventions differ a little (for example, when starting a process, we choose the PID, rather than having @start()@ decide and return it.  More importantly, in {{microbian}} the serial device driver has an improved interface that allows @printf()@ to buffer characters internally, then pass them to the serial driver all at once, reducing the number of context switches.  This makes the results different numerically, and also changes the sequence of events.</i>
 Problem 4.4 is about a program containing two user processes, plus the serial driver.
 <pre>
-#include "phos.h"
+#include "microbian.h"
+#include "hardware.h"
+#include "lib.h"
 static volatile int r = 0;
@@ Line 9: / Line 9: @@
 void proc1(int n) {
      for (int i = 0; i < 10; i++)
-         serial_printf("r = %d\n", r);
+         printf("r = %d\n", r);
 }
 void proc2(int n) {
-     while (r < 1000000) r++;
+     while (r < 100000) r++;
 }
 void init(void) {
      serial_init();
-     start(USER+0, "Proc1", proc1, 0, STACK);
+     start("Proc1", proc1, 0, STACK);
-     start(USER+1, "Proc2", proc2, 0, STACK);
+     start("Proc2", proc2, 0, STACK);
 }
 </pre>
@@ Line 40: / Line 40: @@
 To answer these questions with confidence, I set up an experiment where I instrumented the operating system itself to flash different LEDs as it selected different processes to run, and then I connected an oscilloscope to monitor scheduling decisions as they were taken.  A different approach might have been to have the scheduler record its decisions in a memory buffer for printing at the end of the program.  It would certainly not have worked to have the operating system print its decisions as it went along, since (as we'll see) the printing mechanism is very much part of the phenomenon we're observing.  Although an oscilloscope may be dispensable, it is a satisfyingly phyisical way of visualising what is happening.
-As well as adding the instrumentation code, I also adjusted the program by expanding the buffer of the serial driver process to 256 bytes, more than enough to contain the entire output of the program at once, so that we can be sure that the program does not block waiting for space in the buffer to be free.
 An initial scope trace of the program running shows that the run has three phases.  The horizontal scale here is 20ms per division.
@@ Line 47: / Line 45: @@
 {{Scope2|Experiment-1|Overview of a run}}
-In this screenshot, the top trace (yellow) shows the serial driver process, the middle two traces show process 1 (green) that is capturing and printing values of @r@ and process 2 (blue) that is incrementing @r@.  The bottom trace (red) shows activity of the UART interrupt.  The three phases are as follows:
+In this screenshot, the top trace (yellow) shows the serial driver process, the middle two traces show process 1 (green) that is capturing and printing values of @r@ and process 2 (blue) that is incrementing @r@.  The bottom trace (red) shows activity of the UART interrupt.  Each trace is high when idle, and low when the corresponding process is running; the trace for the interrupt has a brief negatve-going pulse when the interrupt is triggered.  The three phases are as follows:
 * Initially, all three processes and the interrupt are all active.
-* After about 75ms, process 2 has succeeded in increasing @r@ to 100,000, and it terminates.  There is then a brief but intense flurry of activity involving process 1 and the serial driver.
+* After about 10ms, process 1 has succeeded in taking its 10 snapshots of the value of @r@, and it exits.
-* After process 1 has terminated, the serial driver and the interrupt continue to be active for a further 20ms or so.
+* About 75ms from the start, process 2 finishes incrementing @r@ to 100,000, and it also exits.
+* After that, the serial driver process continues to respond to interrupts by sending characters from its buffer to the UART for a further 30ms or so.
-Let's begin by zooming in on the first phase, which is where the counting happens.  From the initial scope trace, it's plain that there are more than 10 and less that 100,000 relevant events.  Here's an interval containing two bursts of activity and the interval between them, with a horizontal scale of 200&mu;s per division.
+Let's concentrate in on the first phase, which is where the counting happens.  From the initial scope trace, it's plain that there are closer to 10 than 100,000 events in this phase.  Here's the start of the run, showing four bursts of activity in the program, with a horizontal scale of 500&mu;s per division.
-{{Scope2|Experiment-2|Interval between bursts}}
+{{Scope2|Experiment-2|Initial activity}}
-Each trace is low when the corresponding process is active; the interrupt trace is high when nothing is happening, and low from the time an interrupt is received until the connected process (in this case, the serial driver process) starts to run.  Interrupts arrive at intervals of slightly more than 1ms, and that is right for a serial line running at 9600 baud.  As you can see, in the interval  between interrupts, process 2 runs constantly; it gets to run about 90% of the time, and so it makes sense that the first phase of the program takes about 75ms.  If the loop to increment @r@ takes 10 cycles per iteration, then the whole process will take about 1,000,000 cycles or 63ms, consistent with the elapsed time.  That also means that, between interrupts, process 2 has time to increase @r@ by about 1,600.
+Each trace is low when the corresponding process is active; the interrupt trace is high when nothing is happening, and briefly low each time an interrupt is received.  Interrupts arrive at intervals of slightly more than 1ms, and that is right for a serial line running at 9600 baud.  As you can see, in the interval  between interrupts, process 2 runs constantly; it gets to run about 90% of the time.  If it is constantly incrementing @r@ in a loop that takes about 10 cycles per iteration, we expect that to take about a million cycles or 63ms to reach 10,000, and that is consistent with an elapsed time of 75ms to complete the process.  It also means that between interrupts, process 2 has time to increase @r@ by about 1,500.
-The running of process 2 is suspended when an interrupt arrives.  Let's zoom in further on the activity that then results, involving the device driver (yellow) and process 1 (green).  The horizontal scale is 10&mu;s per division.
+The running of process 2 is suspended when an interrupt arrives.  Let's zoom in further on the activity that then results, involving the device driver (yellow) and process 1 (green).  The horizontal scale is 50&mu;s per division.
 {{Scope2|Experiment-3|A burst of activity}}
-Here we see that the interrupt causes the connected process, the serial driver, to run.  It was blocked at a call of @receive@, and it now runs, handles the interrupt by starting to transmit the next character, and blocks as @receive@ again.  The operating system now chooses another process to run.  As it happens, process 1 was ready before the interrupt arrived (why this is so will be explained in a moment), and process 2 has joined the queue behind it -- so process 1 is chosen.  It runs very briefly until it reaches a call of @send@ to pass a character to the serial driver.  Since the serial driver is waiting, the message is tranferred immediately, and both processes are then ready to run.  With its higher priority, the serial driver is chosen, and process 1 joins the ordinary queue behind process 2.  The serial driver stores the character in its buffer, then blocks at @receive@.  Notice at this point that process 1 will need to run again before it is ready to send the next character, and the serial driver cannot continue until it does so.  Next, process 2 is chosen to run, and it continues to run until the next interrupt, with process 1 ready and in the queue, thus establishing the conditions for the same behaviour to occur at the next interrupt.
+Here we see that the interrupt causes the connected process, the serial driver, to run.  It was blocked at a call of @receive@, and it now runs, handles the interrupt by starting to transmit the next character, and blocks as @receive@ again.  The operating system now chooses another process to run.  As it happens, process 1 was ready before the interrupt arrived (why this is so will be explained in a moment), and process 2 has joined the queue behind it -- so process 1 is chosen.  It captures a value of @r@ and uses @printf@ to format it as a string, which @printf@ then passes to the serial driver all at once using a @PUTBUF@ message.  The serial driver then runs again, copies the characters into its own buffer, and sends a @REPLY@ message that makes the process 1 ready to run again.  However, process 1 now joins the queue behind process 2, and it is process 2 that gets to run when the serial driver reaches @receive@ again.  Process 2 now continues to increment @r@ until the next interrupt, with process 1 sitting ready in the queue, thus establishing the conditions for the same behaviour to occur at the next interrupt.
-Process 2 increases @r@ by about 1600 between one interrupt and the next, and it takes about 10 interrupts for process 1 to print a line of output and make its next observation of @r@, so it makes sense that @r@ increases by about 16,000 between one observation and the next.  There may be many explanatory factors behind the irregularity in the increments, but one of them is that the printed lines vary in length, and while one line is being printed, process 2 is incrementing @r@ to the value that will be shown in the next line.
+Before long, process 1 has captured its 10 observations of @r@ and converted them to output text, which the serial driver is now storing in its buffer, which has plenty of space for the entire output of the program.  In the next phase, process 2 runs constantly, except for brief spells when the serial driver is responding to an interrupt by sending a character from its buffer to the UART.  This scope trace shows the transition from the previous phase, with process 1 capturing its last value, then briefly running again just to tidy up and exit, with a horizontal scale of 500&mu;s per divisions.
-Though I enlarged the buffer in the serial driver process to ensure it would not become full, the analysis just given shows that I needn't have bothered: process 1 is able to put only one character into the buffer for each character taken out, so the buffer will, during this phase of the program, never contain more than one or two characters.
-In phase 2, process 1 is printing the final few lines (with @r@ = 100,000).  Here is a magnified view of this part of the run, at a horizontal scale of 200&mu;s per division.
 {{Scope2|Experiment-4|Phase 2 of the run}}
-With process 2 no longer soaking up processor time, process 1 is able to send its characters to the serial driver more rapidly: about 50&mu;s per character.  You can see clearly the pause when the process is converting a number to decimal (using multiple slow divisions) before printing it.  Toward the right of the picture, you can see an event where an interrupt arrives while the connected process is already running: the interrupt is saved and delivered as soon as the process is ready for it, and the process runs for about twice the normal time as it deals with the interrupt.
+In the last phase, process 2 has finished incrementing @r@ and exited, and the processor is idle between interrupts while the serial buffer drains.  This scope trace shows the transition from phase 2 to phase 3, with process 2 exiting midway between one UART interrupt and the next.  Horizontal scale of 500&mu;s per division again
-In phase 3, process 1 has exited, and the processor is idle between interrupts, as the serial buffer drains.  Horizontal scale of 200&mu;s per division again
 {{Scope2|Experiment-5|Phase 3 of the run}}
-The time between the operating system noticing an interrupt and its selecting the connected process to run is noticeably shorter here, because the OS doesn't need to put a currently running process into the queue.  This time doesn't reflect the true interrupt latency of the hardware and operating system, because is doesn't include the time take to save context for the interrupt, nor the time taken to restore the context of the connected process.  Measuring such things requires a [[Interrupt latency -- an experiment|different experiment]].
 When phase 3 is over, UART becomes idle, the serial driver blocks waiting for a message that will never arrive, processes 1 and 2 have both exited, and the machine goes to sleep, never to wake up until we push reset to start another run.

Scheduling – an experiment: Difference between revisions

Revision as of 13:35, 2 January 2023

Navigation menu

Search