Frequently asked questions (Digital Systems)

From Spivey's Corner

If you have a question, perhaps it is answered here – or maybe you can find help in the growing glossary. Feel free to add headwords to the glossary so Mike can fill in the definitions, or to add questions below.

Why is the stack space of a Phōs process filled with the value 3735928559 when the process starts?

This helps with finding out how much stack space the process is actually using: when Phōs prints a dump of the process table, it shows how much of the stack space of the process has been changed from this initial value, and we can assume that this is the total stack space that has been used, unless by some slim chance the last few words of the stack have been set to this magic value by the process itself. As to the choice of value, it appears in the Phōs source code in hexadecimal notation, which makes it seem more significant.
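The idea can be sketched in a few lines of C. This is only an illustration, not the Phōs source: the names fill_stack and stack_used are invented here, and the sketch assumes a stack that grows downwards, so that untouched words are found at the low-address end. The magic value 3735928559 is written in hexadecimal, as 0xdeadbeef.

```c
#include <stdint.h>
#include <stddef.h>

#define STACK_FILL 0xdeadbeefu   /* = 3735928559 in decimal */

/* fill_stack -- set every word of a new stack to the magic value
   (hypothetical helper, for illustration only) */
void fill_stack(uint32_t *stack, size_t nwords) {
    for (size_t i = 0; i < nwords; i++)
        stack[i] = STACK_FILL;
}

/* stack_used -- estimate how many words have been used.
   Assumes a descending stack: usage starts at the high-address end,
   so the untouched words form a run at the low-address end. */
size_t stack_used(const uint32_t *stack, size_t nwords) {
    size_t untouched = 0;
    while (untouched < nwords && stack[untouched] == STACK_FILL)
        untouched++;
    return nwords - untouched;
}
```

The estimate is too low only if the process happens to have stored the magic value itself in the deepest words it used.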

Which is faster as a way of writing an infinite loop: while (1) ... or for (;;) ...?

This is one of many things connected with minor efficiency that really don't matter. If I were writing a compiler, I would ensure that both of these idioms resulted in exactly the same code, with the loop body followed by an unconditional branch from the bottom back to the top. If by accident gcc generates slightly less tidy code for one form, then we are unlikely to notice the difference. (Similar remarks apply to the idioms n++ and ++n when written as statements.)

In the UART (Universal Asynchronous Receiver/Transmitter) driver process, code for input and code for output are intertwined. Wouldn't it be neater to untangle them and have separate driver processes for the two sides of the UART?

In some ways, yes. But the vital deciding factor in favour of a single driver is that the two sides of the UART share a single interrupt request, so interrupts for both sending and receiving turn into messages to a single process. We could use a more elaborate scheme where a little router process passed on interrupt messages to one or other of two driver processes, but we would be adding as much complexity as we could hope to remove. (A minor additional point is that echoing of input characters is easy to do if input and output share a driver process.)

In Phōs, what happens if an interrupt is triggered when the attached process is not waiting to receive a message?

The process table entry for each process has space to record one pending interrupt message, so if an interrupt arrives and the process is not waiting for it, the message can be delivered later. The default interrupt handler, which converts interrupts to messages, also disables the interrupt that invoked it, because the interrupts on the micro:bit continue to be triggered if the cause is not removed, and removing the cause (such as an idle UART) requires hardware-specific actions from the driver process. So a second interrupt cannot happen until the driver process has run to re-enable the interrupt. If this behaviour is inadequate to the situation, it's possible to replace the default handler with another one for a specific device.
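Here is a much simplified model of that behaviour, written as ordinary C so that it can be tried on a PC. It is not the Phōs implementation: the structure fields and the function names default_handler and receive_interrupt are invented for the illustration, and the irq_enabled flag merely stands in for the hardware interrupt enable.

```c
#include <stdbool.h>

/* Simplified model of one process-table entry (hypothetical fields) */
struct proc {
    bool receiving;     /* process is blocked waiting for a message */
    bool pending;       /* one interrupt message saved for later */
    bool irq_enabled;   /* stands in for the hardware interrupt enable */
    int delivered;      /* count of interrupt messages delivered */
};

/* default_handler -- model of converting an interrupt to a message */
void default_handler(struct proc *p) {
    p->irq_enabled = false;         /* cause not yet removed: disable it */
    if (p->receiving) {
        p->receiving = false;       /* wake the driver with the message */
        p->delivered++;
    } else {
        p->pending = true;          /* remember the message for later */
    }
}

/* receive_interrupt -- the driver asks for the next interrupt message;
   returns true if one was already pending, false if it must wait */
bool receive_interrupt(struct proc *p) {
    if (p->pending) {
        p->pending = false;
        p->delivered++;
        return true;
    }
    p->receiving = true;
    return false;
}
```

In this model, the driver would set irq_enabled to true again once it had dealt with the device, and only then could a second interrupt arrive.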

You keep saying the micro:bit has 16kB (kilobytes) of RAM. That means 16,000 bytes: isn't it true that there is actually 16KiB (kibibytes) of RAM, or 16,384 bytes?

Yes, that's right. You can collect your refund on the way out. Please note as you go that it's correct to put a thin space between a number and a following unit, as in 16 KiB, and that a thin space should also be used in place of a comma to separate thousands, as in 16 000 bytes, to avoid confusion with the use of a comma as a decimal marker in some countries. Glad we sorted that out.

You have not explained the implementation of printf. How does it work?

The implementation of printf in any of our programs is split into two parts: the part that is independent of the output device, and renders the data as a sequence of characters, and the device-dependent part that takes each character and transmits it on the output device. The C code is made slightly complicated by two facts: printf takes a variable number of arguments, and the device-independent part is a higher-order function that takes the device-dependent part as a parameter.

Let's begin with the code that puts the two parts together:

void serial_printf(char *fmt, ...) {
    va_list va;
    va_start(va, fmt);
    do_print(serial_putc, fmt, va);
    va_end(va);
}

This function takes a variable number of arguments, so we can write a call such as

serial_printf("prime(%d) = %d\n", n, p);

where the one fixed argument – the format string – indicates that there are two additional arguments to come, both integers. The code for serial_printf gets a kind of pointer va to the variable part of the argument list, and passes this pointer, together with the device-dependent output function serial_putc and the format string fmt, to the library routine do_print. Variable argument lists are implemented in different ways on different machines. On the Cortex-M0, the compiler will store the arguments from r0–r3 into the stack frame together with the fifth and later arguments (if any), then use a pointer to the relevant portion of the stack frame for va. What happens on a particular machine is all hidden in the functions va_start and va_end.

I'll simplify the code for do_print a bit: the real version also supports the function sprintf that formats data like printf but saves the resulting string in a character array instead of printing it. Here's the function heading:

void do_print(void (*putch)(char), const char *fmt, va_list va) {

The C syntax is a bit obscure here, but it indicates that do_print takes three arguments:

  • A (pointer to a) function putch that accepts a character as argument and returns nothing.
  • A string fmt, represented as a character pointer.
  • The variable argument list va from do_print's caller.

The body of do_print uses these to do its job.

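     char nbuf[NMAX];

     for (const char *p = fmt; *p != 0; p++) {
          if (*p == '%' && *(p+1) != '\0') {
               switch (*++p) {
               case 'c':
                    putch(va_arg(va, int));
                    break;
               case 'd':
                    do_string(putch, itoa(va_arg(va, int), nbuf));
                    break;
               case 's':
                    do_string(putch, va_arg(va, char *));
                    break;
               default:
                    /* Assumed: other codes, including %%, print the character itself */
                    putch(*p);
               }
          } else {
               /* Assumed: ordinary characters are output unchanged */
               putch(*p);
          }
     }
}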

As you can see, do_print iterates over the format string, using a pointer p to point to the character of that string that it is currently working on. I've shown the implementation of three format codes here: %c for printing a character, %d for printing a decimal integer, and %s for printing a string. Note that you can write %% to print a literal percent sign. The formatting loop works its way through the variable argument list by using the function va_arg (more likely, a macro), and outputs each character by calling the argument function putch. A helper function do_string is used to call putch on each character of a string, such as the one returned by the function itoa (the name is traditional) that converts an integer into a string.

We carefully avoid global variables so that printf can be called from multiple processes simultaneously – it is reentrant, to use the jargon. That's the reason for the array nbuf that is passed to itoa. None of this will be on the exam.
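To experiment with the scheme on a PC, the fragments above can be assembled into something like the following sketch. It is a reconstruction under assumptions rather than the course library: buf_putc is an invented stand-in for serial_putc that appends to a character array (and so, unlike the real thing, does use a global buffer), my_itoa is one plausible way of writing itoa, and the default case is a guess at how %% is handled.

```c
#include <stdarg.h>

#define NMAX 16

static char out[128];      /* stands in for the output device */
static int outlen = 0;

/* buf_putc -- hypothetical device-dependent part: append to out */
static void buf_putc(char ch) {
    if (outlen < (int) sizeof(out) - 1) {
        out[outlen++] = ch;
        out[outlen] = '\0';
    }
}

/* my_itoa -- convert v to decimal, building the string backwards
   from the end of the caller's buffer */
static char *my_itoa(int v, char *nbuf) {
    char *p = &nbuf[NMAX];
    unsigned u = (v < 0 ? 0u - (unsigned) v : (unsigned) v);
    *--p = '\0';
    do {
        *--p = (char) ('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (v < 0) *--p = '-';
    return p;
}

/* do_string -- call putch on each character of s */
static void do_string(void (*putch)(char), const char *s) {
    for (; *s != '\0'; s++) putch(*s);
}

/* do_print -- device-independent formatting loop */
static void do_print(void (*putch)(char), const char *fmt, va_list va) {
    char nbuf[NMAX];
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p == '%' && *(p+1) != '\0') {
            switch (*++p) {
            case 'c': putch((char) va_arg(va, int)); break;
            case 'd': do_string(putch, my_itoa(va_arg(va, int), nbuf)); break;
            case 's': do_string(putch, va_arg(va, char *)); break;
            default:  putch(*p); break;     /* assumed: covers %% */
            }
        } else {
            putch(*p);
        }
    }
}

/* buf_printf -- glue the two parts together, as serial_printf does */
void buf_printf(const char *fmt, ...) {
    va_list va;
    va_start(va, fmt);
    do_print(buf_putc, fmt, va);
    va_end(va);
}
```

A call such as buf_printf("prime(%d) = %d\n", n, p) then leaves the formatted text in the array out, where it can be inspected.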

What registers can a subroutine use?

It can freely use registers r0 to r3, but if it calls any other subroutines, it will need to use some of them to pass arguments to those subroutines, and they can destroy the values in all four registers, returning any result in r0.

In addition, a subroutine can use registers r4 to r7, provided it leaves them with the same values on exit that they had on entry. An easy way to ensure that is to push them on the stack with a push instruction at the routine start (together with the register lr), and pop them off again with a pop instruction at the end (together with the pc).

What if a subroutine has more than 4 arguments?

In that case, arguments beyond the first four are passed on the stack. It's the caller's responsibility to put them there, following a layout that is fixed by the machine's calling convention, and then the subroutine itself can access them using load instructions. I don't think we'll need to write any subroutines with so many arguments (not in assembly language, anyway).

What do I do about labels if I have a big assembly language program with lots of functions and several loops? Can I re-use labels like loop and done in each function?

No, I'm afraid you can't – you can't have two labels with the same name in a single file of code. You can either prefix each label with the name of the enclosing function, as in foo_loop or baz_done, or you can make use of a system of numeric labels that do allow multiple definitions. There can be several labels written 2: in the file, and you refer to them using the notations 2b or 2f, meaning the next occurrence of label 2 going backwards (2b) or forwards (2f) in the file. That saves the effort of inventing names.

Compilers solve the problem in a different way, just inventing their own labels with names like .L392 and never reusing them. That's OK when nobody is expected to read the resulting assembly language.

When I try running my assembly language program with qemu-arm, I get this message: what does it mean?

qemu-arm: ... tb_lock: Assertion `!have_tb_lock' failed.

That's a symptom of the program running haywire, possibly because it has branched to a random location and started executing code there. A probable cause of that is returning from a subroutine with the stack in a bad state, so that the value loaded into the pc is not the correct return address.

Why are the C compiler and other tools called by names like arm-none-eabi-gcc and arm-none-eabi-objdump?

These long names are chosen to distinguish these cross-compiling tools from native tools like gcc, which on a Linux PC is a compiler that translates C into code for an Intel processor. The lengthy names make explicit that the compiler generates ARM code (arm) for a machine running no operating system (none), and uses a standard set of subroutine calling conventions called the Extended Application Binary Interface (eabi). This explicitness is needed, because a Linux machine might also have installed compilers for the Raspberry Pi (gcc-arm-linux-gnueabihf) and maybe AVR microcontrollers (gcc-avr) and the MIPS (mipsel-linux-gnu-gcc). As you can see, the naming conventions aren't quite consistent between GCC cross-compiler projects.

Why write integer constants in hexadecimal for low-level programming?

We often want to work out the binary bits that make up a constant, and hexadecimal notation makes it easy to do so. Each hexadecimal digit corresponds to four bits of the binary representation, and it's not too hard to memorise the bit pattern corresponding to each digit:

0  0000          8  1000
1  0001          9  1001
2  0010          a  1010
3  0011          b  1011
4  0100          c  1100
5  0101          d  1101
6  0110          e  1110
7  0111          f  1111

To convert a larger hexadecimal constant to binary, just concatenate the four-bit groups corresponding to each digit: for example,

0xcafe1812 = 1100 1010 1111 1110 0001 1000 0001 0010.
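The same digit-by-digit conversion can be written as a small C function, which may be handy for checking a conversion done by hand. The function name to_binary is made up for this example; it treats the word as eight hexadecimal digits and writes out each as a group of four bits.

```c
#include <stdint.h>

/* to_binary -- write the 32 bits of x into buf as eight groups of
   four bits separated by spaces; buf needs room for 40 characters */
void to_binary(uint32_t x, char *buf) {
    int k = 0;
    for (int digit = 7; digit >= 0; digit--) {
        /* Extract one hexadecimal digit, most significant first */
        unsigned nibble = (x >> (4 * digit)) & 0xf;
        for (int bit = 3; bit >= 0; bit--)
            buf[k++] = ((nibble >> bit) & 1) ? '1' : '0';
        if (digit > 0) buf[k++] = ' ';
    }
    buf[k] = '\0';
}
```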

If message-passing makes programming so much easier, why do widely-used embedded operating systems provide semaphores instead?

Programs written using semaphores have the potential to be faster than those written using message passing and server processes, because context switches from one process to another can be made much less frequent. For example, to buffer and print characters on the serial port, our message-based design requires about four context switches per character sent, in addition to the running of the UART interrupt handler. There is a context switch to send a request message to the UART driver process and put the character in the buffer, another to return to the main program, then two further context switches into and out of the driver process when a UART interrupt prompts it to send the character. By way of contrast, a semaphore-based solution might need no context switches at all, if characters are retrieved from the buffer and sent to the UART directly in the interrupt handler. This is because a down operation on an available semaphore only needs to disable interrupts for a moment to test and lock the semaphore, and need not block the invoking process. It's only when a process needs to block on a semaphore that a context switch is needed.

On the other hand, message-based concurrency is much easier to think about, and it makes it easier to avoid problems like priority inversion, where a high-priority process is prevented from making progress because it needs to lock a semaphore currently held by a low-priority process.

How do you save a screenshot on the Agilent scope without the Save menu showing?

[This is one of the few FAQs here that really is Frequently Asked: by Mike, to himself].

  1. Press the Save/Recall button.
  2. Set Storage to PNG.
  3. Press External.
  4. Press New File.
  5. Press the Menu On/Off button.
  6. Press the fourth bezel button.
  7. Press Save (the same button again).

It usually works. Always? Not sure. One place in the Rigol firmware that Agilent didn't clean up?

Another glitch: all the files on the USB pen are dated 1 Jan 2006, so Gnome tends to think its thumbnails are up to date even when they aren't.

What is this .thumb_func directive all about?

Most of the time, the assembler and linker manage to work out for themselves that assembly language code is written using the Thumb encoding, and all works sweetly. But just occasionally (e.g., when writing an interrupt handler in assembly language), it's necessary to mark a function explicitly as being written in Thumb code: in the case of an interrupt handler, so that the interrupt vector can be marked with a 1 bit in the LSB. If this is done properly, then the processor stays in Thumb mode when the interrupt vector is followed, and all works well. Failing to do this results in an attempt to switch to native ARM mode, and an immediate HardFault exception: the program crashes irretrievably. This seems a bit silly in a device that doesn't support native ARM mode at all, but it is done in the name of software compatibility with bigger devices.

Given that the .thumb_func directive is occasionally needed (at times when it's difficult to determine whether it is actually necessary), and given that it is hard to debug the HardFault that occurs when a needed directive is omitted, it seems simplest just to make a habit of writing .thumb_func in front of every assembly-language function.

I keep getting the following message when I try to start minicom. What gives?

minicom: cannot open /dev/ttyACM0: Device or resource busy

I found the second answer to this Stack Overflow question helpful:

What are all the numbers shown on an oscilloscope trace?

Here is a trace of a character being sent on a serial line.

RS232 waveform
  • In the top left, you can see (STOP) that acquisition of the signal has been stopped, presumably so that I can save a screenshot; at other times, the same label shows WAIT (when the scope is waiting for a trigger event) or TRIG'D (when an acquisition is in progress).
  • Next to it (100us/) we see that the horizontal scale is 100 microseconds per division. The clock period for the waveform can be seen from the trace to be a bit more than 100us, because the transitions fall a bit behind the grid of dotted divisions towards the right of the picture.
  • At the top right, we see that the oscilloscope's trigger is detecting a falling edge on channel 1 and set at a level of 1.72V, roughly midway between the high of 3.3V and the low of 0V.
  • In the left margin, you can see the trigger level marked (T), as well as the 0V level for channel 1.
  • At the bottom left, you can see that channel 1 is DC-coupled, and has a scale of 1.00V per division; the difference between high and low is therefore seen on the trace to be about 3.3V.
  • On the black background just below the top margin, you can see at the left a T mark denoting the trigger point. I've adjusted this to appear towards the left of the display, so that we can see subsequent events, but as the outline at centre top shows, there's plenty more waveform captured in this single-shot mode before and after the trigger that we could see by scrolling.

Why do wired-or busses like I2C have the convention that signal lines are active-low?

Electrically, active-low busses are better in two ways: one, N-type MOSFETs are generally better behaved than P-types, and specifically tend to have a lower on resistance. This makes them better able to pull the signal line low crisply (when it may have significant capacitance) than a P-type could pull it high. In older systems an NPN transistor in an open-collector output stage had a similar advantage over a PNP transistor, and for the same physical reason – that the majority carriers in these transistors are electrons, which have a better mobility than holes. The second reason for using active-low signalling is that a bus may cross between circuits that are using different power supplies, and ground is then used as a common reference, whereas the power rails may differ a little or a lot in voltage. An active-low signal can have threshold voltages defined with respect to ground, be pulled low by an open-drain output with short fall time, and draw little or no quiescent current through its pullup resistor.

Code in the examples that prints messages on the terminal contains "\r\n" to move to a new line, but other examples of C code that I've seen use just "\n". Why is that?

On an old-fashioned terminal, and on the simulation provided by minicom, the carriage-return character 0x0d (written '\r' in C) is needed to move the cursor or print head back to the left of the page, and the line-feed character 0x0a (written '\n') scrolls up by a line. Normally, the C library or operating system is written so as to insert a '\r' in front of each '\n' output by a C program. But we are working without a library and without an operating system, so we must do this ourselves. (The serial driver in Phōs does insert the '\r' in its version of the serial_putc function.)
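As a sketch of what inserting the '\r' amounts to, here is a wrapper around a character output routine. Both names are invented for the example: raw_putc stands in for whatever routine sends one byte to the UART (here it just appends to an array so the effect can be observed), and crlf_putc is the wrapper.

```c
static char line[64];      /* records what would be sent on the wire */
static int n = 0;

/* raw_putc -- hypothetical stand-in for sending one byte to the UART */
static void raw_putc(char ch) {
    if (n < (int) sizeof(line) - 1) {
        line[n++] = ch;
        line[n] = '\0';
    }
}

/* crlf_putc -- output ch, inserting '\r' in front of each '\n' */
void crlf_putc(char ch) {
    if (ch == '\n') raw_putc('\r');
    raw_putc(ch);
}
```

With a wrapper like this in place, a program could write plain "\n" in its messages and still have the terminal return to the left margin.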

The code in sysinit.c looks complicated. What does it actually do?

Almost every chip that is designed contains minor errors in its behaviour, which the manufacturer publishes in supplements to the data sheet, and may fix in later revisions. Some of these defects may concern the initialisation of hardware registers when the chip comes out of reset. The code in sysinit.c checks the silicon revision, and if it is early enough that certain defects exist, writes initial values to certain registers to correct the defects. It also establishes an on-chip RC oscillator as the source for a 32kHz timing clock, needed by the software library ('softdevice') that implements the Bluetooth protocol. The code spends more effort establishing whether the fixes are needed than actually performing them, because it must look in several read-only hardware registers to establish which revision of the silicon is present, and initialising a register with a wrong value can do more harm than good.

I'm not sure whether any micro:bit board has silicon old enough for these fixes to be needed, and we are unlikely to use the Bluetooth implementation in a setup where the 32kHz oscillator is needed. Certainly, the micro:bit boards I have work perfectly well without invoking SysInit on startup. But it doesn't do any harm to include the code anyway.

If I ask the C compiler to output assembly language (and turn off -g), then most of the file is intelligible, but these lines appear close to the start of the file. What do they mean?

.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1

The lines record the assumptions used by the compiler in generating code: things like whether there is a hardware floating point unit, whether unaligned memory access is allowed, etc. These assumptions will be recorded in a special section of the binary file output by the assembler, and the linker then checks that the assumptions are consistent between different files of code.

Why this Phōs operating system, and not another, industry standard real-time operating system such as FreeRTOS?

FreeRTOS is OK, and provides a rag-bag of facilities for multi-tasking, including processes, semaphores of different kinds, message queues, timers, and a lot more. It is quite complicated, and everything seems to have lots of options that interact in subtle ways. For the course, I wanted to find out how far we could get with a much simpler scheme, where there are no nested interrupts, possibly no preemptive scheduling, and the system is structured into a family of processes communicating by messages. As you will discover in Concurrent Programming next year, programming with semaphores is a complicated business; and next year will be soon enough for that. And I can't resist saying that the coding style of FreeRTOS gives me the creeps.

Why does Phōs not have (i) pre-emptive scheduling, (ii) nested interrupt handlers, (iii) an interruptible kernel?

In a simple, interrupt-driven system that is not overloaded, most of the time every process is blocked waiting for something to happen, and the idle process is 'active', in the weak sense that it is the running process, but it has put the processor to sleep with a wait-for-event instruction (wfe on the ARM). When an interrupt arrives, processes wake up, there is a flurry of activity, and then the system settles down to sleep again, and the time needed for that to happen is bounded. If there is no response signal due from the system as a result of an interrupt that must be produced in less than this time-bound, then there is no need for pre-emptive scheduling (which does not change the total amount of work to be done, but just affects the order in which tasks are carried out). And if we keep interrupt times short by doing no more than sending a message in each interrupt handler, then there will be little need to allow the running of one interrupt handler to be suspended in order to run another. Likewise, if the time spent in delivering a message is short, it won't matter much if an interrupt is delayed until the kernel has finished.

There are well-developed theories of process scheduling that aim to ensure that complex sets of real-time deadlines are met using limited resources, and in circumstances where they are needed, these theories are really valuable. In simple cases, however, process scheduling is hard to get wrong, in the sense that if every process waiting for an event gets to run after the event has happened and before the system goes to sleep again, then everything will happen when it should and in good time. Phōs is designed with these modest goals in mind.

According to the standard books about C, the main program should be called main, and yet in this course it is called init. Why is that?

In normal C, the main program ought to have the heading

int main(int argc, char **argv)

reflecting the fact that main receives two parameters, one being the number of command-line arguments (including the program name), and the other being an array of strings, each corresponding to one of the arguments. The return type is int, because the program may terminate by returning an integer, zero for success and some non-zero value for failure.

Our function init takes no parameters and delivers no result:

void init(void)

What's more, once we have an operating system on the chip, then unlike main, when init returns the program is not over, but concurrent execution of the processes in the program is only just beginning.

These differences might be enough to make a change of name desirable, but there are practical reasons too. One is that GCC (un)helpfully signals a warning if a function named main does not have the heading it expects; another is that it treats main specially in the output, assigning it to its own section in the object file so that a linker script can place it in the program image at a specific place. Both of these behaviours are unhelpful to us.

  • Actually, the warning is suppressed by specifying -ffreestanding, which also suppresses other warnings about the types of functions like sprintf. And the separate section is only enabled with the optimisation flag -Os or -O2, not at the -O level we will generally use, mostly to make it easier to interpret program behaviour under a debugger.

Why use the ARM? Why not the x86 or x86_64 we already have in our laptops?

I wanted us to program a machine where all the code was accessible to our inspection. That's easy to achieve on a microcontroller, and as simple as it can be on an ARM-based machine. Yes, we could program x86 machines in assembly language, though we would have to face some unavoidable complexity that has accrued during the architecture's history of 30 or more years. But we wouldn't easily get to program the bare machine and understand everything that happens.

Why C? Why not C++? Or why not Python/Java/Scala/Haskell/Some other higher level language?

Quite a lot of embedded programming is done in C++ these days, and in fact the official software support for the micro:bit is in C++. This gives a number of advantages: peripheral devices can be initialised by declaring a static instance of the driver class: PinOut led(PIN3); and C++'s overloading feature subsequently lets you write led = 1; and led = 0; to turn on and off the LED, even if the actual code needed is something more complicated, possibly involving a function call. This is all very well for applications programming, but it really gets in the way of understanding what is happening at the machine level.

Also, for C++, as for other high level languages, we would need to know how the high-level constructs are implemented in order to see how a high level program corresponds to a sequence of machine instructions. For C, this is more straightforward, and we can use C as a quick, clean and portable way of writing what we could also write in assembly language. And in any case, nobody's employment prospects were ever hurt by saying that they knew how to program in C.

C provides bit-fields for access to parts of a memory word. Why don't we use them instead of using shifting and masking operations to program peripheral registers?

On the ARM, most peripheral registers must be accessed as a whole 32-bit word, and can't be used with the ldrb, strb, ldrh, strh instructions that access bytes or 16-bit halfwords. Also, it's sometimes the case that peripheral registers must be accessed exactly once in an operation, or some side-effect may be duplicated. The C compiler doesn't guarantee to respect these conventions when it generates code for bit-field access, so we avoid bit-fields and write (with the help of simplifying macros) explicit shifting and masking code instead. The resulting object code is no worse than would be compiled for bit-fields anyway.
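For illustration, macros of the following kind do the job. These particular definitions are a sketch written for this answer, not a copy of the macros used in the course, and they assume fields narrower than 32 bits. The variable fake_reg is an ordinary volatile word standing in for a peripheral register, so that the code can be tried on a PC.

```c
#include <stdint.h>

/* MASK -- a mask of wid one bits starting at bit pos (wid < 32) */
#define MASK(pos, wid)  ((((uint32_t) 1 << (wid)) - 1) << (pos))

/* GET_FIELD -- extract a field with one 32-bit read of reg */
#define GET_FIELD(reg, pos, wid)  (((reg) & MASK(pos, wid)) >> (pos))

/* SET_FIELD -- replace a field with one 32-bit read and one
   32-bit write of reg, leaving the other bits unchanged */
#define SET_FIELD(reg, pos, wid, val) \
    ((reg) = ((reg) & ~MASK(pos, wid)) \
             | (((uint32_t) (val) << (pos)) & MASK(pos, wid)))

/* Stands in for a memory-mapped peripheral register */
static volatile uint32_t fake_reg = 0;
```

For example, SET_FIELD(fake_reg, 8, 4, 0x5) changes only bits 8 to 11 of the word, and because the register is named exactly twice in the expansion, a volatile register is read once and written once.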

Where does the name Phōs come from?

Phōs (Φῶς) is the Greek word for 'light', and also the first word in Greek of the ancient hymn known as Phos Hilaron (Hail, gladdening Light), particularly associated with John Keble, formerly Fellow of Oriel, in memory of whom Keble College was named by its founder, Edward Pusey (also at one time a Fellow of Oriel). There is also an echo of the motto of the University, Dominus illuminatio mea (The Lord is my light), which consists of the first words of Psalm 27. The macron above the ō in transliteration is intended to show that the letter in Greek is omega rather than omicron, so that the vowel should be pronounced like those in 'photo', not as 'foss'. No allusion is intended to Phở, the Vietnamese noodle soup, though one can see a Phōs application as a collection of noodly threads swimming in a nutritious broth.
