Lecture 2 – Building a program (Digital Systems)
Building a program
The other piece of information needed to design programs for the micro:bit is the layout of the memory. As the diagram shows, there are three elements, all within the same address space:
- 256kB of Flash ROM, with addresses from 0x0000 0000 to 0x0003 ffff. This is where programs downloaded to the board are stored. Significantly, address zero is part of the Flash, because it is there that the interrupt vectors are stored that, among other things, determine where the program starts.
- 16kB of RAM, with addresses from 0x2000 0000 to 0x2000 3fff. This is where all the variables for programs are stored: both the global variables that are not part of any subroutine, and the stack frames that contain local variables for each subroutine activation.
- I/O registers in the range from 0x4000 0000 upward. Specific locations in this range correspond to particular I/O devices. For example, storing a character at address
UART_TXD = 0x4000 251c will have the effect of transmitting it over the serial port, and storing a 32-bit value at
GPIO_OUT = 0x5000 0504 has the effect of illuminating some or all of the LEDs on the board.
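The memory map in the list above can be written down as a small C function that says which region a given address falls in. This is only an illustrative sketch that can be compiled on a PC: the names region_of, REGION_FLASH and so on are invented here, and the function follows the simplified description in the text by treating everything from 0x4000 0000 upward as I/O.

```c
#include <assert.h>
#include <stdint.h>

/* Address ranges quoted in the text for the micro:bit */
#define FLASH_BASE 0x00000000u
#define FLASH_SIZE 0x00040000u   /* 256kB */
#define RAM_BASE   0x20000000u
#define RAM_SIZE   0x00004000u   /* 16kB */
#define IO_BASE    0x40000000u   /* I/O registers from here upward */

/* Hypothetical names, used only in this sketch */
enum region { REGION_FLASH, REGION_RAM, REGION_IO, REGION_UNMAPPED };

/* Classify a 32-bit address according to the memory map above */
enum region region_of(uint32_t addr)
{
    if (addr < FLASH_BASE + FLASH_SIZE)
        return REGION_FLASH;
    if (addr >= RAM_BASE && addr - RAM_BASE < RAM_SIZE)
        return REGION_RAM;
    if (addr >= IO_BASE)
        return REGION_IO;
    return REGION_UNMAPPED;
}
```

With these definitions, region_of(0) is REGION_FLASH, because the interrupt vectors live at the bottom of the Flash, and both 0x4000251c and 0x50000504 are classified as REGION_IO.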
It's possible for the contents of Flash to be modified under program control, but we shall not use that feature.
Naturally enough, information about the layout of memory must form part of any programs that run on the machine. In our code, it is reflected in two places: the linker script
NRF51822.ld contains the addresses and sizes of the RAM and ROM, and instructions about what parts of the program go where; and the header file
hardware.h contains the numeric addresses of I/O device registers like
GPIO_OUT. This information comes from the data sheet for the nRF51822 chip.
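To give an idea of how numeric register addresses are used from C, here is a minimal sketch. It is not the actual contents of hardware.h: the macro names ending in _ADDR and the helper reg_write are assumptions made for illustration, and only the two addresses are taken from the text. The register is passed in as a pointer so that the helper can also be tried on a PC with an ordinary variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register addresses quoted in the text */
#define UART_TXD_ADDR 0x4000251cu
#define GPIO_OUT_ADDR 0x50000504u

/* Hypothetical helper: store a 32-bit value in a device register.
   'volatile' tells the compiler that the store has an effect outside
   the program, so it must not be optimised away. */
void reg_write(volatile uint32_t *reg, uint32_t value)
{
    assert(reg != NULL);
    *reg = value;
}

/* On the board itself one might write, for example,
       reg_write((volatile uint32_t *) GPIO_OUT_ADDR, pattern);
   On a PC that address is not mapped, so pass the address of an
   ordinary variable instead. */
```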
To think about: why was the nRF51822 designed with so little RAM?
Context
Other microcontrollers adopt two alternatives to this single-address-space model:
- Sometimes I/O devices are accessed with special instructions, and have their own address space of numbered 'ports'. This is not really such a significant difference, because performing I/O still amounts to selecting a port by its address and performing a read or write action.
- Sometimes the processor has its program in a different address space from its data. This is an attractive option for microcontrollers, where the program is usually fixed and can be stored in a ROM separate from the RAM that is used for data. It's called a 'Harvard architecture', in contrast with the 'von Neumann architecture' with a single memory. If the two address spaces are separate, then separate hardware can be used to access the ROM and the RAM, and it can access both simultaneously without fear of interference between the two. The disadvantage is that special instructions are usually needed to access, e.g., tables of constant data held in the ROM.
Building a program
Lab one for this course lets you write small subroutines in Thumb assembly language and test them, using a main program that prompts for two numbers, calls your subroutine with the two numbers as arguments, then prints the arguments and result.
As you can see, the arguments and result are shown both in decimal and in hexadecimal; you can enter numbers in hexadecimal too by preceding them with the usual 0x prefix.
This program is built from your one subroutine written in assembly language, with the rest of the program written in C for convenience. (At some point, we'll make a program written entirely in assembly language just to demonstrate that it's possible.) Let's begin by looking at the entire contents of the source file
add.s that defines the function
foo. The lines starting with
@ are comments.
@ This file is written in the modern 'unified' syntax for Thumb instructions:
        .syntax unified

@ This file defines a symbol foo that can be referenced in other modules
        .global foo

@ The instructions should be assembled into the text segment, which goes
@ into the ROM of the micro:bit
        .text

@ Entry point for the function foo
        .thumb_func
foo:
@ ----------------
@ Two parameters are in registers r0 and r1

        adds r0, r0, r1         @ One crucial instruction

@ Result is now in register r0
@ ----------------
@ Return to the caller
        bx lr
The one instruction that matters is the line reading
adds r0, r0, r1. Let's assemble the program:
$ arm-none-eabi-as add.s -o add.o
This command takes the contents of file
add.s, runs them through the ARM assembler, and puts the resulting binary code in the file
add.o. We can see this code by dis-assembling the file
add.o, using the utility objdump:
$ arm-none-eabi-objdump -d add.o
00000000 <foo>:
   0:   1840        adds    r0, r0, r1
   2:   4770        bx      lr
This reveals that the instruction
adds r0, r0, r1 is represented by the 16-bit value written as hexadecimal 0x1840.
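The values 0x1840 and 0x4770 can be reproduced by packing register numbers into the bit fields of the two 16-bit Thumb encodings. The following sketch does this in C; the function names encode_adds and encode_bx are invented for the example, and it covers only these two instruction forms.

```c
#include <assert.h>
#include <stdint.h>

/* Encode 'adds rd, rn, rm' for low registers r0..r7:
   bits 15..9 are 0001100, followed by three-bit fields rm, rn, rd. */
uint16_t encode_adds(unsigned rd, unsigned rn, unsigned rm)
{
    assert(rd < 8 && rn < 8 && rm < 8);
    return (uint16_t) (0x1800u | (rm << 6) | (rn << 3) | rd);
}

/* Encode 'bx rm' for any register r0..r15:
   bits 15..7 are 010001110, then a four-bit rm, then three zero bits. */
uint16_t encode_bx(unsigned rm)
{
    assert(rm < 16);
    return (uint16_t) (0x4700u | (rm << 3));
}
```

Here encode_adds(0, 0, 1) gives 0x1840, and since lr is another name for r14, encode_bx(14) gives 0x4770, matching the disassembly.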
The program also contains three files of C code, and we will need to translate those also into binary form using the compiler
arm-none-eabi-gcc. The easy way to do that is to invoke the command
make, because the whole procedure for building the program has been described in a Makefile:
$ make
arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -g -O -c main.c -o main.o
arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -g -O -c lib.c -o lib.o
arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -g -O -c startup.c -o startup.o
Notice that the C compiler is being asked to generate Thumb code for the Cortex-M0 core that is present in the micro:bit; it is also being asked to include debugging information in its output (
-g), and to optimise the object code a bit (
-O). We don't use higher levels of optimisation (
-Os), because they make the object code harder to understand and interfere with running it under a debugger.
What are these files of C code anyway? Well,
main.c contains the main program, with the loop that prompts for two numbers, passes them to your
foo subroutine, then prints the result. It also contains a simple driver for the micro:bit's serial port so that the program can talk to a host PC via USB and
minicom. The file
lib.c contains a simple implementation of the formatted output function
printf that we shall use in most of our programs. Finally, the file
startup.c contains the code that runs when the micro:bit starts ("comes out of reset"), setting things up and then calling the main program.
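As a rough picture of how the C main program and the assembly subroutine fit together, the sketch below replaces foo with an equivalent C function and formats one call in decimal and hexadecimal. It is an assumption-laden stand-in rather than the code of main.c: the helper format_call and the layout of the output are invented, and it uses the host's snprintf so that it can be tried on a PC. On the micro:bit, the C code would instead contain only a declaration of foo, with the two arguments arriving in r0 and r1 and the result returned in r0.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the assembly routine: the same computation written in C */
unsigned foo(unsigned a, unsigned b)
{
    return a + b;
}

/* Hypothetical helper: call foo and format the arguments and result in
   decimal and in hexadecimal.  Returns what snprintf returns. */
int format_call(char *buf, size_t size, unsigned a, unsigned b)
{
    unsigned r;
    assert(buf != NULL && size > 0);
    r = foo(a, b);
    return snprintf(buf, size, "foo(%u, %u) = %u [0x%x, 0x%x -> 0x%x]",
                    a, b, r, a, b, r);
}
```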
Having done all this compiling and assembling, we have four binary files with names ending in
.o. To make them into one program, we need to concatenate the code from all these files, then fix them up so that references from one file to another (and within a single file too) use the right address. All this is done by the linker
ld, or more accurately
arm-none-eabi-ld. Often, we invoke the linker via the C compiler
gcc, because that lets
gcc add its own libraries into the mix; but so that we can see all that is happening, the Makefile contains a command to invoke ld directly:
arm-none-eabi-ld add.o main.o lib.o startup.o \
    /usr/lib/gcc/arm-none-eabi/5.4.1/armv6-m/libgcc.a \
    -o add.elf -Map add.map -T NRF51822.ld
This links together the various
.o files, and also a library
libgcc.a that contains routines that the C compiler relies on to translate certain C constructs – in particular, the integer division that is used in converting numbers for printing. The output file
add.elf is another file of object code in the same format as the
.o files. Another output from the linker is a report
add.map that shows the layout of memory, and that layout is determined partially by the linker script
NRF51822.ld that describes the memory map of the chip and what each memory area should be used for. You'll see that the Makefile next invokes a program
size to report on the size of the resulting program, so we can see how much of the memory has been used: not much in this case.
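The two jobs of the linker described above, placing the code from each file one after another and then working out the final address of each symbol, can be illustrated with a toy model. This is a simplified sketch and not how arm-none-eabi-ld is implemented: struct module, layout and symbol_address are names invented here, and real object files carry much more information than a size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the code in one object file */
struct module {
    const char *name;   /* for example "add.o" */
    uint32_t size;      /* bytes of code in the module */
    uint32_t base;      /* address chosen by the layout pass */
};

/* Assign consecutive addresses starting at 'start', rounding each base
   up to a multiple of 'align' (a power of two).  Returns the address
   just past the last module. */
uint32_t layout(struct module *mods, size_t n, uint32_t start, uint32_t align)
{
    uint32_t addr = start;
    size_t i;
    assert(align != 0 && (align & (align - 1)) == 0);
    for (i = 0; i < n; i++) {
        addr = (addr + align - 1) & ~(align - 1);
        mods[i].base = addr;
        addr += mods[i].size;
    }
    return addr;
}

/* After layout, a symbol at 'offset' within a module has this address;
   references to the symbol are then patched to use it. */
uint32_t symbol_address(const struct module *m, uint32_t offset)
{
    return m->base + offset;
}
```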
add.elf contains the binary code for the program, but sadly it is in the wrong format for downloading to the board. So the final step of building described in the Makefile converts the code into "Intel hex" format, which is what the board expects to receive over USB.
arm-none-eabi-objcopy -O ihex add.elf add.hex
Building is now done, and we can copy the file
add.hex to the board. On my machine, I can type
$ cp add.hex /media/mike/MICROBIT
and the job is done.
Although it's not necessary for running the program, it's fun to check that the binary instructions 0x1840 and 0x4770 appear somewhere in the code. If you look in add.hex, you will find that line 13 contains the two fragments 4018 and 7047, which rearrange to give 1840 and 4770. (Why is the rearrangement needed? You can find a specification for Intel hex format on Wikipedia.)
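One way to experiment with the rearrangement is to write out the two bytes of each 16-bit instruction in the order they would occupy in memory on a little-endian machine, low byte first. The helper below is a small sketch written for this purpose; the name halfword_to_hex is invented, and it produces only the four data digits, not a complete Intel hex record.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write the two bytes of a 16-bit value as four hex digits in memory
   order for a little-endian machine: low byte first, then high byte.
   'out' must have room for 5 characters including the terminator. */
void halfword_to_hex(uint16_t insn, char out[5])
{
    unsigned lo = insn & 0xffu;
    unsigned hi = (insn >> 8) & 0xffu;
    assert(out != NULL);
    snprintf(out, 5, "%02X%02X", lo, hi);
}
```

Applied to 0x1840 and 0x4770, this yields the digit strings 4018 and 7047.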
Alternatively, you can make a pure binary image of the program with the command
$ arm-none-eabi-objcopy -O binary add.elf add.bin
and then look at a hexadecimal dump of that file, divided into 16-bit words:
$ hexdump add.bin
...
00000c0 1840 4770 4b07 681b 2b00 d103 4a06 6813
(Even on a 32-bit machine, tradition dictates that
hexdump splits the file into 16-bit chunks.) As you can see, the two instructions 0x1840 and 0x4770 appear at address 0xc0 in the image.
What is the file
/usr/lib/gcc/arm-none-eabi/5.4.1/armv6-m/libgcc.a that is mentioned when a program is linked?
It's a library archive (extension
.a) containing subroutines that may be called from code compiled with
gcc. For example, the program in Lab 1 contains a C subroutine (
lib.c) for converting numbers from binary to decimal or hexadecimal for printing, and that subroutine contains the statement
x = x / base. Since the chip has no divide instruction,
gcc translates the statement into a call to a subroutine with the obscure name
__aeabi_uidiv, and that subroutine is provided in the library archive. Exercise 1.6 asks you to write a similar subroutine for yourself.
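To show where such a division arises, here is a sketch of a number conversion routine of the kind described. It is not the code from lib.c: the name convert, its interface and the buffer handling are assumptions made for the example. The routine peels off digits by repeated division by the base, and the caller must supply a buffer with room for all the digits plus a terminator (33 characters are enough for a 32-bit value in base 2).

```c
#include <assert.h>

/* Convert x to digits in the given base (2..16), writing a
   null-terminated string into buf.  Returns the number of digits. */
int convert(unsigned x, unsigned base, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[sizeof(unsigned) * 8];
    int n = 0, i;

    assert(base >= 2 && base <= 16);

    /* Digits come out least significant first */
    do {
        tmp[n++] = digits[x % base];
        x = x / base;   /* with no divide instruction, this needs a library call */
    } while (x != 0);

    /* Reverse them into the output buffer */
    for (i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```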
If you're prepared to use that integer division subroutine from the
gcc library, why not use the library version of
printf too, instead of writing your own?
Using the library version of
printf pulls in a lot of other stuff – for example, since the standard
printf can format floating-point numbers, including it also drags in lots of subroutines that implement floating-point arithmetic in software. That's OK, but (a) I wanted to keep our programs small for simplicity, and (b) although we are not likely to fill up the 256kB of code space on the nRF51822, on other embedded platforms it's wise to keep a close eye on the amount of library code that is included in the program but not really used. Added to that, the version of
printf in the standard library can call
malloc, and we don't want that.
Address space. A numbering system for memory locations. ARM-based microcontrollers (like most bigger machines) have a single address space containing both code and data. Some other microcontroller families have separate address spaces for code and data, in what is called a Harvard architecture.
ROM (Read-Only Memory). A form of storage whose contents are non-volatile (are not lost when the power is off) but cannot be changed under program control. Modern ROM is usually EEPROM – Electrically Erasable Programmable Read Only Memory, and can be changed electrically, and even under control of a program running on the microcontroller, but using special peripheral registers and not the normal store instructions. Flash memory is a modern, super-compact implementation of EEPROM, but for our purposes it does exactly the same job. We will modify the contents of the micro:bit's flash memory by downloading programs, but we will probably not be writing programs that change the contents of the flash memory.
UART (Universal Asynchronous Receiver/Transmitter). A peripheral interface that is able to send and receive characters serially, commonly used in the past for communication between a computer and an attached terminal. It is commonly used in duplex mode, with the transmitter of one device connected to the receiver of the other with one wire, and the receiver of the one connected to the transmitter of the other with a different wire. The asynchronous part of the name refers to the fact that the transmitter and receiver on each wire do not share a common clock, but rely instead on the signalling protocol and precise timing to achieve synchronisation.
GPIO (General-Purpose Input/Output). A peripheral interface that provides direct access to pins of the microcontroller chip. Pins may be configured as inputs or outputs, and interrupts may be associated with state changes on certain input pins. On the micro:bit, the LEDs and pushbuttons are connected to GPIO pins.
Linker script. A text, written in a specialised language, that describes the layout in memory of a program. Compilers typically divide their output into four named sections:
text for the program code and embedded constants,
data for statically allocated data that has a specified initial value other than zero,
rodata for initialised data that is constant, and
bss for data that is statically allocated but can initially be filled with zeroes. The linker script may add another section for the program's stack. For microcontrollers, a linker script is needed that puts the
text and rodata in Flash, and lays out the RAM so that
data and bss are in separate areas, with the
data copied into its own part of RAM from an image held in Flash.
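The copying and zeroing implied by this layout is usually done by startup code before the main program runs. The sketch below shows the idea in C under stated assumptions: the function name init_ram and its parameters are invented, and the four regions are passed in as arguments so that it can be tried on a PC, whereas real startup code would typically obtain them from symbols defined by the linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of the data section from their image (held in
   Flash on the real machine) into RAM, then fill bss with zeroes. */
void init_ram(uint32_t *data, const uint32_t *image, size_t data_words,
              uint32_t *bss, size_t bss_words)
{
    size_t i;
    assert(data != NULL && image != NULL && bss != NULL);
    for (i = 0; i < data_words; i++)
        data[i] = image[i];
    for (i = 0; i < bss_words; i++)
        bss[i] = 0;
}
```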
Thumb code. An alternative instruction encoding for the ARM in which each instruction is encoded in 16 rather than 32 bits. The advantage is compact code, the disadvantage that only a selection of instructions can be encoded, and only the first 8 registers are easily accessible. In Cortex-M microcontrollers, the Thumb encoding is the only one provided.
Assembly language. A symbolic representation of the machine code for a program.