Lecture 2 – Building a program (Digital Systems)

From Spivey's Corner

Memory map

Memory map

The other piece of information needed to design programs for the micro:bit is the layout of the memory. As the diagram shows, there are three elements, all within the same address space:

  • 256kB of Flash ROM, with addresses from 0x0000 0000 to 0x0003 ffff. This is where programs downloaded to the board are stored. Significantly, address zero is part of the Flash, because it is there that the interrupt vectors are stored that, among other things, determine where the program starts.
  • 16kB of RAM, with addresses from 0x2000 0000 to 0x2000 3fff. This is where all the variables for programs are stored: both the global variables that are not part of any subroutine, and the stack frames that contain local variables for each subroutine activation.
  • I/O registers in the range from 0x4000 0000 upward. Specific locations in this range correspond to particular I/O devices. For example, storing a character at address UART_TXD = 0x4000 251c will have the effect of transmitting it over the serial port, and storing a 32-bit value at GPIO_OUT = 0x5000 0504 has the effect of illuminating some or all of the LEDs on the board.
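
The three regions can be summarised as a small address classifier. This is only a host-side sketch for checking one's understanding: the enum and function names are made up for illustration, and no upper limit is modelled for the I/O region because the text only says "from 0x4000 0000 upward".

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum region { REGION_FLASH, REGION_RAM, REGION_IO, REGION_OTHER };

/* Classify an address using the ranges given in the text. */
enum region region_of(uint32_t addr)
{
    if (addr <= 0x0003ffffu)
        return REGION_FLASH;            /* 256kB of Flash ROM */
    if (addr >= 0x20000000u && addr <= 0x20003fffu)
        return REGION_RAM;              /* 16kB of RAM */
    if (addr >= 0x40000000u)
        return REGION_IO;               /* I/O registers, upward from here */
    return REGION_OTHER;                /* nothing described at this address */
}
```

For example, region_of(0x4000251c) and region_of(0x50000504) both give REGION_IO, matching the UART_TXD and GPIO_OUT addresses quoted above.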

It's possible for the contents of Flash to be modified under program control, but we shall not use that feature.

Naturally enough, information about the layout of memory must form part of any program that runs on the machine. In our code, it is reflected in two places: the linker script NRF51822.ld contains the addresses and sizes of the RAM and ROM, and instructions about what parts of the program go where; and the header file hardware.h contains the numeric addresses of I/O device registers like UART_TXD and GPIO_OUT. This information comes from the data sheet for the nRF51822 chip.
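
As a rough idea of what such a header can look like, the sketch below defines a register-access macro and the two addresses quoted earlier. The names REG, UART_TXD_ADDR, GPIO_OUT_ADDR and put_reg are assumptions for illustration; the actual contents of hardware.h may well differ.

```c
#include <stdint.h>

/* Treat a numeric address as a 32-bit device register. 'volatile' tells
   the compiler that every access matters and must not be optimised away. */
#define REG(addr) (*(volatile uint32_t *) (addr))

/* Addresses quoted in the text (from the nRF51822 data sheet). */
#define UART_TXD_ADDR 0x4000251cu
#define GPIO_OUT_ADDR 0x50000504u

/* Hypothetical helper: store a value in the register at 'addr'. */
void put_reg(uintptr_t addr, uint32_t value)
{
    REG(addr) = value;
}
```

On the micro:bit, put_reg(UART_TXD_ADDR, 'A') would be the kind of store the text describes. On a PC those addresses are not valid, but the same helper can be tried on an ordinary variable, for example with uint32_t fake = 0; put_reg((uintptr_t) &fake, 'A');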

To think about: why was the nRF51822 designed with so little RAM?


Other microcontrollers adopt two alternatives to this single-address-space model:
  • Sometimes I/O devices are accessed with special instructions, and have their own address space of numbered 'ports'. This is not really such a significant difference, because performing I/O still amounts to selecting a port by its address and performing a read or write action.
  • Sometimes the processor has its program in a different address space from its data. This is an attractive option for microcontrollers, where the program is usually fixed and can be stored in a ROM separate from the RAM that is used for data. It's called a 'Harvard architecture', in contrast with the 'von Neumann architecture' with a single memory. If the two address spaces are separate, then separate hardware can be used to access the ROM and the RAM, and it can access both simultaneously without fear of interference between the two. The disadvantage is that special instructions are usually needed to access, e.g., tables of constant data held in the ROM.
More complex processors with cache memory between the processor and RAM often adopt a 'modified Harvard architecture' where there are independent caches, concurrently accessible, for code and data, but with a single, uniform RAM behind them.

Building a program

Source code: lab1-asm

Lab one for this course lets you write small subroutines in Thumb assembly language and test them, using a main program that prompts for two numbers, calls your subroutine with the two numbers as arguments, then prints the arguments and result.

As you can see, the arguments and result are shown both in decimal and in hexadecimal; you can enter numbers in hexadecimal too by preceding them with the usual 0x prefix.
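
In outline, the C side of the program sees the assembly-language subroutine as an ordinary function of two arguments. The sketch below is not the code in main.c: it supplies a C stand-in for foo so that it can be run on a PC, it assumes unsigned arguments, and the helper format_result and its output layout are invented for illustration.

```c
#include <stdio.h>

/* On the micro:bit, foo is defined in add.s; here a C stand-in with the
   same interface lets the sketch run on a PC. */
unsigned foo(unsigned a, unsigned b)
{
    return a + b;
}

/* Hypothetical helper: call foo and describe the arguments and result
   in both decimal and hexadecimal. Returns the number of characters
   that the full text needs, as snprintf does. */
int format_result(char *buf, size_t size, unsigned a, unsigned b)
{
    unsigned r = foo(a, b);
    return snprintf(buf, size, "foo(%u, %u) = %u [foo(0x%x, 0x%x) = 0x%x]",
                    a, b, r, a, b, r);
}
```

With arguments 10 and 5, the buffer receives the text foo(10, 5) = 15 [foo(0xa, 0x5) = 0xf].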

This program is built from your one subroutine written in assembly language, with the rest of the program written in C for convenience. (At some point, we'll make a program written entirely in assembly language just to demonstrate that it's possible.) Let's begin by looking at the entire contents of the source file add.s that defines the function foo. The lines starting with @ are comments.

@ This file is written in the modern 'unified' syntax for Thumb instructions:
        .syntax unified

@ This file defines a symbol foo that can be referenced in other modules        
        .global foo

@ The instructions should be assembled into the text segment, which goes
@ into the ROM of the micro:bit
        .text

@ Entry point for the function foo
        .thumb_func
foo:
@ ----------------
@ Two parameters are in registers r0 and r1

        adds r0, r0, r1          @ One crucial instruction

@ Result is now in register r0
@ ----------------
@ Return to the caller
        bx lr

The one instruction that matters is the line reading adds r0, r0, r1. Let's assemble the program:

$ arm-none-eabi-as add.s -o add.o

This command takes the contents of file add.s, runs them through the ARM assembler, and puts the resulting binary code in the file add.o. We can see this code by disassembling the file add.o, using the utility objdump.

$ arm-none-eabi-objdump -d add.o
00000000 <foo>:
   0:  1840            adds    r0, r0, r1
   2:  4770            bx      lr

This reveals that the instruction adds r0, r0, r1 is represented by the 16-bit value written in hexadecimal as 0x1840.
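
The two encodings in the disassembly can be checked by hand. In the 16-bit Thumb encoding, the three-register form of adds has the fixed 7-bit prefix 0001100 followed by three 3-bit register fields, and bx has the 9-bit prefix 010001110 followed by a 4-bit register field and three zero bits. The helper names below are made up for this sketch, which handles only these two instruction forms.

```c
#include <stdint.h>

/* adds rd, rn, rm with low registers r0-r7:
   bits 15..9 = 0001100, then rm (3 bits), rn (3 bits), rd (3 bits). */
uint16_t encode_adds(unsigned rd, unsigned rn, unsigned rm)
{
    return (uint16_t) (0x1800u | ((rm & 7u) << 6) | ((rn & 7u) << 3) | (rd & 7u));
}

/* bx rm: bits 15..7 = 010001110, then rm (4 bits), then 000.
   The link register lr is r14. */
uint16_t encode_bx(unsigned rm)
{
    return (uint16_t) (0x4700u | ((rm & 15u) << 3));
}
```

Here encode_adds(0, 0, 1) gives 0x1840 and encode_bx(14) gives 0x4770, agreeing with the objdump output.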

The program also contains three files of C code, and we will need to translate those also into binary form using the compiler arm-none-eabi-gcc. The easy way to do that is to invoke the command make, because the whole procedure for building the program has been described in a Makefile:

$ make
arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -g -O -c main.c -o main.o
arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -g -O -c lib.c -o lib.o
arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -g -O -c startup.c -o startup.o

Notice that the C compiler is being asked to generate Thumb code for the Cortex-M0 core that is present in the micro:bit; it is also being asked to include debugging information in its output (-g), and to optimise the object code a bit (-O). We don't use higher levels of optimisation (-O2 or -Os), because they make the object code harder to understand and interfere with running it under a debugger.

What are these files of C code anyway? Well, main.c contains the main program, with the loop that prompts for two numbers, passes them to your foo subroutine, then prints the result. It also contains a simple driver for the micro:bit's serial port so that the program can talk to a host PC via USB and minicom. The file lib.c contains a simple implementation of the formatted output function printf that we shall use in most of our programs. Finally, the file startup.c contains the code that runs when the micro:bit starts ("comes out of reset"), setting things up and then calling the main program.

Compiling and linking

Having done all this compiling and assembling, we have four binary files with names ending in .o. To make them into one program, we need to concatenate the code from all these files, then fix them up so that references from one file to another (and within a single file too) use the right address. All this is done by the linker ld, or more accurately arm-none-eabi-ld. Often, we invoke the linker via the C compiler gcc, because that lets gcc add its own libraries into the mix; but so that we can see all that is happening, the Makefile contains a command to invoke ld directly:

arm-none-eabi-ld add.o main.o lib.o startup.o \
    /usr/lib/gcc/arm-none-eabi/5.4.1/armv6-m/libgcc.a \
    -o add.elf -Map add.map -T NRF51822.ld

This links together the various .o files, and also a library libgcc.a that contains routines that the C compiler relies on to translate certain C constructs – in particular, the integer division that is used in converting numbers for printing. The output file add.elf is another file of object code in the same format as the .o files. Another output from the linker is a report add.map that shows the layout of memory, and that layout is determined partially by the linker script NRF51822.ld that describes the memory map of the chip and what each memory area should be used for. You'll see that the Makefile next invokes a program size to report on the size of the resulting program, so we can see how much of the memory has been used: not much in this case.
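
The idea of concatenating code and then fixing up addresses can be illustrated with a toy model. Nothing here reflects how ld is implemented or the real sizes of the object files: the struct, the function names and the 4-byte alignment are all assumptions made for the sketch. Each module's code is placed directly after the previous one, and a symbol's final address is then its module's start address plus its offset within the module.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of one object file: just the size of its code. */
struct module {
    const char *name;
    uint32_t size;      /* bytes of code in this module */
    uint32_t start;     /* filled in by layout() */
};

/* Place the modules one after another starting at 'base', rounding each
   start address up to a multiple of 4. Returns the first free address. */
uint32_t layout(struct module *mods, size_t n, uint32_t base)
{
    uint32_t addr = base;
    for (size_t i = 0; i < n; i++) {
        addr = (addr + 3u) & ~3u;
        mods[i].start = addr;
        addr += mods[i].size;
    }
    return addr;
}

/* Final address of a symbol at 'offset' within module 'm'. */
uint32_t symbol_address(const struct module *m, uint32_t offset)
{
    return m->start + offset;
}
```

Once layout has run, a reference from one module to a symbol in another can be resolved with symbol_address; the real linker records the outcome of this kind of calculation in the map file.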

The file add.elf contains the binary code for the program, but sadly it is in the wrong format for downloading to the board. So the final step of building described in the Makefile converts the code into "Intel hex" format, which is what the board expects to receive over USB.

arm-none-eabi-objcopy -O ihex add.elf add.hex

Building is now done, and we can copy the file add.hex to the board. On my machine, I can type

$ cp add.hex /media/mike/MICROBIT

and the job is done.

Although it's not necessary for running the program, it's fun to check that the binary instructions 0x1840 and 0x4770 appear somewhere in the code. If you look in add.hex, you will find that line 13 is


and that contains the two fragments 4018 and 7047, which rearrange to give 1840 and 4770. (Why is the rearrangement needed? You can find a specification for Intel hex format on Wikipedia.)

Alternatively, you can make a pure binary image of the program with the command

$ arm-none-eabi-objcopy -O binary add.elf add.bin

and then look at a hexadecimal dump of that file, divided into 16-bit words:

$ hexdump add.bin
00000c0 1840 4770 4b07 681b 2b00 d103 4a06 6813

(Even on a 32-bit machine, tradition dictates that hexdump splits the file into 16-bit chunks.) As you can see, the two instructions 0x1840 and 0x4770 appear at address 0xc0 in the image.
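
The link between the two views is byte order: each 16-bit instruction is stored with its low-order byte first, an Intel hex record lists bytes in storage order, and hexdump here prints each pair of bytes as one 16-bit number. The sketch below decodes one Intel hex data record into 16-bit words. The function name is invented, and the record mentioned in the note after the code is built from the sixteen bytes shown in the hexdump output above, with a checksum worked out by hand, rather than copied from add.hex.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode the data bytes of one Intel hex data record into little-endian
   16-bit words. Returns the number of words stored in out[], or -1 if the
   record is malformed, is not a data record, or fails its checksum. */
int ihex_halfwords(const char *rec, uint16_t *out, int max)
{
    unsigned count, addr, type, byte, sum;
    uint8_t data[255];

    if (sscanf(rec, ":%2x%4x%2x", &count, &addr, &type) != 3 || type != 0)
        return -1;
    /* 9 header characters, 2 per data byte, 2 for the checksum. */
    if (strlen(rec) < 11 + 2 * (size_t) count)
        return -1;
    sum = count + (addr >> 8) + (addr & 0xff) + type;

    /* Read the data bytes and then the checksum byte. */
    for (unsigned i = 0; i <= count; i++) {
        if (sscanf(rec + 9 + 2 * i, "%2x", &byte) != 1)
            return -1;
        if (i < count)
            data[i] = (uint8_t) byte;
        sum += byte;
    }
    if ((sum & 0xff) != 0)      /* all bytes must add up to 0 mod 256 */
        return -1;

    int n = 0;
    for (unsigned i = 0; i + 1 < count && n < max; i += 2)
        out[n++] = (uint16_t) (data[i] | (data[i + 1] << 8));
    return n;
}
```

Given the record :1000C00040187047074B1B68002B03D1064A136882, the function stores eight words, the first two being 0x1840 and 0x4770.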


What is the file /usr/lib/gcc/arm-none-eabi/5.4.1/armv6-m/libgcc.a that is mentioned when a program is linked?

It's a library archive (extension .a) containing subroutines that may be called from code compiled with gcc. For example, the program in Lab 1 contains a C subroutine (xtoa in lib.c) for converting numbers from binary to decimal or hexadecimal for printing, and that subroutine contains the statement x = x / base. Since the chip has no divide instruction, gcc translates the statement into a call to a subroutine with the obscure name __aeabi_uidiv, and that subroutine is provided in the library archive. Exercise 1.6 asks you to write a similar subroutine for yourself.
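
To show where the division comes from, here is a sketch of digit conversion in the style the text describes. It is not the xtoa from lib.c: the name utoa_sketch, its parameters and its buffer handling are assumptions, and it assumes that unsigned values have at most 32 bits. On a chip without a divide instruction, the / and % operations below are the kind of thing the compiler has to turn into calls to library routines.

```c
/* Convert x to a string of digits in the given base (2 to 16).
   'buf' must have room for 33 characters (32 binary digits plus NUL).
   Returns a pointer to the first digit, somewhere inside buf. */
char *utoa_sketch(unsigned x, unsigned base, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    char *p = buf + 32;

    *p = '\0';
    do {
        *--p = digits[x % base];    /* least significant digit first */
        x = x / base;               /* the division mentioned above */
    } while (x != 0);
    return p;
}
```

For instance, with a 33-character buffer, converting 255 in base 16 yields the string ff, and converting 1234 in base 10 yields 1234.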

If you're prepared to use that integer division subroutine from the gcc library, why not use the library version of printf too, instead of writing your own?

Using the library version of printf pulls in a lot of other stuff – for example, since the standard printf can format floating-point numbers, including it also drags in lots of subroutines that implement floating-point arithmetic in software. That's OK, but (a) I wanted to keep our programs small for simplicity, and (b) although we are not likely to fill up the 256kB of code space on the nRF51822, on other embedded platforms it's wise to keep a close eye on the amount of library code that is included in the program but not really used. Added to that, the version of printf in the standard library can call malloc, and we don't want that.

Lecture 3

Address space. A numbering system for memory locations. ARM-based microcontrollers (like most bigger machines) have a single address space containing both code and data. Some other microcontroller families have separate address spaces for code and data, in what is called a Harvard architecture.

ROM (Read-Only Memory). A form of storage whose contents are non-volatile (are not lost when the power is off) but cannot be changed under program control. Modern ROM is usually EEPROM – Electrically Erasable Programmable Read Only Memory, and can be changed electrically, and even under control of a program running on the microcontroller, but using special peripheral registers and not the normal store instructions. Flash memory is a modern, super-compact implementation of EEPROM, but for our purposes it does exactly the same job. We will modify the contents of the micro:bit's flash memory by downloading programs, but we will probably not be writing programs that change the contents of the flash memory.

UART (Universal Asynchronous Receiver/Transmitter). A peripheral interface that is able to send and receive characters serially, commonly used in the past for communication between a computer and an attached terminal. It is commonly used in duplex mode, with the transmitter of one device connected to the receiver of the other with one wire, and the receiver of the one connected to the transmitter of the other with a different wire. The asynchronous part of the name refers to the fact that the transmitter and receiver on each wire do not share a common clock, but rely instead on the signalling protocol and precise timing to achieve synchronisation.

GPIO (General-Purpose Input/Output). A peripheral interface that provides direct access to pins of the microcontroller chip. Pins may be configured as inputs or outputs, and interrupts may be associated with state changes on certain input pins. On the micro:bit, the LEDs and pushbuttons are connected to GPIO pins.

Linker script. A text, written in a specialised language, that describes the layout in memory of a program. Compilers typically divide their output into four named sections: text for the program code and embedded constants, data for statically allocated data that has a specified initial value other than zero, rodata for initialised data that is constant, and bss for data that is statically allocated but can initially be filled with zeroes. The linker script may add another section for the program's stack. For microcontrollers, a linker script is needed that puts the text and rodata in Flash, and lays out the RAM so that stack and bss are in separate areas, with the data copied into its own part of RAM from an image held in Flash.
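
As a rough guide to which C definitions end up in which section, consider the sketch below. The variable and function names are invented, and the placement noted in the comments is what compilers typically do rather than a guarantee; the linker's map file shows what actually happened in a given build.

```c
/* Typically placed in data: statically allocated, non-zero initial value,
   so an image of it is kept in Flash and copied to RAM at startup. */
int counter = 42;

/* Typically placed in bss: statically allocated, initially all zeroes. */
int buffer[100];

/* Typically placed in rodata: initialised and constant, can stay in Flash. */
const char greeting[] = "hello";

/* The code of a function goes in text. Its local variable 'total' lives in
   a stack frame (or a register), not in any of the sections above. */
int sum_buffer(void)
{
    int total = counter;
    for (int i = 0; i < 100; i++)
        total += buffer[i];
    return total;
}
```

Before anything is stored in buffer, sum_buffer() returns 42, which relies on both the copying of data and the zeroing of bss having been done before the program proper starts.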

Thumb code. An alternative instruction encoding for the ARM in which each instruction is encoded in 16 rather than 32 bits. The advantage is compact code, the disadvantage that only a selection of instructions can be encoded, and only the first 8 registers are easily accessible. In Cortex-M microcontrollers, the Thumb encoding is the only one provided.

Assembly language. A symbolic representation of the machine code for a program.