Lecture 8 – Introducing I/O (Digital Systems)

Copyright © 2024 J. M. Spivey
We now move beyond the core of the processor (designed by ARM) to look at I/O devices (designed by Nordic Semiconductor and documented in their separate datasheet). The simplest I/O device is called GPIO (General Purpose Input/Output) and allows 32 pins of the chip to be individually configured either as outputs, producing a voltage under control of the program, or as inputs, so that the program can sense whether an externally applied voltage on the pin is high or low. On the micro:bit, we can put the GPIO pins to immediate use because some of them are connected to 25 LEDs and two push-buttons.

Basics of LEDs

[8.1] LEDs, like most semiconductor diodes, conduct in only one direction, shown by the arrow on the circuit symbol. Thus, the circuit on the right will produce no light, whereas the circuit on the left does light the LED. It is necessary to connect a resistor in series with the LED in order to control the current. For a 3V supply, about 2V will be dropped across the LED, the remaining 1V will be dropped across the resistor, and a resistor of 220Ω will give a sensible current of 4.5mA through resistor and LED. It doesn't matter on which side of the LED the resistor is connected, so long as it is there.
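The figure of 4.5mA is just Ohm's law applied to the resistor alone. As a minimal sketch (the function name led_current_ma is invented here, and is not part of the course code), the calculation looks like this in C:

```c
/* led_current_ma -- current in mA through an LED and its series resistor.
   Illustrative helper only. */
double led_current_ma(double supply_v, double led_drop_v,
                      double resistor_ohms) {
    /* The resistor sees whatever voltage the LED does not drop. */
    return (supply_v - led_drop_v) / resistor_ohms * 1000.0;
}
```

With the figures from the text, led_current_ma(3.0, 2.0, 220.0) comes to a little over 4.5.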

A matrix of LEDs

[8.2] A single GPIO pin can control an LED connected (with a series resistor) between the pin and ground: this is useful as a debugging aid in many prototypes. Alternatively, an LED and resistor could be connected between two GPIO pins A and B, and would light only if A is high and B is low. This is a useless idea, until we see that multiple LEDs can be connected in this way, as in the picture at the right. With these connections, the fifth LED from the left will light if B is high and Y is low; we can prevent the other LEDs from lighting by setting A low and X and Z high. In fact we can light any single LED by setting the pins appropriately, and we can light one group of three LEDs or the other in any pattern by setting either A or B high, and choosing to set X, Y and Z low in a pattern that corresponds to the LEDs we want to light. There are three series resistors at X, Y and Z to control the current through individual LEDs in each group of three. To show a pattern on all six LEDs, we will have to show the two groups alternately, changing between them quickly enough that the flickering is invisible. This "matrix" of six LEDs is the smallest that gives any saving over wiring the LEDs to individual I/O pins, since it uses five pins rather than six.
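To experiment with the six-LED matrix without any hardware, here is a small model in C. It is a sketch under an assumed wiring that is consistent with the description above but not taken from the picture: LEDs 1 to 3 have their anodes on A and LEDs 4 to 6 on B, and within each group the cathodes go to X, Y and Z in that order. The function name lit_leds is invented for the purpose.

```c
/* lit_leds -- which of the six LEDs light for given pin levels (1 = high).
   Assumed wiring: LEDs 1-3 on anode line A, LEDs 4-6 on anode line B,
   cathodes to X, Y, Z in order within each group.
   Bit i of the result is set if LED i+1 is lit. */
unsigned lit_leds(int a, int b, int x, int y, int z) {
    int anode[2] = { a, b };
    int cathode[3] = { x, y, z };
    unsigned lit = 0;
    for (int g = 0; g < 2; g++) {
        for (int k = 0; k < 3; k++) {
            /* An LED conducts only with its anode high and cathode low. */
            if (anode[g] && !cathode[k])
                lit |= 1u << (3*g + k);
        }
    }
    return lit;
}
```

Under that wiring, lit_leds(0, 1, 1, 0, 1) has only bit 4 set, meaning the fifth LED alone is lit, while setting both A and B high lights the same columns in both groups at once, which is why the two groups must be shown alternately.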

[8.3] On the micro:bit, though they are arranged physically in a 5x5 array, the LEDs are wired as three 'rows' of nine LEDs each (with two missing), with a documented correspondence to the physical layout.

Schematic of LEDs and buttons

Bits 4–12 of the GPIO register are used for the column lines, and bits 13–15 for the row lines.

0000|0000|0000|0000|R3 R2 R1 C9|C8 C7 C6 C5|C4 C3 C2 C1|0000

Context

Ad-hoc matrices of LEDs like the one on the micro:bit seem a bit amateurish (though fun). But the idea of LEDs addressed as a matrix is very common, because the 7-segment LED displays seen everywhere are usually built in this way. Each individual digit has seven (or eight with the decimal point) individual anodes and a common cathode. In a multidigit display, we can connect together corresponding anodes from each digit, and connect these through series resistors to eight GPIO pins. The common cathode for each display also gets its own GPIO pin, so the total number of pins needed is 8 plus the number of digits. In cheapo designs, the display can be multiplexed (as on the micro:bit) by bit-banging in software. But there are also special-purpose LED driver chips like the HT16K33 that do the same job in hardware, reducing the load on the processor.

Device registers

[8.4] On the ARM, I/O devices appear as special storage locations (called device registers) in the same address space as RAM and Flash. Loading from or storing to one of these locations gives information about the external world, or has a side-effect that is externally visible. For example, there are two storage locations, one at address 0x50000514 that we shall call GPIO_DIR, and another at address 0x50000504 that we shall call GPIO_OUT. Storing a bit pattern in GPIO_DIR lets us configure the individual pins as inputs or outputs, and storing to GPIO_OUT sets the voltage on each output pin to high (if 1) or low (if 0). We can set these locations like any other, using an str instruction with a specific address; so to set the GPIO_DIR register to the constant 0xfff0 (thereby configuring 12 of the GPIO pins as outputs), we can use the code

ldr r0, =0x50000514
ldr r1, =0xfff0
str r1, [r0]

In a C program, things are a bit easier. There's a header file hardware.h that contains definitions of all the device registers we shall use in the course; including it allows us to use GPIO_DIR in a program to denote the device register with address 0x50000514, and we can store the constant 0xfff0 in that register just by writing

GPIO_DIR = 0xfff0;

The compiler translates this into exactly the code shown earlier. You need not delve into the details except to satisfy your curiosity.

The GPIO_DIR register has a row of latches that remember whether each GPIO pin is configured as an input (0) or an output (1), and the GPIO_OUT register has a row of latches that remember the output voltage that has been selected for each pin, either low (0) or high (1). Internally, the same electronic signals that for an ordinary RAM location would cause the RAM to update the contents of a word in this case set the latches and therefore the output directions or values. (On other architectures such as the x86, I/O devices usually do not appear as storage locations accessed with the usual load and store instructions, but there are separate in and out instructions, and effectively a separate address space.) To use the LEDs, we first need to configure the GPIO port so that the relevant bits are outputs: this is achieved with the assignment GPIO_DIR = 0xfff0.
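The constant 0xfff0 need not be taken on trust: it follows from the bit ranges given earlier (columns on bits 4 to 12, rows on bits 13 to 15). A minimal sketch, with macro names invented for the purpose rather than taken from hardware.h:

```c
/* BITS(lo, hi) -- mask with bits lo..hi set (inclusive).
   All four names are hypothetical, for illustration only. */
#define BITS(lo, hi)  (((1u << ((hi) - (lo) + 1)) - 1) << (lo))

#define COL_PINS  BITS(4, 12)             /* C1..C9: 0x1ff0 */
#define ROW_PINS  BITS(13, 15)            /* R1..R3: 0xe000 */
#define LED_PINS  (COL_PINS | ROW_PINS)   /* all LED lines: 0xfff0 */
```

With these definitions the configuration step could equally be written GPIO_DIR = LED_PINS.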

To light a single LED, we should connect its row to +3.3V and its column to 0V. Then current can flow from the row pin, through the LED and its series resistor, to the column pin, and the LED will light up. The series resistor limits the current that can flow through the LED to a value that is safe both for the LED and the microcontroller pins that are connected to it. If we set the other row lines, apart from the one belonging to our chosen LED, to 0V and the other column lines to +3.3V, then other LEDs in the same row or column as the lit LED will have their anode and cathode at the same potential, and will carry no current. LEDs that are not in the same row or column will have their anode at 0V and their cathode at +3.3V, so they will be reverse biased, and they also will carry no current, and only the one chosen LED will light. So to light the middle LED in the 5x5 array, which is electrically in row 2 and column 3, we set GPIO_OUT like this:

...   |R3 R2 R1 C9 |C8 C7 C6 C5 |C4 C3 C2 C1 |
...00 | 0  1  0  1 | 1  1  1  1 | 1  0  1  1 | 0000
             5           f             b         0

The needed assignment is GPIO_OUT = 0x5fb0.
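The same reasoning works for any LED, given its electrical row and column. Here is a sketch of a helper that computes the value; the name led_word is made up for this example and is not part of the course code.

```c
/* led_word -- value for GPIO_OUT that lights one LED, given its
   electrical row (1..3) and column (1..9).  Hypothetical helper. */
unsigned led_word(int row, int col) {
    unsigned all_cols = 0x1ff0;             /* C1..C9 on bits 4..12 */
    unsigned row_bit = 1u << (12 + row);    /* R1..R3 on bits 13..15 */
    unsigned col_bit = 1u << (3 + col);
    /* Chosen row high, chosen column low, every other column high. */
    return row_bit | (all_cols & ~col_bit);
}
```

For the middle LED, led_word(2, 3) gives 0x5fb0, agreeing with the diagram.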

Multiplexing

[8.5] We could light all 25 LEDs at once by setting all the row lines high and all the column lines low, giving the value 0xe000. Each individual LED would be a bit dimmer, because the current available in each column (set by the 220Ω series resistor) would be shared among three LEDs. For most patterns, however, we will need to multiplex the three rows, lighting the correct LEDs in each row in turn, and pausing a bit before moving on to the next row. If the pauses are short enough, persistence of vision will make it seem that all the LEDs making up the pattern are lit together. For example, a heart pattern

. X . X .
X X X X X
X X X X X
. X X X .
. . X . .

is made by lighting LEDs 5, 6, 7, 9 in row 1, LEDs 1, 2, 3, 4, 5 in row 2, and LEDs 1, 4, 5, 6, 7, 8, 9 in row 3, so we want to use the bit patterns

0010 1000 1111 0000 = 0x28f0
0101 1110 0000 0000 = 0x5e00
1000 0000 0110 0000 = 0x8060

in succession.
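Working out such constants by hand is error-prone, so it may help to see the rule written as code. This is a sketch only; row_word is an invented name, and the convention that bit c-1 of its second argument stands for LED c is an assumption of the example.

```c
/* row_word -- GPIO_OUT value for one electrical row (1..3).
   Bit c-1 of 'lit' is set if LED c of that row should light.
   Hypothetical helper for illustration. */
unsigned row_word(int row, unsigned lit) {
    /* Lit LEDs need a low column line, so invert the nine bits. */
    unsigned cols = (~lit & 0x1ff) << 4;
    return (1u << (12 + row)) | cols;
}
```

For the heart, LEDs 5, 6, 7, 9 in row 1 correspond to lit = 0x170, and row_word(1, 0x170) yields 0x28f0; likewise row_word(2, 0x01f) is 0x5e00 and row_word(3, 0x1f9) is 0x8060.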

[8.6] The code might be

while (1) {
    GPIO_OUT = 0x28f0;
    delay(JIFFY);
    GPIO_OUT = 0x5e00;
    delay(JIFFY);
    GPIO_OUT = 0x8060;
    delay(JIFFY);
}

A suitable value of JIFFY would be 5000, for a delay of 5 milliseconds expressed in units of 1 microsecond. Then an iteration of the loop will take just about 15 milliseconds, or about 66 frames per second, fast enough that no flickering will be visible.
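To see how the refresh rate depends on the choice of JIFFY, the arithmetic can be written as a one-line function. The name frames_per_second is invented for this sketch.

```c
/* frames_per_second -- refresh rate when each of 'rows' rows is shown
   for 'jiffy_usec' microseconds.  Integer division rounds down. */
unsigned frames_per_second(unsigned jiffy_usec, unsigned rows) {
    return 1000000u / (jiffy_usec * rows);
}
```

With a jiffy of 5000 and three rows the result is 66; doubling the jiffy to 10000 would halve it to 33, where flicker is more likely to be noticed.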

[8.7] Displaying a single pattern is all very well, but if we want to display different patterns at different times, then we should not write separate code (with embedded hexadecimal constants) for each pattern, but rather represent the patterns as data, an array of three integers. We can create patterns by declaring initialised arrays like this:

static const unsigned heart[] = {
    0x28f0, 0x5e00, 0x8060
};

The static makes this a variable visible only from the file of code where it is declared, and the const makes it read-only, so that the C compiler and linker can put it in ROM instead of using (three words of) precious RAM space.

Then we can write a function like this that displays an arbitrary pattern, repeating the three rows n times:

/* show -- display three rows of a picture n times */
void show(const unsigned *img, int n) {
    while (n > 0) {
        // Takes 15msec per iteration
        for (int p = 0; p < 3; p++) {
            GPIO_OUT = img[p];
            delay(JIFFY);
        }
        n--;
    }
}
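The logic of show can be tried out on a desktop machine by replacing the hardware with stand-ins. In the sketch below, sim_out takes the place of GPIO_OUT and sim_delay takes the place of delay; these names, and sim_show itself, are invented for the purpose, and the value of JIFFY is the one suggested above.

```c
#define JIFFY 5000              /* microseconds per row, as in the text */

static unsigned sim_out;        /* last value "written to the pins" */
static unsigned sim_elapsed;    /* simulated time in microseconds */

/* sim_delay -- stand-in for delay: just accumulates the time */
void sim_delay(unsigned usec) {
    sim_elapsed += usec;
}

/* sim_show -- same logic as show, using the stand-ins */
void sim_show(const unsigned *img, int n) {
    while (n > 0) {
        for (int p = 0; p < 3; p++) {
            sim_out = img[p];
            sim_delay(JIFFY);
        }
        n--;
    }
}
```

Calling sim_show with the heart pattern and n = 67 advances sim_elapsed by 1005000, so about 67 repetitions keep a picture on the display for one second.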

To make things simpler for the application programmer, we could also think about automating the process of deriving the hexadecimal constants, so that another pattern could be defined like this:

static const unsigned hollow[] =
    IMAGE(0,1,0,1,0,
          1,0,1,0,1,
          1,0,0,0,1,
          0,1,0,1,0,
          0,0,1,0,0);

I'll leave that as something for C experts to investigate!

What about those delays? The easiest way of implementing them is to write a loop that counts down from some constant, creating a predictable delay. With a bit of care, we can write a loop that takes 500ns per iteration, so to delay for d microseconds we need to make it iterate 2d times.

void delay(unsigned usec) {
    unsigned n = 2*usec;
    while (n > 0) {
        nop(); nop(); nop();
        n--;
    }
}

Looking at the compiler output (and timing it with a scope) reveals that the loop takes 5 cycles per iteration without the nop instructions, so adding three of them brings it up to 8 cycles per iteration, or 500nsec at 16MHz. Delays were commonly implemented like this in old-fashioned MS-DOS games, and the games became unplayable when they were moved to a machine with a faster clock than the original PC. The same thing might happen to us if next year's micro:bit has a faster clock, or if improvements in the C compiler were to affect its translation of the loop.
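The factor of 2 in delay is really a consequence of the clock speed and the cycle count, as the following sketch makes explicit. The function loop_count is hypothetical, and the 64MHz figure mentioned afterwards is only an example of a faster clock, not a claim about any particular board.

```c
/* loop_count -- iterations needed for a delay of 'usec' microseconds
   when each iteration costs 'cycles' clock cycles at 'mhz' MHz.
   Hypothetical helper; delay() hard-codes the result as 2*usec. */
unsigned loop_count(unsigned usec, unsigned mhz, unsigned cycles) {
    return usec * mhz / cycles;
}
```

At 16MHz and 8 cycles per iteration, loop_count(5000, 16, 8) is 10000, which is 2*usec as before. At 64MHz the same delay would need 40000 iterations, so a loop that still counted to 10000 would finish four times too soon.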

A more serious problem with delay loops is that they force the machine to do nothing useful during the delay. That is a problem we will solve later – by using a hardware timer in place of the delay loop, then making the program interrupt controlled, and ultimately by introducing an operating system that is able to schedule other processes to run while waiting for the timer to fire.

The push-buttons on the micro:bit are connected to other GPIO pins that can be configured as inputs. For various reasons, the buttons are wired with pull-up resistors as shown in the schematic above, so that the input signal to the chip is +3.3V if the button is not pushed (a logical 1), and drops to 0V (a logical 0) when it is pushed. Lab 2 begins with a program (an electronic Valentine's card) that shows a beating heart pattern, and asks you to enhance it so that pressing the buttons changes the pattern shown on the display.

Pushbuttons

Pushbutton with pullup resistor

The micro:bit has two 'user' pushbuttons on the front of the board (in addition to the reset button on the back). Each is connected to a microcontroller pin with a pull-up resistor, as shown in the circuit diagram at the right, so that the pin sees a voltage of +3.3V when the button is released, and 0V when it is pressed.

The two pins that are connected to buttons are identified by the constants BUTTON_A and BUTTON_B defined in hardware.h. In order to use the buttons in a program, we must first configure those pins so they are connected as inputs. There's an array of I/O registers called GPIO_PINCNF, with one element to control the configuration of each GPIO pin. To connect a pin for input, we must set the GPIO_PINCNF_INPUT field of the relevant element to the value GPIO_INPUT_Connect, like this:

SET_FIELD(GPIO_PINCNF[BUTTON_A], GPIO_PINCNF_INPUT, GPIO_INPUT_Connect);

Since the power-up values of all other fields in these registers are zero, and the value of GPIO_INPUT_Connect is also zero, it's more common to do this job with the simpler assignment

GPIO_PINCNF[BUTTON_A] = 0;

Once this configuration has been done (for BUTTON_B also), the program can discover the state of the buttons by reading the register GPIO_IN and looking at the relevant bits, like this:

unsigned x = GPIO_IN;
if (GET_BIT(x, BUTTON_A) == 0 || GET_BIT(x, BUTTON_B) == 0) {
    // A button is pressed
}

You can easily design variants of this code for yourself, for example to do different things depending on which button is pressed.
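One possible variant separates the decoding from the reading of GPIO_IN, so that it can be tested away from the hardware. This is a sketch: the pin numbers 17 and 26 are assumed for the example (a real program should use BUTTON_A and BUTTON_B from hardware.h), and the names buttons, PRESS_A and so on are invented.

```c
/* Pin numbers assumed for this sketch only. */
enum { PIN_A = 17, PIN_B = 26 };

enum { NONE = 0, PRESS_A = 1, PRESS_B = 2, PRESS_BOTH = 3 };

/* buttons -- decode a value read from GPIO_IN.  The inputs are active
   low: a 0 bit means that the button is pressed. */
int buttons(unsigned in) {
    int a = ((in >> PIN_A) & 1u) == 0;
    int b = ((in >> PIN_B) & 1u) == 0;
    return (a ? PRESS_A : 0) | (b ? PRESS_B : 0);
}
```

A program could then write switch (buttons(GPIO_IN)) with one case for each of the four results.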

Physical pushbuttons tend to suffer from the problem that the signal changes state several times over a period of several milliseconds after the button is pressed or released, a phenomenon called contact bounce. If software is going to treat a button press as an event rather than a state (such as counting the number of button presses), then it's wise to implement a scheme where contact bounce is filtered out, such as not reacting to a press until the button has produced the same value in several samples taken several milliseconds apart.
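One way of filtering out contact bounce is sketched below: a press is reported only after the raw input has given the same value in several consecutive samples. The threshold of 4 samples, the structure layout and all the names are assumptions of this example; the samples are expected to be taken a few milliseconds apart, with 1 meaning "pressed" after allowing for the active-low input.

```c
#define STABLE_SAMPLES 4   /* assumed threshold; tune to the sampling rate */

struct debounce {
    int stable;      /* last confirmed state, 1 = pressed */
    int candidate;   /* most recent raw sample */
    int count;       /* consecutive samples equal to candidate */
};

/* debounce_update -- feed one raw sample; returns 1 exactly once per
   confirmed press, and 0 otherwise */
int debounce_update(struct debounce *d, int raw) {
    if (raw != d->candidate) {
        d->candidate = raw;       /* signal changed: start counting again */
        d->count = 1;
        return 0;
    }
    if (d->count < STABLE_SAMPLES) d->count++;
    if (d->count == STABLE_SAMPLES && d->stable != d->candidate) {
        d->stable = d->candidate;
        return d->stable;         /* 1 for a press, 0 for a release */
    }
    return 0;
}

/* count_presses -- number of press events in a sequence of raw samples */
int count_presses(const int *raw, int n) {
    struct debounce d = { 0, 0, 0 };
    int presses = 0;
    for (int i = 0; i < n; i++)
        presses += debounce_update(&d, raw[i]);
    return presses;
}
```

A noisy sequence such as 1,0,1,0,1,1,1,1 followed by a clean release is counted as a single press, because the early alternations never last for four samples.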

Questions

What does the assignment statement GPIO_OUT = 0x4000 look like in assembly language?

GPIO_OUT is a macro defined in hardware.h as (* (volatile unsigned *) 0x50000504). This definition exploits the weak type system of C, and denotes an unsigned variable at the numeric address 0x50000504, a constant obtained from the nRF51822 datasheet. The volatile keyword amounts to saying to the C compiler, "Please just do any assignments immediately: don't try to be clever and optimise in any way, such as combining one assignment with a later one that targets the same address."
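The idiom can be tried on a desktop machine, where address 0x50000504 is not available, by pointing the same kind of macro at an ordinary variable. In this sketch REG, fake_reg, SIM_OUT and store_row2 are all invented names; only the shape of the cast follows the definition quoted above.

```c
/* REG -- treat an address as a volatile unsigned variable, in the
   style of the hardware.h definition. */
#define REG(addr)  (* (volatile unsigned *) (addr))

static unsigned fake_reg;          /* stands in for the device register */
#define SIM_OUT  REG(&fake_reg)

/* store_row2 -- an assignment through the macro, written exactly as it
   would be for the real GPIO_OUT */
void store_row2(void) {
    SIM_OUT = 0x4000;
}
```

After a call to store_row2, the variable fake_reg holds 0x4000, just as the latches of the real register would.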

To achieve the same effect in assembly language, we need an str instruction for the assignment. But both the address being assigned to and the value being assigned are large constants, so we'll need to put the constants in registers first, using the pc-relative load instructions for which "ldr =" is a shorthand. So suitable code is

ldr r0, =0x4000
ldr r1, =0x50000504
str r0, [r1]

Alternatively, we might put the address of the whole GPIO device in a register and address the individual device registers at small offsets from that base address.

ldr r0, =0x4000
ldr r1, =0x50000500
str r0, [r1, #0x4]

How does the IMAGE macro work?

The declaration

const unsigned heart[] =
    IMAGE(0,1,0,1,0,
          1,1,1,1,1,
          1,1,1,1,1,
          0,1,1,1,0,
          0,0,1,0,0);

defines heart to be an array of constants that represent the heart image. It is reduced to a simple array declaration by a process of textual substitution (performed by the C preprocessor), followed by simplification (performed by the C compiler proper). Let's follow the process with the example. When compiling for V1, the IMAGE macro is defined like this:

#define IMAGE(x11, x24, x12, x25, x13,                               \
              x34, x35, x36, x37, x38,                               \
              x22, x19, x23, x39, x21,                               \
              x18, x17, x16, x15, x14,                               \
              x33, x27, x31, x26, x32)                               \
    { __ROW(ROW1, x11, x12, x13, x14, x15, x16, x17, x18, x19),      \
      __ROW(ROW2, x21, x22, x23, x24, x25, x26, x27, 0, 0),          \
      __ROW(ROW3, x31, x32, x33, x34, x35, x36, x37, x38, x39) }

See how the documented wiring pattern is reflected in the order that the 25 arguments to the macro are listed? So as a first step, the declaration of heart is expanded into

const unsigned heart[] =
    { __ROW(ROW1, 0, 0, 0, 0, 1, 1, 1, 0, 1),
      __ROW(ROW2, 1, 1, 1, 1, 1, 0, 0, 0, 0),
      __ROW(ROW3, 1, 0, 0, 1, 1, 1, 1, 1, 1) };

But __ROW is itself a macro, defined by

#define __ROW(r, c1, c2, c3, c4, c5, c6, c7, c8, c9)                 \
    (BIT(r) | !c1<<4 | !c2<<5 | !c3<<6 | !c4<<7 | !c5<<8             \
     | !c6<<9 | !c7<<10 | !c8<<11 | !c9<<12)

(and ROW1, ROW2, ROW3 and BIT are also macros), so the preprocessor ultimately expands the declaration into

const unsigned heart[] =
    { ((1 << (13)) | !0<<4 | !0<<5 | !0<<6 | !0<<7 | !1<<8 | !1<<9 | !1<<10 | !0<<11 | !1<<12),
      ((1 << (14)) | !1<<4 | !1<<5 | !1<<6 | !1<<7 | !1<<8 | !0<<9 | !0<<10 | !0<<11 | !0<<12), 
      ((1 << (15)) | !1<<4 | !0<<5 | !0<<6 | !1<<7 | !1<<8 | !1<<9 | !1<<10 | !1<<11 | !1<<12) };

Because the macro expansion process is purely textual, it's possible to have a macro that expands into an arbitrary piece of program text, not just a single expression. In this case, the IMAGE macro expands into a curly-bracketed list of expressions that form the initialiser in an array declaration.

When the compiler itself encounters this expanded declaration, it starts to simplify it. It's a rule of C that an expression with a constant value is allowed wherever a simple constant may appear, so we may be sure that the C compiler will do its best. First, the ! operator maps 0 to 1 and 1 to 0, so that the bits have the right polarity. Next, the << operator shifts each bit to the right place, and the | operator combines the bits into a single word. There are a few extra parentheses in the expression that have been inserted by various macros for safety, and these are just ignored by the compiler as they should be. The upshot is that the compiler treats the declaration as if it had been written

const unsigned heart[] = {
    0x28f0, 0x5e00, 0x8060
};

but with a lot less effort on the part of the programmer. The calculation of these bit patterns happens entirely at 'compile time', while the compiler is deciding what binary data will make up the code that it outputs. The compiler reduces the input expression to a simple table, and no calculations of bit patterns happen at 'run time', while the program is running.
