Lecture 24 – Three instructions (Digital Systems)
|0–2||0||0||0||Op||Imm5||Rn||Rd||Move shifted register|
|8b||0||1||0||0||0||1||Op||D||RHn||Rd||Hi register ops, branches|
|10–11||0||1||0||1||L||0||0||Rm||Rn||Rd||Load/store with reg offset|
|12–13||0||1||1||0||L||Imm5||Rn||Rd||Load/store with immed offset|
|22||1||0||1||1||0||0||0||0||S||Imm7||Add offset to SP|
|30–31||1||1||1||1||H||Disp11||Long branch and link|
ARM documentation is inconsistent about the naming of register fields, but here we give a different name to each field in the 16-bit instruction that might name a register:
- Rd = instr<2:0>
- Rn = instr<5:3>
- Rm = instr<8:6>
- Rt = instr<10:8>
- RHn = instr<6:3>
- RHd = instr<7>:instr<2:0>
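The field definitions above translate directly into shifts and masks. The following Python sketch is only an illustration: the helper names `bits` and `register_fields` are invented here and are not part of the lecture's simulator.

```python
def bits(instr: int, hi: int, lo: int) -> int:
    """Return the field instr<hi:lo> of a 16-bit instruction word."""
    width = hi - lo + 1
    return (instr >> lo) & ((1 << width) - 1)


def register_fields(instr: int) -> dict:
    """Extract every field that might name a register (names as in the list above)."""
    return {
        "Rd": bits(instr, 2, 0),
        "Rn": bits(instr, 5, 3),
        "Rm": bits(instr, 8, 6),
        "Rt": bits(instr, 10, 8),
        "RHn": bits(instr, 6, 3),
        # RHd concatenates bit 7 (as the high-order bit) with bits 2:0
        "RHd": (bits(instr, 7, 7) << 3) | bits(instr, 2, 0),
    }


# 0x18D1 is the bit pattern 00011 00 011 010 001
fields = register_fields(0x18D1)
print(fields)
```

Every field is extracted whether or not the instruction uses it; which ones matter depends on the instruction format.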
Implementing three instructions
adds r1, r2, r3
This instruction adds together the contents of registers r2 and r3, and puts the result in r1, setting the NZCV flags from the result. According to the architecture manual, the machine instruction consists of a five-bit opcode, two further fixed bits, and three 3-bit register fields Rm, Rn and Rd, in that order. Rd is the destination register, Rn is the first register operand, and Rm is the second. So the example instruction adds r1, r2, r3 in machine code is
00011 00 011 010 001
I've separated the five-bit opcode 00011, which will be the index 3 for the instruction in the main decoding ROM. Comparing the remaining two fixed bits in the encoding with those for other instructions like "subtract register" and "add immediate", we can see that one of these bits selects add rather than subtract, and the other denotes the fact that the second operand comes from a register rather than an immediate field of the instruction. We will deal with these aspects later.
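As a check on the encoding, here is a small Python sketch that assembles the instruction from its fields and then pulls out the opcode and the two fixed bits. The function name `encode_adds_reg` is made up for this example.

```python
def encode_adds_reg(rd: int, rn: int, rm: int) -> int:
    """Assemble 'adds rd, rn, rm': opcode 00011, with bits 10 and 9 both 0."""
    assert all(0 <= r <= 7 for r in (rd, rn, rm)), "low registers only"
    return (0b00011 << 11) | (rm << 6) | (rn << 3) | rd


instr = encode_adds_reg(1, 2, 3)   # adds r1, r2, r3
opcode = instr >> 11               # top five bits: index into the decoding ROM
imm_bit = (instr >> 10) & 1        # 0 means the second operand is a register
sub_bit = (instr >> 9) & 1         # 0 means add rather than subtract
print(f"{instr:016b}", opcode, imm_bit, sub_bit)
```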
The first item in the order of business is to select the correct three registers for reading and writing by establishing values for the control signals cRegSelA, cRegSelB and cRegSelC. These are set to values that denote the three fields [5:3], [8:6] and [2:0] of the instruction.
cRegSelA = Rn
cRegSelB = Rm
cRegSelC = Rd
The hardware will read all three of these registers as the values ra, rb and rc, though we will not use the value of rc in the rest of the instruction. It's cheaper to do this than to stop it happening.
Next, the input to the shifter is taken from rb, so we must set cRand2 appropriately. But note that there is another instruction adds i3 that uses an immediate field instead, and the difference between them is bit 10 of the instruction – 0 for the register and 1 for the immediate. There's a value for the cRand2 control signal that selects this rule.
cRand2 = RegB
White lie: the real rule also consults bit 10 of the instruction, and if that bit is 1 it interprets the three-bit field [8:6] as an immediate instead.
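The rule that the white lie glosses over might be modelled as follows. This is a sketch under the assumption that the rule simply switches on bit 10; the function name `rand2_rim3` is invented, borrowing the name RIm3 that appears in the decoding table.

```python
def rand2_rim3(instr: int, rb: int) -> int:
    """Second operand for the add/subtract group.

    Bit 10 clear: use rb, the value read from the register named by [8:6].
    Bit 10 set:   use the three bits [8:6] themselves as an immediate.
    """
    if (instr >> 10) & 1:
        return (instr >> 6) & 0b111
    return rb


reg_form = 0x18D1                  # adds r1, r2, r3
imm_form = reg_form | (1 << 10)    # same bits with bit 10 set: immediate 3
print(rand2_rim3(reg_form, 1234), rand2_rim3(imm_form, 1234))
```

The register is read in both cases; the rule only decides whether its value is used.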
The adds instruction does not use the shifter – or rather it does, but requires the shifter to pass through the register value unchanged. So we set the shifter to do a left shift as a default, and peg the shift amount as zero.
cShiftOp = Lsl
cShiftAmt = Sh0
We want the ALU to add the two register values, but notice that there's another instruction that has a 1 in bit 9 and subtracts instead. So we want a rule for the ALU operation that reflects this.
cAluOp = Add
White lie: bit 9 in the instruction chooses between add and subtract.
The instruction writes the flags, but does not perform a read or write operation on the memory, so that the result of the instruction is the ALU output.
cFlags = True
cMemRd = False
cMemWr = False
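To make the ALU's part concrete, here is a Python sketch of a 32-bit adder that also produces the NZCV flags, together with a rule that chooses add or subtract from bit 9. The function names are invented; `alu_sg9` borrows the name Sg9 from the decoding table, on the assumption that Sg9 denotes exactly this choice. Subtraction is done the usual ARM way, as a + ~b + 1, so the C flag is set when there is no borrow.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(a: int, b: int, carry_in: int):
    """32-bit add of unsigned words a and b, returning (result, (N, Z, C, V))."""
    unsigned_sum = a + b + carry_in
    result = unsigned_sum & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(unsigned_sum > MASK32)
    # Signed overflow: both operands differ in sign from the result
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, (n, z, c, v)


def alu_sg9(instr: int, a: int, b: int):
    """Add if bit 9 is clear, subtract if it is set (a - b = a + ~b + 1)."""
    if (instr >> 9) & 1:
        return add_with_carry(a, ~b & MASK32, 1)
    return add_with_carry(a, b, 0)


print(alu_sg9(0x18D1, 2, 3))   # adds: bit 9 is 0
print(alu_sg9(0x1AD1, 5, 3))   # subs: same fields with bit 9 set
```

The flags are computed on every instruction; cFlags only decides whether they are written.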
Unlike the special instructions for subroutine call, this one does not write the lr register with the address of the next instruction, but it does write the result back into the register selected by cRegSelC.
cLink = N
cRegWrite = Y
(These are values of type Perhaps = Y | N | C because some instructions make the effect conditional on other signals.)
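A three-valued signal like this has to be resolved to a definite yes or no before it can enable a register write. A minimal Python sketch, assuming that C means "only if the instruction's condition holds"; the function `enabled` is hypothetical.

```python
from enum import Enum


class Perhaps(Enum):
    Y = "always"
    N = "never"
    C = "if the condition holds"


def enabled(signal: Perhaps, condition_met: bool) -> bool:
    """Resolve a Perhaps-valued control signal to a definite yes or no."""
    if signal is Perhaps.C:
        return condition_met
    return signal is Perhaps.Y


print(enabled(Perhaps.Y, False), enabled(Perhaps.C, False))
```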
The bundle of control signals is now complete, and we can summarise it in a single row of the decoding table:
--       cRegA                  cShiftOp        cFlags       cLink
--       |    cRegB      cRand2 |    cShiftAmt  |  cMemRd    |  cRegWrite
--       |    |    cRegC |      |    |   cAluOp |  |  cMemWr |  |  mnem
--       |    |    |     |      |    |   |      |  |  |      |  |  |
   3 -> (Rn,  Rm,  Rd,   RIm3,  Lsl, S0, Sg9,   t, f, f,     N, Y, "adds/subs")
str r0, [sp, #48]
This instruction stores the value from r0 into a 4-byte location whose address is at offset 48 from the stack pointer sp. The architecture manual shows that the register r0 is specified using field [10:8] of the instruction, and the offset is an 8-bit immediate field.
The immediate value must be multiplied by 4 before being added to the stack pointer to form the address; we will use the shifter to perform this multiplication, and the ALU to add the result to the stack pointer. The particular instruction str r0, [sp, #48] is encoded as
10010 000 00001100
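The scaling of the offset is easy to get wrong, so here is a Python sketch that assembles the instruction and then recomputes the address the datapath should form. The function names and the stack pointer value used below are invented for the example.

```python
def encode_str_sp(rt: int, offset: int) -> int:
    """Assemble 'str rt, [sp, #offset]' with opcode 10010."""
    assert 0 <= rt <= 7, "low registers only"
    assert offset % 4 == 0 and 0 <= offset <= 1020, "offset must be a multiple of 4, at most 1020"
    return (0b10010 << 11) | (rt << 8) | (offset >> 2)


def store_address(instr: int, sp: int) -> int:
    """Address formed by the datapath: sp + (imm8 << 2)."""
    imm8 = instr & 0xFF
    return (sp + (imm8 << 2)) & 0xFFFFFFFF


instr = encode_str_sp(0, 48)                # str r0, [sp, #48]
addr = store_address(instr, 0x20003FC0)     # an arbitrary example value for sp
print(f"{instr:016b}", hex(addr))
```

The immediate field holds 12, not 48: the shifter restores the two low-order zero bits.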
We'll begin by reading some registers. The plan is to use the stack pointer as the first ALU operand, the offset as the second operand (so we don't care what register is read), and to feed to the data memory the value of the register specified in field [10:8] of the instruction.
cRegSelA = Rsp
cRegSelB = ?
cRegSelC = Rt
We'll replace the ? with Rd, and that will mean that, for this particular instruction, it will be r4 that is uselessly read.
The shifter input is taken from an 8-bit immediate field, and the shifter is set to shift left by 2 bits.
cRand2 = Imm8
cShiftOp = Lsl
cShiftAmt = Sh2
The ALU is set to add.
cAluOp = Add
The instruction doesn't write the flags, doesn't perform a memory read, but does perform a write.
cFlags = False
cMemRd = False
cMemWr = True
It isn't a subroutine call, so doesn't set lr, and as a store instruction it writes no value back into a register.
cLink = N
cRegWrite = N
Here's the resulting line in the decoding table.
--       cRegA                  cShiftOp        cFlags       cLink
--       |    cRegB      cRand2 |    cShiftAmt  |  cMemRd    |  cRegWrite
--       |    |    cRegC |      |    |   cAluOp |  |  cMemWr |  |  mnem
--       |    |    |     |      |    |   |      |  |  |      |  |  |
  18 -> (R13, Rd,  Rt,   Imm8,  Lsl, S2, Add,   f, f, t,     N, N, "str sp")
The third instruction is a conditional branch, bgt, which contains the code for condition gt and a signed 8-bit offset.
The offset is in two's-complement form, needs to be multiplied by 2, and is relative to pc+4. Thus a branch instruction bgt .-4 that branches back to the instruction next but one preceding itself will have an offset of -8 from pc+4, stored in the 8-bit field as -4, and is encoded as
1101 1100 11111100
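The arithmetic of branch offsets can be checked with a short Python sketch that encodes a conditional branch and then recomputes its target from the encoded bits. The function names and the address 0x100 are invented for the example.

```python
def sign_extend8(value: int) -> int:
    """Interpret the low 8 bits of value as a two's-complement number."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def branch_target(instr: int, addr: int) -> int:
    """Target of a conditional branch instruction located at address addr."""
    offset = sign_extend8(instr) << 1      # the field counts halfwords
    return (addr + 4 + offset) & 0xFFFFFFFF


GT = 0b1100                                # condition code for gt


def encode_bcond(cond: int, addr: int, target: int) -> int:
    """Assemble a conditional branch at addr that jumps to target."""
    disp = target - (addr + 4)
    assert disp % 2 == 0 and -256 <= disp <= 254, "target out of range"
    return (0b1101 << 12) | (cond << 8) | ((disp >> 1) & 0xFF)


instr = encode_bcond(GT, 0x100, 0x100 - 4)     # bgt .-4 placed at 0x100
print(f"{instr:016b}", hex(branch_target(instr, 0x100)))
```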
There is nothing we must do to feed the condition to the functional unit that evaluates it, because that unit functions all the time, even if the condition it is being fed is nonsense. What we must do is set up the rest of the datapath to compute the branch target address and conditionally write it to the pc in place of the default value that is the address of the next instruction.
The register file reads out the value of pc+4 for the sake of other instructions that might read the pc, so we need just
cRegSelA = Rpc
cRegSelB = ?
cRegSelC = Rpc
The pc value forms one input to the ALU, and the other is the sign-extended immediate field, shifted left by 1.
cRand2 = SImm8
cShiftOp = Lsl
cShiftAmt = Sh1
The ALU is set to add the offset to the pc value.
cAluOp = Add
Branch instructions do not set the flags, and do not need a read or write cycle from the memory.
cFlags = False
cMemRd = False
cMemWr = False
This branch instruction does not set lr, and it writes the pc only if the condition is satisfied.
cLink = N
cRegWrite = C
Note that the fifth bit of the opcode overlaps with the high-order bit of the condition, so both 26 and 27 are opcodes corresponding to this instruction.
--       cRegA                  cShiftOp        cFlags       cLink
--       |    cRegB      cRand2 |    cShiftAmt  |  cMemRd    |  cRegWrite
--       |    |    cRegC |      |    |   cAluOp |  |  cMemWr |  |  mnem
--       |    |    |     |      |    |   |      |  |  |      |  |  |
  26 -> (R15, Rd,  R15,  SI8,   Lsl, S1, Add,   f, f, f,     N, C, "bcond")
  27 -> (R15, Rd,  R15,  SI8,   Lsl, S1, Add,   f, f, f,     N, C, "bcond")
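Putting the rows together, the decoding step amounts to a table lookup on the top five bits of the instruction. The Python sketch below models just the four rows derived in this lecture; the class `Control` and its field names are invented, and the signal values are kept as plain strings rather than interpreted.

```python
from typing import NamedTuple


class Control(NamedTuple):
    reg_a: str
    reg_b: str
    reg_c: str
    rand2: str
    shift_op: str
    shift_amt: str
    alu_op: str
    flags: bool
    mem_rd: bool
    mem_wr: bool
    link: str
    reg_write: str
    mnem: str


t, f = True, False
BCOND = Control("R15", "Rd", "R15", "SI8", "Lsl", "S1", "Add", f, f, f, "N", "C", "bcond")

# Only the rows worked out above; a five-bit opcode allows up to 32 rows.
DECODE = {
    3: Control("Rn", "Rm", "Rd", "RIm3", "Lsl", "S0", "Sg9", t, f, f, "N", "Y", "adds/subs"),
    18: Control("R13", "Rd", "Rt", "Imm8", "Lsl", "S2", "Add", f, f, t, "N", "N", "str sp"),
    26: BCOND,
    27: BCOND,
}


def decode(instr: int) -> Control:
    """Look up the control signals using the top five bits of the instruction."""
    return DECODE[instr >> 11]


for word in (0x18D1, 0x900C, 0xDCFC):
    print(f"{word:#06x}", decode(word).mnem)
```

Sharing one row object between opcodes 26 and 27 reflects the fact that the low-order opcode bit is really part of the condition field.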
ROM (Read-Only Memory). A form of storage whose contents are non-volatile (are not lost when the power is off) but cannot be changed under program control. Modern ROM is usually EEPROM – Electrically Erasable Programmable Read Only Memory, and can be changed electrically, and even under control of a program running on the microcontroller, but using special peripheral registers and not the normal store instructions. Flash memory is a modern, super-compact implementation of EEPROM, but for our purposes it does exactly the same job. We will modify the contents of the micro:bit's flash memory by downloading programs, but we will probably not be writing programs that change the contents of the flash memory.
Stack pointer. A register sp that holds the address of the most recently occupied word of the subroutine stack. On ARM, as on most recent processors, the subroutine stack grows downwards, so that the sp holds the lowest address of any occupied word on the stack.