Lecture 24 – Three instructions

Copyright © 2017–2023 J. M. Spivey

Decoding instructions

Opcode  Encoding (bit 15 on the left, bit 0 on the right)   Instruction
0–2     000 Op<12:11> Imm5<10:6> Ry<5:3> Rx<2:0>            Move shifted register
3       00011 I<10> Op<9> Rz/Imm3<8:6> Ry<5:3> Rx<2:0>      Add/Subtract
4–7     001 Op<12:11> Rw<10:8> Imm8<7:0>                    Move/Compare/Add/Subtract immediate
8a      010000 Op<9:6> Ry<5:3> Rx<2:0>                      ALU operations
8b      010001 Op<9:8> X<7> Ryy<6:3> Rxx<2:0>               Hi register ops, branches
9       01001 Rw<10:8> Imm8<7:0>                            PC-relative load
10–11   0101 L<11> 00 Rz<8:6> Ry<5:3> Rx<2:0>               Load/store with reg offset
12–13   0110 L<11> Imm5<10:6> Ry<5:3> Rx<2:0>               Load/store with immed offset
18–19   1001 L<11> Rw<10:8> Imm8<7:0>                       SP-relative load/store
20–21   1010 SP<11> Rw<10:8> Imm8<7:0>                      Load address
22      10110000 S<7> Imm7<6:0>                             Add offset to SP
26–27   1101 Cond<11:8> Disp8<7:0>                          Conditional branch
28      11100 Disp11<10:0>                                  Unconditional branch
30–31   1111 H<11> Disp11<10:0>                             Long branch and link

ARM documentation is inconsistent about the naming of register fields, but here we give a different name to each field in the 16-bit instruction that might name a register:

  • Rx = instr<2:0>
  • Ry = instr<5:3>
  • Rz = instr<8:6>
  • Rw = instr<10:8>
  • Ryy = instr<6:3>
  • Rxx = instr<7>:instr<2:0>
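
The field definitions above can be written out as bit operations. The following Python sketch is only an illustration, not part of the simulator described in these notes; the helper names bits and register_fields are hypothetical.

```python
def bits(instr, hi, lo):
    """Return the bit field instr<hi:lo> of a 16-bit instruction."""
    width = hi - lo + 1
    return (instr >> lo) & ((1 << width) - 1)


def register_fields(instr):
    """Extract every field that might name a register, using the names in the text."""
    return {
        "Rx": bits(instr, 2, 0),
        "Ry": bits(instr, 5, 3),
        "Rz": bits(instr, 8, 6),
        "Rw": bits(instr, 10, 8),
        "Ryy": bits(instr, 6, 3),
        # Rxx is bit 7 concatenated with bits 2:0, giving a 4-bit register number
        "Rxx": (bits(instr, 7, 7) << 3) | bits(instr, 2, 0),
    }


fields = register_fields(0b0001100011010001)
print(fields["Rx"], fields["Ry"], fields["Rz"])  # 1 2 3
```

All six fields are computed for every instruction word, whether or not the instruction uses them; the decoder chooses which ones matter.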

Implementing three instructions

adds r1, r2, r3

This instruction adds together the contents of registers r2 and r3, and puts the result in r1, setting the NZCV flags from the result. According to the architecture manual, the format of the machine instruction is as follows:

[Figure: Adds-rrr format.png]

Here Rx is the destination register, Ry is the first register operand, and Rz is the second. So the example instruction adds r1, r2, r3 in machine code is

00011 00 011 010 001

I've separated the five-bit opcode 00011, which will be the index 3 for the instruction in the main decoding ROM. Comparing the remaining two fixed bits in the encoding with those for other instructions like "subtract register" and "add immediate", we can see that one of these bits selects add rather than subtract, and the other denotes the fact that the second operand comes from a register rather than an immediate field of the instruction. We will deal with these aspects later.
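
As a check on the encoding, here is a small Python sketch that assembles the register form of adds and recovers the ROM index from it. The function names encode_adds_rrr and opcode are hypothetical, chosen for this illustration only.

```python
def encode_adds_rrr(rx, ry, rz):
    """Assemble adds rx, ry, rz as 00011 | I=0 | Op=0 | Rz | Ry | Rx."""
    for r in (rx, ry, rz):
        if not 0 <= r <= 7:
            raise ValueError("only r0 to r7 fit in a 3-bit register field")
    return (0b00011 << 11) | (rz << 6) | (ry << 3) | rx


def opcode(instr):
    """The top five bits of the instruction, used as the decoding ROM index."""
    return instr >> 11


instr = encode_adds_rrr(1, 2, 3)
print(format(instr, "016b"))  # 0001100011010001
print(opcode(instr))          # 3
```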

The first item in the order of business is to select the correct three registers for reading and writing by establishing values for the control signals cRegSelA, cRegSelB and cRegSelC. These are set to values that denote the three fields [5:3], [8:6] and [2:0] of the instruction.

cRegSelA = Ry
cRegSelB = Rz
cRegSelC = Rx

The hardware will read all three of these registers as the values ra, rb and rc, though we will not use the value of rc in the rest of the instruction. It's cheaper to do this than to stop it happening.

Next, the input to the shifter is taken from rb, so we must set cRand2 appropriately. But note that there is another instruction adds i3 that uses an immediate field instead, and the difference between them is bit 10 of the instruction – 0 for the register and 1 for the immediate. There's a value for the cRand2 control signal that selects this rule, making the shifter input be the register value or the immediate field according to bit 10 of the instruction.

cRand2 = RImm3
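
The RImm3 rule can be pictured as a two-way multiplexer controlled by bit 10. The Python sketch below is an assumption about how such a rule behaves, written from the description above; the name rand2_rimm3 is hypothetical, and rb stands for the value read via cRegSelB.

```python
def rand2_rimm3(instr, rb):
    """Shifter input under the RImm3 rule: bit 10 of the instruction picks the source."""
    if (instr >> 10) & 1:
        # Immediate form: use the 3-bit field instr<8:6> itself as the operand
        return (instr >> 6) & 0x7
    # Register form: use the value that was read from the register file
    return rb


adds_rrr = 0b0001100011010001  # adds r1, r2, r3
adds_imm = 0b0001110011010001  # same fields, but with bit 10 set
print(rand2_rimm3(adds_rrr, 1000))  # 1000
print(rand2_rimm3(adds_imm, 1000))  # 3
```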

This adds instruction does not use the shifter – or rather it does, but requires the shifter to pass through the register value unchanged. So we set the shifter to do a left shift as a default, and peg the shift amount as zero.

cShiftOp = Lsl
cShiftAmt = Sh0

We want the ALU to add the two register values, but notice that there's another instruction that has a 1 in bit 9 and subtracts instead. So we want a rule for the ALU operation that reflects this.

cAluOp = Bit9
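
A minimal Python sketch of the Bit9 rule, assuming 32-bit registers and assuming that the subtract form computes the first operand minus the second. The name alu_bit9 is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def alu_bit9(instr, a, b):
    """ALU result under the Bit9 rule: bit 9 clear means add, bit 9 set means subtract."""
    if (instr >> 9) & 1:
        return (a - b) & MASK32
    return (a + b) & MASK32


adds = 0b0001100011010001  # adds r1, r2, r3
subs = 0b0001101011010001  # same fields, but with bit 9 set
print(alu_bit9(adds, 5, 7))       # 12
print(hex(alu_bit9(subs, 5, 7)))  # 0xfffffffe
```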

The instruction writes the flags, but does not perform a read or write operation on the memory, so the result of the instruction is the ALU output.

cFlags = True
cMemRd = False
cMemWr = False
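
For reference, the NZCV flags that an addition produces can be sketched in Python as follows. This uses the usual definitions for 32-bit two's-complement addition and is not taken from the simulator; the name add_with_flags is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b):
    """Add two 32-bit values and return (result, N, Z, C, V)."""
    full = a + b
    result = full & MASK32
    n = result >> 31                # N: sign bit of the result
    z = int(result == 0)            # Z: result is zero
    c = int(full > MASK32)          # C: carry out of bit 31
    # V: operands have the same sign but the result's sign differs
    v = ((~(a ^ b) & (a ^ result)) >> 31) & 1
    return result, n, z, c, v


print(add_with_flags(0x7FFFFFFF, 1))  # (2147483648, 1, 0, 0, 1)
print(add_with_flags(0xFFFFFFFF, 1))  # (0, 0, 1, 1, 0)
```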

Unlike the special instructions for subroutine call, this one does not write the lr register with the address of the next instruction, but it does write the result back into the register selected by cRegSelC.

cLink = N
cRegWrite = Y

(These are values of type Perhaps = Y | N | C because some instructions make the effect conditional on other signals.)

The bundle of control signals is now complete, and we can summarise it in a single row of the decoding table:

--    cRegA                  cShiftOp      cFlags   cLink
--     |   cRegB      cRand2  |   cShiftAmt | cMemRd | cRegWrite
--     |    |   cRegC  |      |    | cAluOp |  | cMemWr |  mnem     
--     |    |    |     |      |    |    |   |  |  |  |  |   |
3  -> (Ry,  Rz,  Rx,  RImm3, Lsl, S0, Bit9, T, F, F, N, Y, "adds/subs")

str r0, [sp, #48]

This instruction stores the value from r0 into a 4-byte location whose address is at offset 48 from the stack pointer sp. The Architecture Manual shows that the register r0 is specified using field [10:8] of the instruction, and the offset is an 8-bit immediate field.

[Figure: Str-sp format.png]

The immediate value must be multiplied by 4 before being added to the stack pointer to form the address; we will use the shifter to perform this multiplication, and the ALU to add the result to the stack pointer. The particular instruction str r0, [sp, #48] is encoded as

10010 000 00001100
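
The scaling of the offset can be checked with a short Python sketch. The function names encode_str_sp and store_address are hypothetical, and the stack pointer value in the example is made up for illustration.

```python
def encode_str_sp(rw, offset):
    """Assemble str rw, [sp, #offset] as 10010 | Rw | Imm8, with Imm8 = offset / 4."""
    if not 0 <= rw <= 7:
        raise ValueError("only r0 to r7 fit in the Rw field")
    if offset % 4 != 0 or not 0 <= offset <= 1020:
        raise ValueError("offset must be a multiple of 4 between 0 and 1020")
    return (0b10010 << 11) | (rw << 8) | (offset >> 2)


def store_address(instr, sp):
    """Shift the 8-bit immediate left by 2 and add it to the stack pointer."""
    imm8 = instr & 0xFF
    return (sp + (imm8 << 2)) & 0xFFFFFFFF


instr = encode_str_sp(0, 48)
print(format(instr, "016b"))                  # 1001000000001100
print(hex(store_address(instr, 0x20004000)))  # 0x20004030
```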

We'll begin by reading some registers. The plan is to use the stack pointer as the first ALU operand, the offset as the second operand (so we don't care what register is read), and to feed to the data memory the value of the register specified in field [10:8] of the instruction.

cRegSelA = Rsp
cRegSelB = __
cRegSelC = Rw

If we replace the don't-care value __ with Rx, then for this particular instruction it will be r4 that is uselessly read, because the bottom three bits of the instruction are 100.

The shifter input is taken from an 8-bit immediate field, and the shifter is set to shift left by 2 bits.

cRand2 = Imm8
cShiftOp = Lsl
cShiftAmt = Sh2

The ALU is set to add.

cAluOp = Add

The instruction doesn't write the flags, doesn't perform a memory read, but does perform a write.

cFlags = False
cMemRd = False
cMemWr = True

It isn't a subroutine call, so doesn't set lr, and as a store instruction it writes no value back into a register.

cLink = N
cRegWrite = N

Here's the resulting line in the decoding table.

--    cRegA                 cShiftOp      cFlags   cLink
--     |   cRegB      cRand2 |   cShiftAmt | cMemRd | cRegWrite
--     |    |   cRegC  |     |    | cAluOp |  | cMemWr |  mnem     
--     |    |    |     |     |    |    |   |  |  |  |  |   |
18 -> (Rsp, __,  Rw,  Imm8, Lsl, S2,  Add, F, F, T, N, N, "str sp")

bgt label

This conditional branch instruction contains the code for condition gt and a signed 8-bit offset.

[Figure: Bgt format.png]

The offset is in two's-complement form, needs to be multiplied by 2, and is relative to pc+4. Thus a branch instruction bgt .-4 that branches back to the instruction next but one preceding itself has a byte offset of -8 from pc+4; this is stored in the 8-bit field as -4, giving the encoding

1101 1100 11111100
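
The target address calculation can be sketched in Python. The names sign_extend8 and branch_target are hypothetical, and addr stands for the address of the branch instruction itself.

```python
def sign_extend8(x):
    """Interpret an 8-bit field as a two's-complement number."""
    return x - 0x100 if x & 0x80 else x


def branch_target(instr, addr):
    """Target of a conditional branch located at addr: (addr + 4) + 2 * Disp8."""
    disp8 = sign_extend8(instr & 0xFF)
    return (addr + 4 + (disp8 << 1)) & 0xFFFFFFFF


bgt = 0b1101110011111100  # the encoding given above
print(sign_extend8(bgt & 0xFF))        # -4
print(hex(branch_target(bgt, 0x100)))  # 0xfc
```

With the branch at address 0x100 the target is 0xfc, which is 4 bytes before the branch, as the assembly-language operand .-4 requires.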

There is nothing we must do to feed the condition to the functional unit that evaluates it, because that unit functions all the time, even if the condition it is being fed is nonsense. What we must do is set up the rest of the datapath to compute the branch target address and conditionally write it to the pc in place of the default value that is the address of the next instruction.

The register file reads out the value of pc as pc+4 for the sake of other instructions that might read the pc, so we need just

cRegSelA = Rpc
cRegSelB = __
cRegSelC = Rpc

The pc value forms one input to the ALU, and the other is the sign-extended immediate field, shifted left by 1.

cRand2 = SImm8
cShiftOp = Lsl
cShiftAmt = Sh1

The ALU is set to add the offset to the pc value.

cAluOp = Add

Branch instructions do not set the flags, and do not need a read or write cycle from the memory.

cFlags = False
cMemRd = False
cMemWr = False

This branch instruction does not set lr, and it writes the pc only if the condition is satisfied.

cLink = N
cRegWrite = C
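
One way to picture the value C is as a deferred decision that is resolved by the condition unit. The Python sketch below is an assumption about that behaviour, not the simulator's code; the names cond_gt and should_write are hypothetical. The test for gt is the standard ARM one: Z clear, and N equal to V.

```python
def cond_gt(n, z, c, v):
    """The gt condition: signed greater than, true when Z = 0 and N = V."""
    return z == 0 and n == v


def should_write(perhaps, cond_ok):
    """Resolve a Perhaps value ("Y", "N" or "C") into a definite yes or no."""
    if perhaps == "Y":
        return True
    if perhaps == "N":
        return False
    # "C": the write happens only if the condition was satisfied
    return cond_ok


print(should_write("C", cond_gt(0, 0, 0, 0)))  # True
print(should_write("C", cond_gt(0, 1, 1, 0)))  # False
```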

Note that the fifth bit of the opcode overlaps with the high-order bit of the condition, so both 26 and 27 are opcodes corresponding to this instruction.

--    cRegA                 cShiftOp      cFlags   cLink
--     |   cRegB      cRand2 |   cShiftAmt | cMemRd | cRegWrite
--     |    |   cRegC  |     |    | cAluOp |  | cMemWr |  mnem     
--     |    |    |     |     |    |    |   |  |  |  |  |   |
26 -> (R15, __,  R15, SI8,  Lsl, S1,  Add, F, F, F, N, C, "bcond")
27 -> (R15, __,  R15, SI8,  Lsl, S1,  Add, F, F, F, N, C, "bcond")
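
To show how the rows are used, here is a Python sketch that stores the four rows developed in this lecture in a dictionary and looks one up from the top five bits of an instruction. The values are copied from the rows above as plain strings and booleans; the names DECODE and control_signals are hypothetical, and a real decoder would have a row for every opcode.

```python
# Each row: (cRegA, cRegB, cRegC, cRand2, cShiftOp, cShiftAmt, cAluOp,
#            cFlags, cMemRd, cMemWr, cLink, cRegWrite, mnem)
BCOND = ("R15", "__", "R15", "SI8", "Lsl", "S1", "Add",
         False, False, False, "N", "C", "bcond")

DECODE = {
    3: ("Ry", "Rz", "Rx", "RImm3", "Lsl", "S0", "Bit9",
        True, False, False, "N", "Y", "adds/subs"),
    18: ("Rsp", "__", "Rw", "Imm8", "Lsl", "S2", "Add",
         False, False, True, "N", "N", "str sp"),
    26: BCOND,  # bit 11 is the top bit of the condition,
    27: BCOND,  # so both opcodes share one row
}


def control_signals(instr):
    """Look up the control signals by opcode; None if this sketch has no row."""
    return DECODE.get(instr >> 11)


for word in (0b0001100011010001, 0b1001000000001100, 0b1101110011111100):
    print(control_signals(word)[-1])
# adds/subs
# str sp
# bcond
```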