Lecture 4 – Number representations (Digital Systems)

From Spivey's Corner

Signed and unsigned numbers

So far we have used only non-negative numbers, corresponding to the C type unsigned. We can pin down the properties of this data type by defining a function bin(a) that maps an n-bit vector a to an integer:

$$\mathrm{bin}(a) = \sum_{0 \le i < n} a_i \cdot 2^i$$

(where n = 32). What is the binary operation ⊕ that is implemented by the add instruction? It would be nice if always

$$\mathrm{bin}(a \oplus b) = \mathrm{bin}(a) + \mathrm{bin}(b)$$

but unfortunately that's not possible owing to the limited range 0 ≤ bin(a) < 2^n. All we can promise is that

$$\mathrm{bin}(a \oplus b) \equiv \mathrm{bin}(a) + \mathrm{bin}(b) \pmod{2^n}$$

and since each bit vector maps to a different integer mod 2^n, this equation exactly specifies the result of ⊕ in each case.
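
The relationship between bin and the add instruction can be sketched in Python. This is an illustration only: the names bin_value, to_bits and add are hypothetical, bit vectors are modelled as lists with the least significant bit first, and the word size is set to 8 rather than 32 to keep the numbers small.

```python
N_BITS = 8  # small word size for the demo; the lecture takes n = 32


def bin_value(a):
    """bin(a): sum of a[i] * 2**i, with a[0] the least significant bit."""
    return sum(bit << i for i, bit in enumerate(a))


def to_bits(x, n=N_BITS):
    """Bit vector of length n whose bin_value is x mod 2**n."""
    return [(x >> i) & 1 for i in range(n)]


def add(a, b):
    """Ripple-carry addition of two equal-length bit vectors.

    The carry out of the top bit is discarded, so the result is only
    correct modulo 2**n.
    """
    result, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry
        result.append(total & 1)
        carry = total >> 1
    return result


# 200 + 100 = 300 does not fit in 8 bits; the adder yields 300 mod 256.
print(bin_value(add(to_bits(200), to_bits(100))))  # prints 44
```

Because every 8-bit vector has a different value mod 256, the congruence leaves no freedom in what add may return.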

More commonly used than the unsigned type is the type int of signed integers. Here the interpretation of bit vectors is different: we define twoc(a) by

$$\mathrm{twoc}(a) = \sum_{0 \le i < n-1} a_i \cdot 2^i - a_{n-1} \cdot 2^{n-1}$$

Taking n = 8 for a moment:

    a        bin(a)   twoc(a)
0000 0000       0        0
0000 0001       1        1
0000 0010       2        2
0111 1111     127      127 = 2^7 - 1
1000 0000     128     -128 = -2^7
1000 0001     129     -127
1111 1110     254       -2
1111 1111     255       -1
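
The rows of this table can be regenerated with a short Python sketch. The function name twoc follows the text, but representing a bit vector by its unsigned integer value is an assumption made here for brevity.

```python
def twoc(pattern, n=8):
    """Two's complement value of an n-bit pattern given as an unsigned integer.

    Bits 0..n-2 count positively; bit n-1 has weight -2**(n-1).
    """
    low_bits = pattern & ((1 << (n - 1)) - 1)
    sign_bit = (pattern >> (n - 1)) & 1
    return low_bits - sign_bit * (1 << (n - 1))


# Columns: bit pattern, unsigned value bin(a), signed value twoc(a).
for pattern in (0x00, 0x01, 0x02, 0x7F, 0x80, 0x81, 0xFE, 0xFF):
    print(f"{pattern:08b}  {pattern:6d}  {twoc(pattern):7d}")
```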

Note that the leading bit is 1 for negative numbers. So we see −2^(n−1) ≤ twoc(a) < 2^(n−1), and

$$\mathrm{twoc}(a) = \mathrm{bin}(a) - a_{n-1} \cdot 2^n$$

So twoc(a) ≡ bin(a) (mod 2^n). Therefore if bin(a ⊕ b) ≡ bin(a) + bin(b) (mod 2^n), then also twoc(a ⊕ b) ≡ twoc(a) + twoc(b) (mod 2^n), and the same addition circuit can be used for both signed and unsigned arithmetic, which is a telling advantage of the two's complement representation.
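
For a small word size the claim can be checked exhaustively. The sketch below assumes 8-bit words held as Python integers in the range 0..255; the helper names add and same_adder_serves_both are hypothetical.

```python
BITS = 8
MODULUS = 1 << BITS


def twoc(pattern):
    """Signed reading of a pattern: subtract 2**BITS when the top bit is set."""
    return pattern - MODULUS if pattern >> (BITS - 1) else pattern


def add(a, b):
    """One adder for both readings: keep the low BITS bits of the sum."""
    return (a + b) % MODULUS


def same_adder_serves_both():
    """Check both congruences for every pair of 8-bit patterns."""
    for a in range(MODULUS):
        for b in range(MODULUS):
            c = add(a, b)
            if (c - (a + b)) % MODULUS != 0:
                return False
            if (twoc(c) - (twoc(a) + twoc(b))) % MODULUS != 0:
                return False
    return True


# -2 + 3 = 1: the patterns 0xFE and 0x03 add to 0x01.
print(twoc(add(0xFE, 0x03)), same_adder_serves_both())  # prints 1 True
```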

How can we negate a signed integer? If we compute ā such that ā_i = 1 − a_i, then we find

$$\mathrm{twoc}(\bar{a}) = -\mathrm{twoc}(a) - 1$$

So to compute −a, negate each bit, then add 1. This works for every number except −2^(n−1), which like 0 gets mapped to itself.
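
A minimal sketch of the negation rule, again assuming 8-bit patterns stored as Python integers; negate is a hypothetical helper name.

```python
BITS = 8
MASK = (1 << BITS) - 1


def twoc(pattern):
    """Signed reading of an 8-bit pattern."""
    return pattern - (1 << BITS) if pattern >> (BITS - 1) else pattern


def negate(pattern):
    """Invert every bit, add 1, and keep the low BITS bits."""
    return ((pattern ^ MASK) + 1) & MASK


print(twoc(negate(0x05)))   # prints -5
print(hex(negate(0x80)))    # prints 0x80: -128 is mapped to itself
print(hex(negate(0x00)))    # prints 0x0: so is zero
```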


This two's-complement binary representation of numbers is typical of every modern computer: the vital factor is that the same addition circuit can be used for both signed and unsigned numbers. Historical machines used different number representations. In business computing, for example, most programs used to do a lot of input and output and only a bit of arithmetic, so it was better to store numbers in decimal (or binary-coded decimal) and use special circuitry for decimal addition and subtraction. The alternative was to convert all the numbers to binary on input and back to decimal on output, which would require many slow multiplications and divisions.

Floating point arithmetic has standardised on a sign-magnitude representation, because two's-complement does not simplify things in a significant way. That makes negating a number as simple as flipping the sign bit, but it does mean that there are two representations of zero, +0 and −0, and that can be a bit confusing.
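
The two zeros can be seen by inspecting bit patterns. The sketch below assumes the IEEE 754 single-precision layout, which is what Python's struct module uses for the 'f' format; the text above does not name a particular standard, and float_bits and flip_sign are hypothetical helpers.

```python
import struct


def float_bits(x):
    """32-bit pattern of x in single precision, as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def flip_sign(x):
    """Negate x by flipping only the top (sign) bit of its pattern."""
    flipped = float_bits(x) ^ 0x80000000
    return struct.unpack(">f", struct.pack(">I", flipped))[0]


print(f"{float_bits(0.0):08x}")    # prints 00000000
print(f"{float_bits(-0.0):08x}")   # prints 80000000
print(0.0 == -0.0)                 # prints True: two patterns, equal values
print(flip_sign(1.5))              # prints -1.5
```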

Comparisons and condition codes

To compare two signed numbers a and b, we can compute a − b and look at the result:

  • If a − b = 0, then a = b.
  • If a − b is negative (sign bit = 1), then it could be that a < b, or maybe b < 0 < a and the subtraction overflowed. For example, if a = 100 and b = −100, then a − b = 200 ≡ −56 (mod 256) in 8 bits, so a − b appears negative when the true result is positive.
  • Similar examples show that, with a < 0 < b, the result of a − b can appear positive when the true result is negative. In each case, we can detect overflow by examining the signs of a and b and seeing if they are consistent with the sign of the result a − b.

The cmp instruction computes a − b, throws away the result, and sets four condition code bits:

  • N – the sign bit of the result
  • Z – whether the result is zero
  • V – overflow as explained above
  • C – the carry output of the subtraction
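
The four bits can be modelled in a few lines of Python. This is a sketch for 8-bit words, not a description of any particular processor's circuitry: it assumes the subtraction is carried out as a + ~b + 1, and the names Flags and cmp_flags are hypothetical.

```python
from collections import namedtuple

BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)

Flags = namedtuple("Flags", "N Z C V")


def cmp_flags(a, b):
    """Flags after comparing BITS-wide patterns a and b (computing a - b)."""
    total = a + (b ^ MASK) + 1          # a + ~b + 1, one bit wider than BITS
    result = total & MASK
    n = (result >> (BITS - 1)) & 1      # sign bit of the result
    z = int(result == 0)
    c = total >> BITS                   # carry out of the top bit
    # Overflow: a and b differ in sign, and the result's sign differs from a's.
    v = int(((a ^ b) & (a ^ result) & SIGN) != 0)
    return Flags(n, z, c, v)


# a = 100, b = -100 (pattern 156): the true difference 200 does not fit.
print(cmp_flags(100, 156))   # prints Flags(N=1, Z=0, C=0, V=1)
```

In the example, N = 1 wrongly suggests a negative difference, and V = 1 records that the sign is not to be trusted.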

A subsequent conditional branch can test these bits and branch if an appropriate condition is satisfied. A total of 14 branch tests are implemented.

equality     signed              unsigned                  miscellaneous
--------     ------              --------                  -------------
beq*  Z      blt  N!=V           blo = bcc*  !C            bmi*  N
bne*  !Z     ble  Z or N!=V      bls         Z or !C       bpl*  !N
             bgt  !Z and N=V     bhi         !Z and C      bvs*  V
             bge  N=V            bhs = bcs*  C             bvc*  !V

The conditions marked * test individual condition code bits, and the others test meaningful combinations of bits. For example, after a comparison of a with b, the instruction blt tests whether a < b: that holds if either the subtraction did not overflow and the N bit is 1, or the subtraction did overflow and the N bit is 0; in other words, if N ≠ V. All the other signed comparisons use combinations of the same ideas: it's pointless to memorise all the combinations.
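
Rather than memorising the table, one can check it mechanically. The sketch below transcribes the equality, signed and unsigned columns into Python and compares each test against the relation it is meant to decide, for every pair of 8-bit patterns. The flag computation assumes subtraction performed as a + ~b + 1; all helper names are hypothetical.

```python
import operator

BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def twoc(p):
    """Signed reading of an 8-bit pattern."""
    return p - (1 << BITS) if p & SIGN else p


def cmp_flags(a, b):
    """(N, Z, C, V) as booleans after computing a - b as a + ~b + 1."""
    total = a + (b ^ MASK) + 1
    result = total & MASK
    n = bool(result & SIGN)
    z = result == 0
    c = bool(total >> BITS)
    v = bool((a ^ b) & (a ^ result) & SIGN)
    return n, z, c, v


# Branch tests, transcribed from the table.
CONDITIONS = {
    "beq": lambda n, z, c, v: z,
    "bne": lambda n, z, c, v: not z,
    "blt": lambda n, z, c, v: n != v,
    "ble": lambda n, z, c, v: z or n != v,
    "bgt": lambda n, z, c, v: not z and n == v,
    "bge": lambda n, z, c, v: n == v,
    "blo": lambda n, z, c, v: not c,
    "bls": lambda n, z, c, v: z or not c,
    "bhi": lambda n, z, c, v: not z and c,
    "bhs": lambda n, z, c, v: c,
}

# What each branch is meant to decide: (relation, use the signed reading?)
MEANING = {
    "beq": (operator.eq, False), "bne": (operator.ne, False),
    "blt": (operator.lt, True), "ble": (operator.le, True),
    "bgt": (operator.gt, True), "bge": (operator.ge, True),
    "blo": (operator.lt, False), "bls": (operator.le, False),
    "bhi": (operator.gt, False), "bhs": (operator.ge, False),
}


def taken(branch, a, b):
    """Would this branch be taken after comparing patterns a and b?"""
    return CONDITIONS[branch](*cmp_flags(a, b))


def table_is_consistent():
    """Compare every branch test with its intended relation on all inputs."""
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            for branch, (relation, signed) in MEANING.items():
                x, y = (twoc(a), twoc(b)) if signed else (a, b)
                if taken(branch, a, b) != relation(x, y):
                    return False
    return True
```

For instance, taken("blt", 0x9C, 0x64) is true because −100 < 100 as signed numbers, while taken("blo", 0x9C, 0x64) is false because 156 is not below 100 as unsigned numbers.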

For comparisons of unsigned numbers, it's useful to work out what happens to the carries when we do a subtraction. For example, in 8 bits we can perform the subtraction 32 − 9 like this, adding together 32, the bitwise complement of 9, and an extra 1:

32        0010 0000
~9        1111 0110
+1                1
        -----------
        1 0001 0111

The result should be 23₁₀ = 10111₂, but as you can see, if the addition is performed in 9 bits there is a leading 1 bit, and it becomes the C flag. If we subtract unsigned numbers a − b in this way, then C is 1 exactly if a ≥ b, and this gives the basis for the unsigned conditional branches bhs (= Branch if Higher or Same), etc.
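
The worked subtraction, and the rule that C = 1 exactly when a ≥ b, can be reproduced with a small Python sketch for 8-bit unsigned patterns; subtract is a hypothetical helper name.

```python
BITS = 8
MASK = (1 << BITS) - 1


def subtract(a, b):
    """Return (difference, C) for BITS-wide unsigned patterns a and b."""
    total = a + (b ^ MASK) + 1      # a + ~b + 1, one bit wider than the inputs
    return total & MASK, total >> BITS


difference, carry = subtract(32, 9)
print(f"{(carry << BITS) | difference:09b}", difference, carry)
# prints 100010111 23 1

# C is 1 exactly when a >= b, over all pairs of 8-bit values.
print(all((subtract(a, b)[1] == 1) == (a >= b)
          for a in range(256) for b in range(256)))   # prints True
```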

The cmp instruction has the sole purpose of setting the condition codes, and throws away the result of the subtraction. But the subs instruction, which saves the result of the subtraction in a register, also sets the condition codes: that's the meaning of the s suffix on the mnemonic. (The big ARMs have both subs that does set the condition codes, and sub that doesn't.) Other instructions also set the condition codes in a well-defined way. If the codes are set at all, Z always indicates if the result is zero, and N is always equal to the sign bit of the result.

  • In an adds instruction, the C flag is set to the carry-out from the addition, and the V flag indicates if there was an overflow, with a result whose sign is inconsistent with the signs of the operands.
  • In a shift instruction like lsrs, the C flag is set to the last bit shifted out. That means we can divide the number in r0 by 2 with the instruction lsrs r0, r0, #1 and test whether the original number was even with a subsequent bcc instruction. Shift instructions don't change the V flag.
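
Both rules can be modelled on 32-bit patterns held as Python integers. This is a sketch under simplifying assumptions: shift amounts are restricted to 1..31, and the function names merely echo the mnemonics rather than modelling the instructions in full.

```python
BITS = 32
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def adds(a, b):
    """Return (result, C, V) for the sum of two BITS-wide patterns."""
    total = a + b
    result = total & MASK
    carry = total >> BITS
    # Overflow: the operands agree in sign but the result's sign differs.
    overflow = int((~(a ^ b) & (a ^ result) & SIGN) != 0)
    return result, carry, overflow


def lsrs(value, amount):
    """Return (result, C) for a logical right shift by 1 <= amount < BITS."""
    carry = (value >> (amount - 1)) & 1   # last bit shifted out
    return value >> amount, carry


def halve(value):
    """Shift right by one place, then test C as a following bcc would."""
    half, carry = lsrs(value, 1)
    return half, carry == 0     # second component: the original was even


print(adds(0x7FFFFFFF, 1))   # prints (2147483648, 0, 1): signed overflow
print(adds(0xFFFFFFFF, 1))   # prints (0, 1, 0): carry out, no signed overflow
print(halve(10), halve(7))   # prints (5, True) (3, False)
```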


The four status bits NZCV are almost universal in modern processor designs. The exceptions are machines like the MIPS and DEC Alpha that have no status bits, but allow the result of a comparison to be computed into a register: on the MIPS, the instruction slt r2, r3, r4 sets r2 to 1 if r3 < r4 and to zero otherwise; this can be followed by a conditional branch that is taken if r2 is non-zero.


Is overflow possible in unsigned subtraction, and how can we test for it?

Overflow happens when the mathematically correct result bin(a) − bin(b) is not representable as bin(c) for any bit-vector c. If bin(a) >= bin(b) then the difference bin(a) − bin(b) is non-negative and no bigger than bin(a), so it is representable. It's only when bin(a) < bin(b), so the difference is negative, that the mathematically correct result is not representable. We can detect this case by looking at the carry bit: C = 0 if and only if the correct result is negative.


Floating point: a representation of real numbers with a sign, a mantissa, and an exponent that allows the number to be scaled by a power of two. Typical machines have special registers for holding floating point numbers and special instructions for loading, storing and comparing them and performing arithmetic operations on them. In this course, we ignore these machine features for simplicity because they add complexity without presenting any really new problems from the point of view of compiling.

Condition codes: four bits, N, Z, V and C, in the processor status word that indicate the result of a comparison or other arithmetic operation. Briefly, N indicates whether the result of the operation was negative, Z indicates whether it was zero, C is the value of the carry-out bit from the ALU, and V indicates whether the operation overflowed, yielding a result that was different in sign from what could be predicted from the inputs to the operation. A comparison is treated like a subtraction as far as setting the condition codes is concerned. After the condition codes have been set, a subsequent conditional branch instruction can test them, and make a branch decision based on a boolean combination of their values. All ten arithmetic comparisons (equal, not-equal, and less-than, less-than-or-equal, greater-than, and greater-than-or-equal for both signed and unsigned representations) can be represented in this way. When a process is interrupted, the condition codes must be saved and restored as part of the processor state, in case the interrupt came between a comparison and a subsequent conditional branch.