# Lecture 21 – Architectural elements (Digital Systems)

## Circuits for arithmetic

The XOR gate is very useful for arithmetic circuits.

XOR gate with implementation
XOR truth table:

| a | b | z |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

Though we could make it out of NAND gates (e.g., using the neat circuit shown above), there's a clever implementation using pass transistors that is more economical in space and time.

An XOR gate and an AND gate together make a 'half-adder' that adds two bits, giving a sum and a carry output.

| a | b | c | s |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 1 |
| 1 | 0 | 0 | 1 |
| 1 | 1 | 1 | 0 |
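As a quick check on the table, here is a minimal Python sketch that models the half-adder with bitwise operators standing in for the gates. The function name `half_adder` is an illustrative choice, not something from the lecture.

```python
def half_adder(a, b):
    """Model a half-adder on two bits; returns (carry, sum)."""
    s = a ^ b   # the XOR gate produces the sum bit
    c = a & b   # the AND gate produces the carry bit
    return c, s


# Print the four rows in the same column order as the table: a b c s
for a in (0, 1):
    for b in (0, 1):
        c, s = half_adder(a, b)
        print(a, b, c, s)
```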

A 'full adder' accepts a carry input as well as producing a carry output. We can either make it from two half-adders and an OR gate, or we can design it from scratch, using a 3-input XOR gate for the sum and a majority gate for the carry.
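The two constructions can be compared with a small Python sketch. All function names here are illustrative, and the majority gate is written as an OR of pairwise ANDs, which is one possible realisation among several.

```python
from itertools import product


def half_adder(a, b):
    """Returns (carry, sum) for two bits."""
    return a & b, a ^ b


def full_adder_from_halves(a, b, cin):
    """Two half-adders plus an OR gate; returns (carry_out, sum)."""
    c1, s1 = half_adder(a, b)
    c2, s = half_adder(s1, cin)
    return c1 | c2, s   # the two carries are never both 1, so OR suffices


def majority(a, b, c):
    """1 when at least two of the three inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def full_adder_direct(a, b, cin):
    """3-input XOR for the sum, majority gate for the carry."""
    return majority(a, b, cin), a ^ b ^ cin


# Exhaustively compare the two designs over all 8 input combinations.
AGREE = all(
    full_adder_from_halves(*bits) == full_adder_direct(*bits)
    for bits in product((0, 1), repeat=3)
)
print(AGREE)
```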

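Chaining n full adders, with each carry output feeding the next carry input, gives an n-bit adder. The big problem with this circuit is the very long critical path caused by the carry chain: the top sum bit cannot settle until the carry has rippled through every stage. It's possible to do better than this, using more hardware (but still an amount proportional to n) to produce an n-bit sum in time proportional to log n.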
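To see where the critical path comes from, the following sketch models an adder made of chained full adders. The helper names and the least-significant-bit-first list representation are assumptions made for illustration. The loop is inherently serial because each iteration needs the carry from the previous one; the faster adders mentioned above are not modelled here.

```python
def full_adder(a, b, cin):
    """Returns (carry_out, sum) for three input bits."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return cout, s


def ripple_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (least significant bit first).

    Returns (sum_bits, carry_out). Each stage waits for the carry of the
    stage before it, which is the long carry chain.
    """
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        carry, s = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(x, n):
    """The n low-order bits of x, least significant first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))


# Worst case for the carry chain: 1111 + 0001 ripples through every stage.
print(ripple_add(to_bits(0b1111, 4), to_bits(1, 4)))  # ([0, 0, 0, 0], 1)
```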

A subtracter a - b can be made from an adder by complementing b bitwise and setting the carry-in to 1. For a circuit that can both add and subtract, pass each bit of b through an XOR gate controlled, like the carry-in, by a signal set to 0 for add and 1 for subtract.
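A minimal sketch of the combined adder/subtracter, assuming unsigned n-bit operands held in Python integers. The function name `add_sub` and its parameters are illustrative. The single `subtract` signal drives both the XOR row on b and the carry-in.

```python
def add_sub(a, b, n, subtract):
    """n-bit a + b when subtract == 0, a - b when subtract == 1.

    Returns (result modulo 2**n, carry_out).
    """
    carry = subtract                        # carry-in is 1 for subtraction
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ subtract      # XOR gate conditionally complements b
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (bi & carry) | (ai & carry)
        result |= s << i
    return result, carry


print(add_sub(5, 3, 8, 0))  # (8, 0)
print(add_sub(5, 3, 8, 1))  # (2, 1)
print(add_sub(3, 5, 8, 1))  # (254, 0), i.e. -2 in 8-bit two's complement
```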

## Decoder and multiplexer

A two-way, one-bit multiplexer.

Two-way multiplexer

An 8-way decoder.

3-to-8 decoder

An n-way multiplexer.

Multiplexer

(Replicate all but the decoder for a multi-bit multiplexer.)
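The figures are not reproduced here, so the following Python sketch assumes a common construction: the decoder turns the select bits into one-hot enable lines, each data input is ANDed with its enable line, and the results are ORed together. All names are illustrative. `mux_word` shows the replication remark: one shared decoder, with the AND/OR part repeated for each bit of the word.

```python
def decoder(sel_bits):
    """n select bits (least significant first) -> 2**n one-hot outputs."""
    n = len(sel_bits)
    outputs = []
    for i in range(2 ** n):
        line = 1
        for k, s in enumerate(sel_bits):
            # Output i ANDs together each select bit or its complement.
            line &= s if (i >> k) & 1 else 1 - s
        outputs.append(line)
    return outputs


def mux(inputs, sel_bits):
    """One-bit multiplexer: AND each input with its enable, OR the results."""
    out = 0
    for x, enable in zip(inputs, decoder(sel_bits)):
        out |= x & enable
    return out


def mux_word(words, sel_bits, width):
    """Multi-bit multiplexer sharing one decoder across all bit positions."""
    enables = decoder(sel_bits)             # computed once, shared by every bit
    out = 0
    for j in range(width):                  # replicated AND/OR slice per bit
        bit = 0
        for w, enable in zip(words, enables):
            bit |= ((w >> j) & 1) & enable
        out |= bit << j
    return out


print(decoder([1, 1, 0]))                        # [0, 0, 0, 1, 0, 0, 0, 0]
print(mux([0, 1, 0, 0, 1, 1, 0, 1], [1, 0, 1]))  # 1 (selects input 5)
print(mux_word([10, 20, 30, 40], [0, 1], 8))     # 30 (selects word 2)
```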

A ROM with 8 single-bit locations.

An 8x8 ROM (bigger sizes are available).

## Programmable logic

Viewed at a higher level of abstraction, a sequential circuit consists of a number of state-bearing elements that we can model as a row of flip-flops, together with a Boolean function that derives the output and the next state from the input and current state.

Sequential circuit

Rather than implementing the combinational logic with some carefully-crafted arrangement of logic gates, we can consider replacing it with a lookup table implemented as a ROM. This has several advantages:

• designing logic circuits by hand is difficult.
• for larger designs, regularity – leading to easier chip layout – is more important than minimising the number of gates.
• for prototyping and small-scale production, chip count is more important than gate count. (That's why we would use a microcontroller where it gives sufficient performance.)
• time to market, inventory size, and flexibility all militate in favour of board designs that are independent of detailed function.

So programmable logic devices are the way to go.

We can formulate an adequacy argument for sequential circuits using the general scheme shown above. Taking any (deterministic) finite-state machine, we can label each state with a different string of bits, by numbering them in binary or otherwise, and also encode the inputs and outputs of the machine as bit-strings. Then the next-state function becomes a somewhat complicated function from bit-strings to bit-strings, which we know can be implemented in combinational logic.
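The encoding argument can be tried out on a small example. The sketch below uses a hypothetical machine (not one from the lecture) that outputs 1 whenever the last two inputs were both 1. States are numbered in binary, the ROM address is the state bits followed by the input bit, and each ROM word holds the next-state bits followed by the output bit. All names are illustrative.

```python
# Hypothetical machine: (state, input) -> (next state, output)
TRANSITIONS = {
    ("idle", 0): ("idle", 0),
    ("idle", 1): ("seen1", 0),
    ("seen1", 0): ("idle", 0),
    ("seen1", 1): ("seen11", 1),
    ("seen11", 0): ("idle", 0),
    ("seen11", 1): ("seen11", 1),
}
STATES = ["idle", "seen1", "seen11"]
STATE_BITS = 2                                   # enough bits for 3 states
CODE = {name: i for i, name in enumerate(STATES)}


def build_rom():
    """Tabulate next-state and output as one word per (state, input) address."""
    rom = [0] * (2 ** (STATE_BITS + 1))          # unused state code 3 maps to word 0
    for (state, inp), (nxt, out) in TRANSITIONS.items():
        address = (CODE[state] << 1) | inp
        rom[address] = (CODE[nxt] << 1) | out
    return rom


def run(rom, inputs):
    """Clock the machine once per input bit, starting in state 'idle'."""
    state = CODE["idle"]                         # contents of the state flip-flops
    outputs = []
    for inp in inputs:
        word = rom[(state << 1) | inp]           # combinational lookup
        state, out = word >> 1, word & 1         # flip-flops load the next state
        outputs.append(out)
    return outputs


print(run(build_rom(), [1, 1, 0, 1, 1, 1]))      # [0, 1, 0, 0, 1, 1]
```

Changing the machine means changing only the ROM contents; the `run` loop, which plays the part of the flip-flops and wiring, stays the same.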

## Register

A multi-bit register with write enable.

n-bit register

A fully synchronous alternative.

Fully synchronous register

## Register file

A basic, twin-port register file.

Register file

The ARM register file, with special features.

ARM register file

## Arithmetic-logic unit

A simple ALU.

Arithmetic-logic unit

## Barrel shifter

We can make a circuit that can shift left by any distance by taking the shift amount bit by bit, combining in sequence circuits that can shift by 16, 8, 4, 2, 1. Here's a picture of a three-stage shifter that can shift an 8-bit quantity left by any amount from 0 to 7:

Barrel shifter
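The staged structure can be sketched in Python as follows. The function name and the `width` parameter are illustrative, and the width is assumed to be a power of two. Each stage either passes its input through or shifts it by a fixed power of two, under the control of one bit of the shift amount.

```python
def barrel_shift_left(x, amount, width=8):
    """Logical shift left of a width-bit value by 0..width-1 places."""
    mask = (1 << width) - 1
    stages = (width - 1).bit_length()        # 3 stages for an 8-bit value
    for k in reversed(range(stages)):        # shift by 4, then 2, then 1
        if (amount >> k) & 1:                # bit k of the amount controls stage k
            x = (x << (1 << k)) & mask       # fixed shift by 2**k, dropping overflow
    return x


print(barrel_shift_left(0b00000101, 3))      # 40, i.e. 0b00101000
```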

(I'm not quite sure why this circuit is called a 'barrel' shifter, but I like to imagine the extreme sport where people try to balance on top of a floating barrel by taking large or small steps to left or right.)

That shifter was only able to implement logical-shift-left (lsl). For right shifts (lsr and asr) and rotations (ror), replace the two-way MUXes with five-way MUXes, feeding the three extra inputs of each MUX in stage k with the relevant bits of x LSR 2^k or x ASR 2^k or x ROR 2^k.

The ARM datapath's shifter also produces a Boolean signal that is the last bit shifted out; it's not hard to add that as an extra output, with each stage overriding the signal produced by previous stages.
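A sketch of the extended shifter with the extra output, modelled at the word level. The function names, the string operation codes, and the `carry_in` parameter (the value reported when no stage shifts) are assumptions for illustration and are not taken from the ARM documentation. Each active stage reports the last bit it shifted out, overriding whatever earlier stages reported.

```python
def shift_stage(x, d, op, width):
    """Shift x by the fixed distance d; returns (result, last bit shifted out)."""
    mask = (1 << width) - 1
    if op == "lsl":
        return (x << d) & mask, (x >> (width - d)) & 1
    last_out = (x >> (d - 1)) & 1            # same bit for lsr, asr and ror
    if op == "lsr":
        return x >> d, last_out
    if op == "asr":
        # Copy the sign bit into the d vacated positions at the top.
        sign_fill = (mask & ~(mask >> d)) if x >> (width - 1) else 0
        return (x >> d) | sign_fill, last_out
    if op == "ror":
        return ((x >> d) | (x << (width - d))) & mask, last_out
    raise ValueError(f"unknown shift operation: {op}")


def shift(x, amount, op, width=8, carry_in=0):
    """Staged shifter; returns (result, last bit shifted out)."""
    carry = carry_in                         # kept if no stage shifts
    stages = (width - 1).bit_length()
    for k in reversed(range(stages)):
        if (amount >> k) & 1:
            # An active stage overrides the signal from earlier stages.
            x, carry = shift_stage(x, 1 << k, op, width)
    return x, carry


print(shift(0b10010110, 3, "lsr"))           # (18, 1)
```

Whichever active stage comes last determines the final signal, so the result does not depend on the order in which the stages are arranged.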