Specification of Keiko (Compilers)
This page describes the assembly language of the Keiko machine that is used as a target for the compilers in the course. Some features are omitted if they are not used in the course.
A Keiko program begins with a header, which for our purposes always consists of the text,
MODULE Main 0 0 IMPORT Lib 0 ENDHDR
The header is followed by a sequence of definitions; each may be a procedure, a global variable, or a string constant.
Blank lines are ignored, as are lines that begin with the comment character
A procedure begins with a
PROC directive and ends with an
END directive; between them appears the bytecode of the procedure.
PROC p n 0 0
- Begin a procedure named p that uses n bytes of space for local variables. (The two zeroes replace information that is used when the runtime system contains a garbage collector.)
- End a procedure.
Global variables and string constants
GLOVAR x n
- Define x as a global variable occupying n bytes of space. All global variables are allocated on a 4-byte boundary.
- Define x as a label for the string constant that follows.
- Treat hex as a sequence of pairs of hexadecimal digits, each pair corresponding to one byte. Store the sequence of bytes as a string constant. Each
STRINGdirective generates exactly the number of bytes specified, so several such directives can be used next to each other to build up a long string. If the string is to be zero-terminated, then the last such directive should end with a zero byte, denoted by
Apart from the directives listed above, all other elements in a Keiko program are bytecode instructions that may appear only between a
PROC directive and its matching
END. The different instructions are listed below in groups according to their function; in each group, a basic set of instructions is listed separately from a set of compound instructions that can each be defined as equivalent to a specific sequence of basic instructions.
Note that there's no guarantee that an implementation of Keiko will actually provide all the instructions listed here. Some instructions may be implemented by a sequence of others: specifically, some of the less common instructions that are described as being compound may be expanded by the assembler into the equivalent sequence of basic instructions. Also, some other operations may be implemented by other 'secret' instructions; for example, floating point comparisons are implemented by a special
FCMP instruction followed by an integer comparison.
Addressing, load and store
- The constant n can be a decimal or hexadecimal constant; its value is pushed on the stack.
FCONST x, DCONST x, QCONST n
- Push a floating-point, double-precision or long-integer constant.
- Push the address of a global variable x (declared with a
- Push the address of a local variable that is at offset n from the base pointer.
LOADC, LOADS, LOADW, LOADD
- Pop an address and push the contents of the (8, 16, 32, 64)-bit location it names.
STOREC, STORES, STOREW, STORED
- Pop an address and a one- or two-word value and store the value into the (8, 16, 32, 64)-bit location named by the address.
- Pop an address and an integer offset push their sum. This instruction behaves like integer addition, but is kept separate to help the JIT translator.
- Duplicate the one-word item that is
nwords from the top of the stack – so
DUP 0duplicates the top element on the stack.
- Swap two one-word items at the top of the stack.
none-word items from the stack.
LDLC n, LDLS n, LDLW n, LDLD n
- Load from local variable;
LDLs nis equivalent to
LOCAL n / LOADs.
STLC n, STLS n, STLW n, STLD n
- Store to local variable;
STLs nis equivalent to
LOCAL n / STOREs.
LDGC x, LDGS x, LDGW x, LDGD x
- Load from global variable;
LDGs xis equivalent to
GLOBAL x / LOADs. Here x is a symbol that refers to an address in global storage, and the instruction loads from that address.
STGC x, STGS x, STGW x, STGD x
- Store to global variable;
STGs xis equivalent to
GLOBAL x / STOREs.
INDEXC, INDEXS, INDEXW, INDEXD
- Scaled addition;
INDEXsis equivalent to
PUSH n / TIMES / PLUSAwhere n is 1, 2, 4 or 8.
LDIC, LDIS, LDIW, LDID
- Indexed load;
LDIsis equivalent to
INDEXs / LOADs.
LDNW n, STNW n
- Constant indexed load and store;
LDNW nis equivalent to
PUSH n / PLUSA / LOADW, etc.
All arithmetic instructions expect their operands on the evaluation stack and leave their result on the stack. The binary operations expect to find their right-hand operand on top of the stack and their left-hand operand just beneath it.
PLUS, MINUS, TIMES, DIV, MOD
- Binary arithmetic operations on integers. (
MODare guaranteed to give mathematically correct results).
- Unary minus on integers
EQ, LT, GT, LEQ, GEQ, NEQ
- Integer comparisons with a Boolean result
AND, OR, NOT
- Boolean operations. As in C, these treat any non-zero operand as true, and produce a result that is either 1 or 0.
BITAND, BITOR, BITXOR, BITNOT
- Bitwise boolean operations.
- Given a number between 0 and 31, this unary operation generates a word with the only corresponding bit set. The LSB is numbered 0 and the MSB is numbered 31.
LSL, LSR, ASR
- Shift instructions.
QPLUS, QMINUS, QTIMES, QUMINUS, QDIV, QMOD
- Long integer (64 bit) arithmetic operations.
QEQ, QLT, QGT, QLEQ, QGEQ, QNEQ
- Long integer (64 bit) comparisons.
- Increment and decrement;
INCis equivalent to
CONST 1 / PLUS, etc.
INCL n, DECL n
- Increment or decrement local variable;
INCL nis equivalent to
LDLW n / INC / STLW n, etc.
- Long integer increment and decrement;
QINCis equivalent to
QCONST 1 / QPLUS, etc.
- Bitwise difference, equivalent to
BITNOT / BITAND
FPLUS, FMINUS, FTIMES, FDIV, FUMINUS
- Floating point arithmetic.
FEQ, FLT, FGT, FLEQ, FGEQ, FNEQ
- Floating-point comparisons with a Boolean result.
DPLUS, DMINUS, DTIMES, DDIV, DUMINUS
- Double-precision arithmetic.
DEQ, DLT, DGT, DLEQ, DGEQ, DNEQ
- Double-precision comparisons with a Boolean result.
- Convert integer to floating point.
- Convert integer to double precision.
- Convert floating point to double precision.
- Convert double precision to floating point.
- Convert integer to unsigned character, masking off unwanted bits.
- Convert integer to signed short, sign extending the result from 16 to 32 bits.
- Convert integer to long integer
- Convert long integer to integer
- Convert long integer to double precision
Labels in the assembler are written as decimal integers.
- Jump to label
JUMPF lab, JUMPT lab
- Pop a Boolean value and jump to label
labif the Boolean is false or if it is true, respectively.
- An assembler directive: attach the label
labto the next instruction.
Compound operations, each equivalent to a comparison followed by a conditional jump:
JEQ lab, JLT lab, JGT lab, JLEQ lab, JGEQ lab, JNEQ lab
- Integer compare and branch;
JEQ labis equivalent to
EQ / JTRUE lab, etc.
JEQZ lab, JNEQZ lab, JLTZ lab, JGTZ lab, JLEQZ lab, JGEQZ lab
- Integer compare with zero and branch;
JEQZ labis equivalent to
CONST 0 / EQ / JTRUE lab, etc.
FJEQ lab, FJLT lab, FJGT lab, FJLEQ lab, FJGEQ lab, FJNEQ lab
- Floating point compare and branch;
FJEQ labis equivalent to
FEQ / JTRUE lab, etc.
DJEQ lab, DJLT lab, DJGT lab, DJLEQ lab, DJGEQ lab, DJNEQ lab
- Double-precision compare and branch;
DJEQ labis equivalent to
DEQ / JTRUE lab, etc.
QJEQ lab, QJLT lab, QJGT lab, QJLEQ lab, QJGEQ lab, QJNEQ lab
- Long integer compare and branch;
QJEQ labis equivalent to
QEQ / JTRUE lab, etc.
- Call a procedure with
narguments and no result.
- Call a procedure with
narguments and a one-word result.
These instructions expect to find the
n argument words on the evaluation stack, with the first argument nearest the top, and the static link and address of the procedure on top of them. First, the machine pushes the current program counter value as the return address, then pushes the current value of the base pointer register
bp as the dynamic link and resets
bp to point to the dynamic link. Finally, the program counter is set to the beginning of the procedure.
- Return from a procedure with no result.
- Return from a procedure with a one-word result.
These instructions reverse the frame creation steps described above, resetting the stack pointer, the base pointer, and the program counter to the values they have before the procedure was called. In addition, the
n words of arguments that were pushed before the
CALL instruction are popped from the stack.
- Pop an array bound B and check the index i now at the top of the stack. If the test 0 ≤ i < B fails, report an array bound error on line n.
- Check a pointer p on top of the stack. If it is null, report a null pointer error on line n.
- Check a static link p on top of the stack. If it is not null, report the error that a local procedure is being assigned to a procedure variable.
ZCHECK n, FZCHECK n, DZCHECK n, QZCHECK n
- Check a divisor d on top of the stack. If it is zero, report the error of division by zero.
ERROR e n
- Report runtime error e on line n. The code e should be one of those listed in function
messagein the file
These instructions perform tasks that correspond to specific constructs in Oberon. They may or may not help with the implementation of other languages.
- At the top of the stack are the address of a record descriptor, and below it the address of a record. Pop these, test whether the decriptor is the ancestor of the record at level n, and push a Boolean result.
JCASE n, CASEL lab
- The instruction
JCASE nmust be followed by
CASELinstructions. It pops an integer
ifrom the stack. If
0 <= i < n, the program continues at the corresponding label; otherwise, it continues with the instruction following the last following
- The instruction expects three integers x, lo and hi on the stack. If lo ≤ x ≤ hi, the program contiunes at label lab, otherwise, it continues with the next instruction as usual.
- This instruction expects two address
band an integer
non the stack. It copies a block of
nbytes from address
- This instruction expects to find on the stack the address of an open array parameter and a size in bytes. It allocates space in the stack frame to copy the parameter, then overwrites the parameter with its new address.
- Nested Oberon procedures expect their static link in a special register
LINKinstruction pops a word from the stack and puts it into the
SAVELINKinstruction saves the contents of the
SLregister at offset -4 from the base pointer.
LDEW n, STEW n
- Load and store in enclosing frame;
LDEW nis equivalent to
LDLW -4 / LDNW n, and
LDLW -4 / STNW n. These instructions perform the operations needed to load and store one-word quantities in the stack frame of the statically enclosing procedure by following one link in the static chain.
- The sequence
DUP 0 / CONST n / JGEQ labcan be abbreviated as
CONST n / TESTGEQ lab. This is used in translating CASE statements.
Special operations for the compilers course
To support some of the practicals, I've added a couple of special instructions. Details are given in the lab manual, but should be copied here.
Secret implementation-level instructions
Some of the advertised assembly-level instructions don't exist in the current implementation of Keiko, and are replaced by the assembler with combinations of other instructions, including some secret ones that exist in the implementation only for this purpose. These instructions should not be used in compiler output.
- This is a more compact version of the
CONSTinstruction, optimised for operands that are small integers.
- These instructions pop two floating-point or double-precision values, compare them, and push an integer, -1, 0 or +1, that indicates the ordering relationship between the operands.
PCALLW ninstruction is replaced by a pair
JPROC / SLIDE n+1or
JPROC / SLIDEW n+1. The
JPROCinstruction jumps to the procedure, which returns to the
SLIDEinstruction removes the procedure arguments from the stack; in the case of
SLIDEW, it preserves the result of the procedure as it does so.
RETURNWinstruction is actually equivalent to the pair
RESULTW / RETURN. The
RESULTWpreserves the procedure result in a special register, where it will be found be a subsequent
SLIDEWinstruction, and the
RETURNinstruction does the rest of the job of returning from the procedure.