Specification of Keiko (Compilers)
This page describes the assembly language of the Keiko machine that is used as a target for the compilers in the course. Some features are omitted if they are not used in the course.
Contents |
Program structure
A Keiko program begins with a header, which for our purposes always consists of the text,
MODULE Main 0 0 IMPORT Lib 0 ENDHDR
The header is followed by a sequence of definitions; each may be a procedure, a global variable, or a string constant.
Blank lines are ignored, as are lines that begin with the comment character !.
Procedures
A procedure begins with a PROC directive and ends with an END directive; between them appears the bytecode of the procedure.
-
PROC p n 0 0
- Begin a procedure named p that uses n bytes of space for local variables. (The two zeroes replace information that is used when the runtime system contains a garbage collector.)
-
END
- End a procedure.
Global variables and string constants
-
GLOVAR x n
- Define x as a global variable occupying n bytes of space. All global variables are allocated on a 4-byte boundary.
-
DEFINE x
- Define x as a label for the string constant that follows.
-
STRING hex
- Treat hex as a sequence of pairs of hexadecimal digits, each pair corresponding to one byte. Store the sequence of bytes as a string constant. Each
STRINGdirective generates exactly the number of bytes specified, so several such directives can be used next to each other to build up a long string. If the string is to be zero-terminated, then the last such directive should end with a zero byte, denoted by00.
Apart from the directives listed above, all other elements in a Keiko program are bytecode instructions that may appear only between a PROC directive and its matching END. The different instructions are listed below in groups according to their function; in each group, a basic set of instructions is listed separately from a set of compound instructions that can each be defined as equivalent to a specific sequence of basic instructions.
Note that there's no guarantee that an implementation of Keiko will actually provide all the instructions listed here. Some instructions may be implemented by a sequence of others: specifically, some of the less common instructions that are described as being compound may be expanded by the assembler into the equivalent sequence of basic instructions. Also, some other operations may be implemented by other 'secret' instructions; for example, floating point comparisons are implemented by a special FCMP instruction followed by an integer comparison.
Addressing, load and store
Basic instructions:
-
CONST n
- The constant n can be a decimal or hexadecimal constant; its value is pushed on the stack.
-
FCONST x, DCONST x, QCONST n
- Push a floating-point, double-precision or long-integer constant.
-
GLOBAL x
- Push the address of a global variable x (declared with a
GLOVARdirective).
-
LOCAL n
- Push the address of a local variable that is at offset n from the base pointer.
-
LOADC, LOADS, LOADW, LOADD
- Pop an address and push the contents of the (8, 16, 32, 64)-bit location it names.
-
STOREC, STORES, STOREW, STORED
- Pop an address and a one- or two-word value and store the value into the (8, 16, 32, 64)-bit location named by the address.
-
PLUSA
- Pop an address and an integer offset push their sum. This instruction behaves like integer addition, but is kept separate to help the JIT translator.
-
DUP n
- Duplicate the one-word item that is
nwords from the top of the stack – soDUP 0duplicates the top element on the stack.
-
SWAP
- Swap two one-word items at the top of the stack.
-
POP n
- Pop
none-word items from the stack.
Compound instructions:
-
LDLC n, LDLS n, LDLW n, LDLD n
- Load from local variable;
LDLs nis equivalent toLOCAL n / LOADs.
-
STLC n, STLS n, STLW n, STLD n
- Store to local variable;
STLs nis equivalent toLOCAL n / STOREs.
-
LDGC x, LDGS x, LDGW x, LDGD x
- Load from global variable;
LDGs xis equivalent toGLOBAL x / LOADs. Here x is a symbol that refers to an address in global storage, and the instruction loads from that address.
-
STGC x, STGS x, STGW x, STGD x
- Store to global variable;
STGs xis equivalent toGLOBAL x / STOREs.
-
INDEXC, INDEXS, INDEXW, INDEXD
- Scaled addition;
INDEXsis equivalent toPUSH n / TIMES / PLUSAwhere n is 1, 2, 4 or 8.
-
LDIC, LDIS, LDIW, LDID
- Indexed load;
LDIsis equivalent toINDEXs / LOADs.
-
LDNW n, STNW n
- Constant indexed load and store;
LDNW nis equivalent toPUSH n / PLUSA / LOADW, etc.
Integer arithmetic
All arithmetic instructions expect their operands on the evaluation stack and leave their result on the stack. The binary operations expect to find their right-hand operand on top of the stack and their left-hand operand just beneath it.
-
PLUS, MINUS, TIMES, DIV, MOD
- Binary arithmetic operations on integers. (
DIVandMODare guaranteed to give mathematically correct results).
-
UMINUS
- Unary minus on integers
-
EQ, LT, GT, LEQ, GEQ, NEQ
- Integer comparisons with a Boolean result
-
AND, OR, NOT
- Boolean operations. As in C, these treat any non-zero operand as true, and produce a result that is either 1 or 0.
-
BITAND, BITOR, BITXOR, BITNOT
- Bitwise boolean operations.
-
BIT
- Given a number between 0 and 31, this unary operation generates a word with the only corresponding bit set. The LSB is numbered 0 and the MSB is numbered 31.
-
LSL, LSR, ASR
- Shift instructions.
-
QPLUS, QMINUS, QTIMES, QUMINUS, QDIV, QMOD
- Long integer (64 bit) arithmetic operations.
-
QEQ, QLT, QGT, QLEQ, QGEQ, QNEQ
- Long integer (64 bit) comparisons.
Compound operations:
-
INC, DEC
- Increment and decrement;
INCis equivalent toCONST 1 / PLUS, etc.
-
INCL n, DECL n
- Increment or decrement local variable;
INCL nis equivalent toLDLW n / INC / STLW n, etc.
-
QINC, QDEC
- Long integer increment and decrement;
QINCis equivalent toQCONST 1 / QPLUS, etc.
-
BITSUB
- Bitwise difference, equivalent to
BITNOT / BITAND
Floating point
-
FPLUS, FMINUS, FTIMES, FDIV, FUMINUS
- Floating point arithmetic.
-
FEQ, FLT, FGT, FLEQ, FGEQ, FNEQ
- Floating-point comparisons with a Boolean result.
-
DPLUS, DMINUS, DTIMES, DDIV, DUMINUS
- Double-precision arithmetic.
-
DEQ, DLT, DGT, DLEQ, DGEQ, DNEQ
- Double-precision comparisons with a Boolean result.
Conversions
-
CONVNF
- Convert integer to floating point.
-
CONVND
- Convert integer to double precision.
-
CONVFD
- Convert floating point to double precision.
-
CONVDF
- Convert double precision to floating point.
-
CONVNC
- Convert integer to unsigned character, masking off unwanted bits.
-
CONVNS
- Convert integer to signed short, sign extending the result from 16 to 32 bits.
-
CONVNQ
- Convert integer to long integer
-
CONVQN
- Convert long integer to integer
-
CONVQD
- Convert long integer to double precision
Jumps
Labels in the assembler are written as decimal integers.
-
JUMP lab
- Jump to label
lab.
-
JUMPF lab, JUMPT lab
- Pop a Boolean value and jump to label
labif the Boolean is false or if it is true, respectively.
-
LABEL lab
- An assembler directive: attach the label
labto the next instruction.
Compound operations, each equivalent to a comparison followed by a conditional jump:
-
JEQ lab, JLT lab, JGT lab, JLEQ lab, JGEQ lab, JNEQ lab
- Integer compare and branch;
JEQ labis equivalent toEQ / JTRUE lab, etc.
-
JEQZ lab, JNEQZ lab, JLTZ lab, JGTZ lab, JLEQZ lab, JGEQZ lab
- Integer compare with zero and branch;
JEQZ labis equivalent toCONST 0 / EQ / JTRUE lab, etc.
-
FJEQ lab, FJLT lab, FJGT lab, FJLEQ lab, FJGEQ lab, FJNEQ lab
- Floating point compare and branch;
FJEQ labis equivalent toFEQ / JTRUE lab, etc.
-
DJEQ lab, DJLT lab, DJGT lab, DJLEQ lab, DJGEQ lab, DJNEQ lab
- Double-precision compare and branch;
DJEQ labis equivalent toDEQ / JTRUE lab, etc.
-
QJEQ lab, QJLT lab, QJGT lab, QJLEQ lab, QJGEQ lab, QJNEQ lab
- Long integer compare and branch;
QJEQ labis equivalent toQEQ / JTRUE lab, etc.
Procedure call
-
CALL n
- Call a procedure with
narguments and no result.
-
CALLW n
- Call a procedure with
narguments and a one-word result.
These instructions expect to find the n argument words on the evaluation stack, with the first argument nearest the top. (If the procedure has a static link, then that is treated as an extra argument by Keiko, and the extra argument is counted in n.) On top of the arguments is the address of the procedure.
Firdt, the machine pushes the current program counter value as the return address, then pushes the current value of the base pointer register bp as the dynamic link and resets bp to point to the dynamic link. Finally, the program counter is set to the beginning of the procedure.
-
RETURN
- Return from a procedure with no result.
-
RETURNW
- Return from a procedure with a one-word result.
These instructions reverse the frame creation steps described above, resetting the stack pointer, the base pointer, and the program counter to the values they have before the procedure was called. In addition, the n words of arguments that were pushed before the CALL instruction are popped from the stack.
Runtime checks
-
BOUND n
- Pop an array bound B and check the index i now at the top of the stack. If the test 0 ≤ i < B fails, report an array bound error on line n.
-
NCHECK n
- Check a pointer p on top of the stack. If it is null, report a null pointer error on line n.
-
GCHECK n
- Check a static link p on top of the stack. If it is not null, report the error that a local procedure is being assigned to a procedure variable.
-
ZCHECK n, FZCHECK n, DZCHECK n, QZCHECK n
- Check a divisor d on top of the stack. If it is zero, report the error of division by zero.
-
ERROR e n
- Report runtime error e on line n. The code e should be one of those listed in function
messagein the filexmain.c.
Language-specific operations
These instructions perform tasks that correspond to specific constructs in Oberon. They may or may not help with the implementation of other languages.
-
TYPETEST n
- At the top of the stack are the address of a record descriptor, and below it the address of a record. Pop these, test whether the decriptor is the ancestor of the record at level n, and push a Boolean result.
-
JCASE n, CASEL lab
- The instruction
JCASE nmust be followed bynof theCASELinstructions. It pops an integerifrom the stack. If0 <= i < n, the program continues at the corresponding label; otherwise, it continues with the instruction following the last followingCASELinstruction.
-
JRANGE lab
- The instruction expects three integers x, lo and hi on the stack. If lo ≤ x ≤ hi, the program contiunes at label lab, otherwise, it continues with the next instruction as usual.
-
FIXCOPY
- This instruction expects two address
aandband an integernon the stack. It copies a block ofnbytes from addressbto addressa.
-
FLEXCOPY
- This instruction expects to find on the stack the address of an open array parameter and a size in bytes. It allocates space in the stack frame to copy the parameter, then overwrites the parameter with its new address.
-
LINK, SAVELINK
- Nested Oberon procedures expect their static link in a special register
SL. TheLINKinstruction pops a word from the stack and puts it into theSLregister; theSAVELINKinstruction saves the contents of theSLregister at offset -4 from the base pointer.
-
LDEW n, STEW n
- Load and store in enclosing frame;
LDEW nis equivalent toLDLW -4 / LDNW n, andSTEW ntoLDLW -4 / STNW n. These instructions perform the operations needed to load and store one-word quantities in the stack frame of the statically enclosing procedure by following one link in the static chain.
-
TESTGEQ lab
- The sequence
DUP 0 / CONST n / JGEQ labcan be abbreviated asCONST n / TESTGEQ lab. This is used in translating CASE statements.
Special operations for the compilers course
To support some of the practicals, I've added a couple of special instructions. Details are given in the lab manual, but should be copied here.
Secret implementation-level instructions
Some of the advertised assembly-level instructions don't exist in the current implementation of Keiko, and are replaced by the assembler with combinations of other instructions, including some secret ones that exist in the implementation only for this purpose. These instructions should not be used in compiler output.
-
PUSH n
- This is a more compact version of the
CONSTinstruction, optimised for operands that are small integers.
-
FCMP, DCMP
- These instructions pop two floating-point or double-precision values, compare them, and push an integer, -1, 0 or +1, that indicates the ordering relationship between the operands.
-
JPROC -
SLIDE n -
SLIDEW n
- Each
CALL norCALLW ninstruction is replaced by a pairJPROC / SLIDEorJPROC / SLIDEW n. TheJPROCinstruction jumps to the procedure, which returns to theSLIDEinstruction. TheSLIDEinstruction removes the procedure arguments from the stack; in the case ofSLIDEQ, it preserves the result of the procedure as it does so.
-
RESULTW
- The
RETURNWinstruction is actually equivalent to the pairRESULTW / RETURN. TheRESULTWpreserves the procedure result in a special register, where it will be found be a subsequentSLIDEWinstruction, and theRETURNinstruction does the rest of the job of returning from the procedure.