Targeting the MIPS
The Omega2 series of IoT devices have a MIPS32 processor in an attractively small and lightweight package, running a minimal version of Linux called OpenWRT that is often used as firmware for wireless routers. Perhaps the existence of this convenient platform will provide the impetus to make a compiler targetting the MIPS.
- You could retarget the Lab 4 compiler from the Compilers course, aiming to adapt the existing code so as much of it as possible can be shared by back ends for the ARM32 and the MIPS. A chapter of the (otherwise almost impenetrable) book, A retargetable C compiler: design and implementation by Fraser and Hanson, gives helpful information and a machine grammar for the MIPS.
- Alternatively, you could work on a JIT translator for Keiko (as used for the Oxford Oberon Compiler) along the lines of the existing JITs for x86/amd64 and ARM32. This idea might be complicated by the lack of a floating point unit on the Omega2, but it would be possible to copy what is done (via subroutine calls or emulation) by the Omega2 C compiler. The Keiko JIT has a portability interface based on assembly code for an invented, idealised RISC machine that fits well with the MIPS, so the task ought to be manageable. The work would include writing a dynamic assembler that would generate MIPS instructions in binary format, needing a result correct down to the last bit.
Work on either of these options would be aided by the existence of a MIPS emulator as part of QEMU, and a GCC-based cross compiler that runs under Linux on x86 machines. A debugging build using QEMU of the Keiko JIT for ARM already exists and can provide a guide.
All of the above could apply to the AARM64 architecture too, if we can find a stable hardware platform to act as a motivation. The RPi would do, given a usable Linux port that runs in 64 bit mode.
Keiko on MIPS
It's possible to build the Keiko bytecode interpreter under Debian/x86 using a MIPS cross-compiler, then run it using QEMU. It's also possible to install the OpenWRT build environment and use its cross-compiler to generate the flavour of MIPS code needed by the Omega2 and link with the appropriate C library. The resulting interpreter also runs on QEMU, and can be copied to the Omega2 and runs there. Thirdly, it's possible to build OBC on the Omega2+, after installing an SD card and building prerequisite software from source.
Generic MIPS build of Keiko
- Use Debian and install the packages
mipselfor the little-endian variant of MIPS, which is the one used in the Omega2.)
- Clone the OBC-3 repository.
- Configure and build OBC, then change to the
mipstestdirectory and build there. The final step of the build is to run a small test case under QEMU.
(Don't bother with this.)
- Install the OpenWRT toolchain using instructions here: https://github.com/OnionIoT/Onion-Docs/blob/master/Omega2/Documentation/Doing-Stuff/Advanced/Cross-Compile.md
- Then you need to make some symlinks and use a modified
- The resulting executable
obx-mipswill run under QEMU, and can also be copied (together with
Fac.x) to the Omega2.
chmod +x obx-mipsthen
./obx-mips ./Fac.xto run it.
Building on Omega2
- Use an Omega2+ with the 128MB RAM and the SD card slot. A 4GB SD card will be big enough.
- Migrate the root file system onto the SD card: https://docs.onion.io/omega2-docs/boot-from-external-storage.html
- Enable more package repositories, and install the C compiler and Make.
- Install the following from source:
- TCL (needed to build Keiko). Disable all add-on packages by removing the contents of the
pkgsdirectory in the source.
- Clone the OBC repo, configure with
--disable-debuggerand build. Be patient!
Outline of a thunder back-end
- Define symbolic names for the machine's registers, naming them according to assembly language conventions. On RPi, that means
R10, plus special names for
PC, etc. On the MIPS, different classes of register have different naming schemes such as A0--A3, T0--T7, etc. Each symbolic name is a macro that expands to the relevant numeric value used in instruction encodings.
- Establish the arrays of
vm_regstructures that advertise available registers to the client. The Keiko translator rarely finds a use for more than half a dozen integer registers, and uses only one register as callee-save, so there's no need to provide access to every register if there are 32 of them.
- For debugging, provide an array of register names, and routines that are capable of formatting addresses in various modes.
- Define macros
MNEM, etc., that are defined conditionally, and permit opcodes to be paired with a string for debugging. When debugging is turned off, the strings are silently removed during macro expansion. There are some tricks here related to the timing of macro expansion.
- Define function
op_rrr, etc., that both print the assembly language equivalent of an instruction and add it in binary form to the program. There will be one of these for each way of laying out the binary fields of an instruction: put another way, two instructions can share the same formatting routine if they differ only in the numeric value of the fixed parts of the instruction containing the opcode. Each of these routines starts with a call to
vm_debug2and ends with a call to
vm_done. All code will be generated via one of these functions, so that for debugging we can generate a complete listing of the output code.
- If, like the Pi, large constants are accessed by loading from an out-of-line table, then include here routines for managing the table. On a machine that develops large constants in a couple of instructions (
ori), there will be no need for such a table.
- Routines that perform common functions in the way appropriate to the size of embedded constants: e.g., a routine
move_immedthat puts an arbitrary constant in a specified register, and
const_regthat puts it in a scratch register such as
ipon ARM or
R1on the MIPS. You can assume that
const_regwon't be used twice in translating one instruction.
- Routines like
arith_signedthat implement an operation with operands from a register and an arbitrary signed constant; this will put the constant in a register if it needs to do so, or will embed it an immediate field if it can.
- Other routines that implement, e.g., loads and stores with unbounded offsets, using a scratch register if needed for a large offset.
- Perhaps a register map that keeps track of what registers are used in a routine, to avoid saving registers that aren't used. This may not be needed for the MIPS, because there are relatively few callee-save registers anyway.
- Implementation of the code generation interface. This consists of a family of routines with names like
gen3rri– that one is for a three operand instruction where the operands are two registers and an integer constant. The heading of
void gen3rri(operation op, vmreg rega, vmreg regb, int c);
- Inside each such routine, first access the fields
rega->vr_regto find the actual register numbers.
vm_debug1to (optionally) print the Keiko instruction being assembled.
vm_space(0)to ensure space for an instruction. Using
0as the argument makes sure of at least 32 bytes of space, which ought to be enough.
- Next comes a big switch on the opcode, with each branch assembling a different kind of instruction, using the utility routines defined earlier to avoid repeated checking for special cases. Each arm of the switch should be a couple of lines at most.
- Give the switch a default part so unimplemented instructions are caught, not silently ignored.
- That completes each code generation routine.
- There are routines
vm_postludeto generate the top and bottom of each function.
- There's an additional routine
vm_printthat prints a target machine instruction in binary form (typically printing one word in hex in RISC machines), then returns the number of bytes printed. When debugging prints are on, the
vm_doneroutine uses this to print the object code.
Note that, though they provide the interface for client code, the routines like
vm_gen3rri are typically not called directly, but via the overload resolution code that appears at the end of
vm.h. This allows the client to contain calls to an overloaded routine
vm_gen, and calls the appropriate specific routine according to the number and types of the arguments. It uses a C11 (?) feature
_Generic that is sparsely documented, as well as varargs macros for good measure. It's horrible, but I like it.
- The architecture manual describes the MIPS instruction set. Apparently, the Omega2 implements release 2 of the MIPS32 instruction set, but without the floating point instructions.
- A page describing the soft float library routines that come with GCC.
- The book See MIPS Run is a valuable guide to all things MIPS (LMGTFY).
A compiler that is embedded in the execution environment for a virtual machine, and translates code for the virtual machine into native machine code just before each procedure is called. After translation, the native code remains in memory and can be used if the same procedure is called again, but the code is thrown away when the program terminates. Either each procedure (or some other fragment of the program) is translated when it is called for the first time, or an interpreter and a translator co-exist, and procedures are translated if they have been called more than a few times.
A computer designed with a simplified instruction set. Typically, these machines have a large set of uniform registers, a small set of addressing modes, and load/store instructions separate from the instructions that carry out arithmetic operations.
A computer with byte-addressed memory is little-endian if the least significant byte of a multi-byte integer in memory is the one with the lowest address, the same as the address of the word itself. Thus on the ARM as conventionally configured, if the integer 0x1a2b3c4d is stored at address 0x1000, then byte 0x1000 in the memory contains 0x4d, byte 0x1001 contains 0x3c, byte 0x1002 contains 0x2b, and byte 0x1003 contains 0x1a. At first, this seems counter-intuitive, until you realise that the real problem is the way we write numbers in everyday life, with the digit worth 100 at the end, and the one worth 10n-1 at the beginning.
A symbolic representation of the machine code for a program.