Native code from Oberon

Copyright © 2024 J. M. Spivey

I have an Oberon compiler that targets a portable bytecode, and a native-code compiler for a small Pascal-like language whose tree-based IR uses an instruction set similar to that bytecode. It would be nice to combine the two to make a native-code compiler for Oberon, targeting (perhaps) the ARM, or maybe the Thumb variant of ARM.

Actually, there are two native-code compilers: the one used for the course, and an extended, multi-target compiler that contains partial solutions for at least some of the issues listed below.

Issues

  • Garbage collection
    • The front end must provide information that lets the back end produce a map of where pointers are located at runtime, including the temps that are live at each call site.
    • If pointer temps can live only in callee-save registers across a call, we can simplify matters by providing a register map at each call site, and making the callee save (all) registers in a fixed layout.
    • Or we could use the Boehm collector, or no collector at all.
  • Floating point
    • Floating point registers are disjoint from integer registers, so the naive treatment of register allocation in the native compiler will have to be extended a bit. Mostly we can use context to decide which register class to use for each expression, but we'll have to deal with the embarrassing situation where a value is in a float register but needed in an integer register, or vice versa.
    • We'll need to respect the calling convention of the target if float arguments are supposed to be passed in floating-point registers.
  • Short-circuit conditions and method calls
    • The source language may require all boolean expressions to be short-circuit, regardless of context; implementing this may entail allowing straight-line temps at an earlier stage than at present.
    • Method calls use the receiver twice: once for the method lookup, and again for the self parameter. This can also be implemented using a temp, but it's tidier if the temp is allocated to the first argument register. Perhaps it's easiest to have a special tree op for this.
    • Also bounds checks for dynamically allocated arrays – look for places in the postfix code generator where DUP is used.
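
To make the register-map idea under Garbage collection a little more concrete, here is a minimal sketch in Python of how a collector might find pointer roots, assuming that every callee saves all callee-save registers in a fixed layout and that the compiler emits one bitmask per call site. The register set, the frame representation and the names `make_regmap`, `pointer_slots` and `roots_from_stack` are all hypothetical, chosen for illustration; none of them comes from either compiler.

```python
# Assumed callee-save register set, for illustration only; the real set
# depends on the target's calling convention.
CALLEE_SAVE = ["r4", "r5", "r6", "r7", "r8", "r9", "r10"]


def make_regmap(pointer_regs):
    """Encode the callee-save registers that hold pointers at a call site
    as a bitmask, with bit i standing for CALLEE_SAVE[i]."""
    mask = 0
    for reg in pointer_regs:
        mask |= 1 << CALLEE_SAVE.index(reg)
    return mask


def pointer_slots(regmap):
    """Decode a bitmask into indices of the fixed-layout save area."""
    return [i for i in range(len(CALLEE_SAVE)) if regmap & (1 << i)]


def roots_from_stack(frames, call_site_maps):
    """Collect the pointer values held in registers across calls.

    frames: list of frames, innermost first. Each frame is a dict with
      'return_addr': identifies the call site in the caller, and
      'saved': the caller's callee-save registers, as saved by the callee
               in the fixed layout given by CALLEE_SAVE.
    call_site_maps: dict from return address to the bitmask that the
      compiler produced for that call site.
    """
    roots = []
    for frame in frames:
        # A call site with no entry is treated as having no live pointers.
        regmap = call_site_maps.get(frame["return_addr"], 0)
        for i in pointer_slots(regmap):
            roots.append(frame["saved"][i])
    return roots


# Hypothetical example: two active calls, with made-up addresses.
example_maps = {
    0x120: make_regmap(["r4", "r6"]),  # caller keeps pointers in r4 and r6
    0x200: make_regmap(["r5"]),        # its caller keeps a pointer in r5
}
example_frames = [
    {"return_addr": 0x120, "saved": [0x8000, 7, 0x8010, 0, 0, 0, 0]},
    {"return_addr": 0x200, "saved": [1, 0x9000, 2, 0, 0, 0, 0]},
]
example_roots = roots_from_stack(example_frames, example_maps)
```

Because the save area has the same shape in every frame, the collector needs no per-procedure layout information for registers: the bitmask at the call site is enough. Pointers in local variables or spilled temps would still need a separate frame map, which this sketch leaves out.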