Tenth

Copyright © 2024 J. M. Spivey

I don't know what terminology to use ('direct threaded' or 'indirect threaded' or 'token threaded'), but execution in Tenth works like this:

  • The body of a defined word is (primarily) a sequence of 16-bit tokens, each being the unique representation of a Tenth word.
  • From such a token, we can find the address of the word's definition by splitting the token into two parts: a segment number (may be a single bit) and an offset. At the minimum, there are two segments, one pre-compiled and resident in ROM, and the other in RAM to allow for dynamic definitions; there's a small page table that gives the fixed base addresses of the segments. The offset is multiplied by 4 and added onto one of these base addresses. With two segments, that leaves 15 bits for the offset, which can be used to address 128kB of ROM and 128kB of RAM.
  • At a fixed offset in the definition of a word is its action, an address that the inner interpreter jumps to when executing the word.
    • Some words have actions written in machine code. Such actions are able to access additional halfwords from the body and adjust machine registers: that;
    • Others are defined words, for which there is an action enter that saves the ip in the R-stack and sets it to point to the body;
    • Others are primitives implemented as subroutines, for which there is an action call that invokes the routine.
  • The definition of each word contains a 32-bit data field that gives the address of the body in the case of defined words, and the address of the subroutine in the case of primitives.
  • Definitions are chained together to allow finding words from their names. This can be done by letting each definition contain the 16-bit token for the next definition.
  • Token zero naturally corresponds to the action e_n_d that returns from a word to its caller by popping the R-stack. Then the bodies of defined words can be zero-terminated and that has the expected effect. That means that page 0 will be the ROM in a system that has definitions preloaded.
  • Putting definition headers as well as bodies in ROM means that these definitions cannot be changed at runtime. Perhaps a new kind of word with a hook in RAM is needed. The existing mechanism for allocating RAM variables during bootstrapping is a bit clunky.
  • In RAM, def headers, strings, and body code will be intermingled. That doesn't have to happen in ROM if it's convenient to have several regions with different alignment constraints. Only the region containing the def headers need be accessible via tokens.
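
The token-to-address mapping described above can be sketched in C as follows. This is only an illustration under stated assumptions: the top bit of the token is taken as the segment number, and the base addresses (ROM at 0x00000000, RAM at 0x20000000) and the names page_table, token_to_addr and addr_to_token are hypothetical, not taken from the Tenth sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split: 1 segment bit (the top bit), 15 offset bits. */
#define SEG_BITS 1
#define OFF_BITS (16 - SEG_BITS)
#define OFF_MASK ((1u << OFF_BITS) - 1)
#define NUM_SEGS (1u << SEG_BITS)

/* Hypothetical page table: fixed base address of each segment. */
static const uint32_t page_table[NUM_SEGS] = {
    0x00000000u,   /* segment 0: pre-compiled definitions in ROM */
    0x20000000u    /* segment 1: dynamic definitions in RAM */
};

/* Find the address of a definition from its 16-bit token. */
static uint32_t token_to_addr(uint16_t tok)
{
    uint32_t seg = tok >> OFF_BITS;     /* which segment */
    uint32_t off = tok & OFF_MASK;      /* offset in units of 4 bytes */
    return page_table[seg] + 4u * off;
}

/* Inverse mapping: only 4-byte aligned addresses within the segment's
   128kB window have a token. */
static uint16_t addr_to_token(unsigned seg, uint32_t addr)
{
    assert(seg < NUM_SEGS);
    uint32_t rel = addr - page_table[seg];
    assert(rel % 4 == 0 && rel / 4 <= OFF_MASK);
    return (uint16_t)((seg << OFF_BITS) | (rel / 4));
}
```

With this layout the largest offset, 0x7FFF, maps to byte 0x1FFFC of a segment, which is the last 4-byte slot of a 128kB region. Token 0 maps to the very start of the ROM segment, which fits the idea that page 0 is the ROM.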

Bootstrapping

As much as possible of any application should be written in Tenth and preloaded. That means running Tenth on the host in order to load definitions. Two options:

  • Use emulation and the interpreter core that is written in Thumb assembly language.
  • Use a portable core (with the 'big switch'), and an enumeration for the actions in place of the machine code addresses.

In either case, a dumping routine is needed that recovers enough symbolic information to create source code for the ROM image.
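
As a rough illustration of the second option, here is a minimal sketch in C of a portable core that dispatches on an enumeration of actions with a 'big switch'. It makes several simplifying assumptions that are not in the design above: a token is used directly as an index into a table of definitions (no segment split), the data field is a host pointer rather than a 32-bit address, and the words two, add, four and eight are hypothetical examples.

```c
#include <stddef.h>
#include <stdint.h>

/* Actions as an enumeration, in place of machine code addresses. */
enum action { A_END, A_ENTER, A_CALL };

struct def {
    enum action action;
    union {
        const uint16_t *body;   /* defined words: address of the body */
        void (*prim)(void);     /* primitives: address of the subroutine */
    } data;
};

static int32_t dstack[16];          /* data stack */
static int dsp;
static const uint16_t *rstack[16];  /* R-stack of saved ip values */
static int rsp;

/* Two hypothetical primitives implemented as subroutines. */
static void p_two(void) { dstack[dsp++] = 2; }
static void p_add(void) { dsp--; dstack[dsp - 1] += dstack[dsp]; }

/* Hypothetical tokens; token zero is the word that returns. */
enum { T_END, T_TWO, T_ADD, T_FOUR, T_EIGHT };

/* Zero-terminated bodies of two defined words. */
static const uint16_t body_four[]  = { T_TWO, T_TWO, T_ADD, T_END };
static const uint16_t body_eight[] = { T_FOUR, T_FOUR, T_ADD, T_END };

static const struct def defs[] = {
    [T_END]   = { A_END,   { .body = NULL } },
    [T_TWO]   = { A_CALL,  { .prim = p_two } },
    [T_ADD]   = { A_CALL,  { .prim = p_add } },
    [T_FOUR]  = { A_ENTER, { .body = body_four } },
    [T_EIGHT] = { A_ENTER, { .body = body_eight } }
};

/* Execute one word; returns when control comes back to the top level. */
static void run(uint16_t tok)
{
    const uint16_t *ip = NULL;      /* NULL marks the top level */
    for (;;) {
        const struct def *d = &defs[tok];
        switch (d->action) {        /* the 'big switch' */
        case A_ENTER:               /* save ip, point it at the body */
            rstack[rsp++] = ip;
            ip = d->data.body;
            break;
        case A_CALL:                /* invoke the primitive's routine */
            d->data.prim();
            break;
        case A_END:                 /* return to the caller */
            if (rsp == 0) return;
            ip = rstack[--rsp];
            break;
        }
        if (ip == NULL) return;     /* nothing left to execute */
        tok = *ip++;                /* fetch the next token */
    }
}
```

Calling run(T_EIGHT) on fresh stacks leaves the single value 8 on the data stack. Because the definitions hold enumeration values and table entries rather than machine addresses, an image built this way stays symbolic, which may make it easier to dump as source code for the ROM.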