How to add primitives to OBC

Copyright © 2024 J. M. Spivey

The Oxford Oberon-2 compiler uses a runtime system that either interprets bytecodes or dynamically translates them to machine code. It allows library procedures that are written in C, and that is how much of the standard library is implemented. It's possible to add your own C-language primitives and link them into a specialised runtime system, or to make a shared library and get the standard runtime system to link to it.

To add a primitive, such as the Unix function usleep, first declare it, giving the types of the arguments and the result as with any procedure. This declaration can be put in any module: either the one where the primitive is used, or a module that collects together the declarations of all the extra primitives used in the program.

PROCEDURE usleep(t: INTEGER) IS "usleep";

The name given as a string is the C function corresponding to the primitive.

Value parameters of type CHAR, BOOLEAN, SHORTINT, INTEGER, LONGINT, REAL and LONGREAL correspond naturally with parameters in C. Parameters of aggregate or pointer type and other VAR parameters are passed as pointers. In the case of open array parameters, only a pointer to the elements is passed to the C function, and not the bounds – but you can always write prim(a, LEN(a)) to pass the bound explicitly.
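As an illustration of the open-array convention, here is a minimal sketch of a C primitive that receives a pointer to the elements plus an explicit length. The name sum_ints and the matching Oberon declaration in the comment are hypothetical, and the sketch assumes that INTEGER corresponds to C int on the target platform.

```c
/* Hypothetical primitive: sums the first n elements of an integer array.
   Assumed Oberon declaration (illustrative, not part of OBC):
     PROCEDURE SumInts(a: ARRAY OF INTEGER; n: INTEGER): INTEGER IS "sum_ints";
   Assumes INTEGER corresponds to C int. */
int sum_ints(const int *a, int n) {
    int total = 0;
    /* Only the element pointer arrives from Oberon, so the bound n
       must be passed explicitly by the caller. */
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

On the Oberon side the call would then be written SumInts(a, LEN(a)), so that the C code never has to guess the bound.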

Having declared the primitive, we must either compile the program in a special way (Method A) or, if the runtime system was built to include the Foreign Function Interface library libffi, just run the program and have it find the primitive dynamically (Method B). These notes assume a reasonably standard Linux environment, or Mac OS X with occasional differences in the commands. On Windows, OBC is built by using Cygwin to produce a binary that will run on plain Windows; Method A could be made to work inside Cygwin, but Method B will not work. The standard primitives that are part of the runtime system are all called using Method A, for speed and also to support systems where libffi is not available.

In Release 3.1 and later

Method A

Compile the program using the special flag -C at link time:

$ obc -c Think.m
$ obc -C -o think Think.k

(or combine the two steps into one with obc -C -o think Think.m.) This will create an executable called 'think'. It's quite a bit bigger than typical executables built with OBC because it contains (thanks to the flag -C) a custom version of the runtime system with the new primitive linked into it.

On a platform that supports dynamic linking sufficiently well, that mechanism is used to find the primitive when procedure usleep is first called, though in this instance the primitive is statically linked into the runtime system. OBC will also work on other platforms that do not support dynamic linking, and there the OBC linker compiles a table of primitives in advance that is used to find them at runtime.
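The idea of a table of primitives compiled in advance can be pictured with the following sketch. It is only an illustration of a name-to-function lookup table, not OBC's actual data structure; every name in it (prim_entry, prim_table, find_prim, demo_answer, demo_zero) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Generic function-pointer type used to store primitives of any signature;
   a caller must cast back to the real type before calling. */
typedef void (*prim_fn)(void);

struct prim_entry {
    const char *name;   /* the string given after IS in the declaration */
    prim_fn fn;         /* address of the corresponding C function */
};

/* Two stand-in primitives for the demonstration. */
int demo_answer(void) { return 42; }
int demo_zero(void) { return 0; }

/* The table is fixed at link time and ends with a NULL sentinel. */
static const struct prim_entry prim_table[] = {
    { "demo_answer", (prim_fn) demo_answer },
    { "demo_zero",   (prim_fn) demo_zero },
    { NULL, NULL }
};

/* Look a primitive up by name; returns NULL if it is not in the table. */
prim_fn find_prim(const char *name) {
    for (const struct prim_entry *p = prim_table; p->name != NULL; p++)
        if (strcmp(p->name, name) == 0)
            return p->fn;
    return NULL;
}
```

A table like this needs no support from the operating system's dynamic linker, which is why the approach suits platforms without one.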

The steps above work for calling a C function that is part of the standard library. To add C functions of your own, say the function mysqrt in file mylib.c, just declare them in the same way:

PROCEDURE sqrt(x: REAL): REAL IS "mysqrt";

then name the C file on the OBC command line:

$ obc -C -o main Main.m mylib.c

Your file of C code will be compiled with the system C compiler and linked into the custom runtime system.
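For concreteness, mylib.c might look like the sketch below. The body is purely illustrative: it assumes that REAL corresponds to C float, and it uses Newton's iteration rather than the C library so that the file has no dependencies of its own.

```c
/* mylib.c: hypothetical contents for the mysqrt primitive.
   Assumes the Oberon type REAL corresponds to C float. */
float mysqrt(float x) {
    if (x <= 0.0f)
        return 0.0f;                 /* simplification: no error reporting */

    /* Start at or above the true root so the iteration decreases to it. */
    float r = x > 1.0f ? x : 1.0f;

    /* A fixed number of Newton steps, r := (r + x/r) / 2; one hundred
       is more than enough to converge across the range of float. */
    for (int i = 0; i < 100; i++)
        r = 0.5f * (r + x / r);
    return r;
}
```

Because the function takes and returns ordinary C values, no adapter code appears in the file; the stub that bridges the two worlds is generated by OBC.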

Method B

On systems that support libffi, you can just compile the program as normal, and the adapter stub that is needed to link the Oberon world with the C world will be generated dynamically when the program runs. This works both for the bytecode interpreter and for the JIT implementation.

To add your own C code, you can compile it into a DLL, then dynamically load the DLL from the initialisation block of any module. Use the procedure SYSTEM.LOADLIB:

SYSTEM.LOADLIB("./mylib.so")

Then compile your C code and the Oberon program:

$ obc -o main Main.m
$ gcc -fPIC -shared mylib.c -o mylib.so

On Mac OS X, the command line is a bit different (untested):

$ gcc -fPIC -bundle -undefined dynamic_lookup mylib.c -o mylib.so

The resulting executable and DLL together are much smaller than the single executable under Method A, because they don't contain a copy of the entire OBC runtime system.

In Release 3.0

The compiler in Release 3.0 cannot generate the stub functions itself, so a manual method is needed.

Method A: static linking

Here's how to add a primitive to an Oberon program built with OBC by making a specialised runtime system. There are three parts: writing the primitive in C, declaring the primitive in Oberon, and writing a client program that uses the primitive. For generality, I will separate the three parts into three source files – though it's possible to declare a primitive and use it in the same Oberon module. The OBC sources have the C code and the Oberon code in the same file, with the C code embedded in Oberon comments, and there's a script that extracts the C code before it is compiled. But that process is just too complicated to describe here.

First, we write the following in a file prim.c.

#include <unistd.h>
#include "obx.h"

PRIMDEF void Sleep_Usec(value *bp) {
     usleep(bp[HEAD+0].i);
     ob_res.i = 42;
}

This defines a function Sleep_Usec that expects a pointer into the Oberon stack; it retrieves an integer with the expression bp[HEAD+0].i and passes that to the Unix system call usleep(). For the sake of the demonstration, it then returns the value 42 by assigning to a global variable ob_res. Both this variable and the stack slot bp[HEAD+0] have the union type value defined in the header file obx.h, with alternatives for floating point and string pointers among others. It's a shame that the adapter stub has to be written by hand, but there is no automated way of doing it at present.
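To experiment with the calling convention outside OBC, the sketch below replaces obx.h with stand-in definitions. The union members, the value of HEAD and the empty PRIMDEF are placeholders chosen only so that the example compiles on its own; the real definitions come from obx.h and may differ. The stub Demo_Double is hypothetical.

```c
/* Stand-ins for what obx.h really provides; placeholders only. */
typedef union {
    int i;              /* integer alternative, as used in the text */
} value;
#define HEAD 3          /* placeholder; the real value comes from obx.h */
#define PRIMDEF         /* placeholder; expands to nothing here */

value ob_res;           /* global result slot, as described in the text */

/* Hypothetical stub: fetches its INTEGER argument from the stack frame
   and returns twice that value through ob_res. */
PRIMDEF void Demo_Double(value *bp) {
    ob_res.i = 2 * bp[HEAD+0].i;
}
```

A test harness can then build a fake frame, store an argument in slot HEAD+0, call Demo_Double and read the result back from ob_res.i, mimicking what the runtime system does around a real primitive.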

Now write the following in a file 'Sleep.m':

MODULE Sleep;

PROCEDURE Usec*(usec: INTEGER): INTEGER IS "Sleep_Usec";

END Sleep.

This just declares and exports a procedure Usec; but instead of a body for the procedure, the phrase IS "Sleep_Usec" appears, associating the Oberon procedure with the C primitive defined earlier. It's possible, of course, to put C code for many primitives in the same file, and to declare them all in the same Oberon module.

Lastly, write the following in file 'Main.m':

MODULE Main;

IMPORT Sleep, Out, Files;

VAR ans: INTEGER;

BEGIN
  Out.String("Thinking..."); Files.Flush(Files.stdout);
  ans := Sleep.Usec(2000000);
  Out.String("done"); Out.Ln;
  Out.String("The answer is "); Out.Int(ans, 0); Out.Ln  
END Main.

As you'll see, this imports the Sleep module we just defined and uses the Usec procedure just as if that procedure had been written in Oberon.

To compile all parts and link them together, use the following Unix command:

$ obc -C -o think Sleep.m Main.m prim.c

This will create an executable called 'think'. It's quite a bit bigger than typical executables built with OBC because it contains (thanks to the flag -C) a custom version of the runtime system with the new primitive linked into it. You can get the same result by compiling the pieces separately and then linking them together.

$ obc -c Sleep.m
$ obc -c Main.m
$ obc -c prim.c
$ obc -C -o think Sleep.k Main.k prim.o

Naturally, the Sleep module must be compiled and linked before the Main module that uses it.

This method works unchanged on Mac OS X.

On a platform that supports dynamic linking sufficiently well, that mechanism is used to find the primitive when procedure Sleep.Usec is first called, though in this instance the primitive is statically linked into the runtime system. OBC will also work on other platforms that do not support dynamic linking, and there the OBC linker compiles a table of primitives in advance that is used to find them at runtime.

Method B: Dynamic linking

A much smaller executable can be obtained by making the new primitive into a dynamic library that can be loaded by the standard runtime system. This method is less well supported by the obc script, and requires us to invoke the C compiler directly. We can leave the files prim.c and Main.m unchanged, but we need to rewrite the module Sleep.m so that it loads the appropriate dynamic library when the program starts.

MODULE Sleep;

IMPORT DynLink;

PROCEDURE Usec*(usec: INTEGER): INTEGER IS "Sleep_Usec";

BEGIN
  DynLink.Load("./prim.so")
END Sleep.

The program will load the dynamic library when the Sleep module is initialised, and will find the entry Sleep_Usec when the procedure Sleep.Usec is first invoked.

To compile this program, we must first create a shared library containing the C-language primitive.[1]

$ gcc -m32 -fPIC -shared -I /usr/local/lib/obc prim.c -o prim.so

The -m32 is required when building on an amd64 machine, as the runtime system is 32-bit code; it can be omitted on i386 or RPi. On Mac OS X, the command line is a bit different:

$ gcc -m32 -fPIC -bundle -undefined dynamic_lookup -I /usr/local/lib/obc prim.c -o prim.so

Before or after building the shared object, we can compile and link the Oberon parts on their own (in one command or several).

$ obc -o think Sleep.m Main.m

The two files think and prim.so add up to about 10K together, but depend on the shared JIT-based runtime system from /usr/local/lib/obc/obxj, whereas the statically linked think from the previous section (which embeds substantially all of obxj) comes to more than 300K.


  1. Depending on where OBC is installed, you might need to replace /usr/local with /usr in these commands.