Example program for OBC

Copyright © 2024 J. M. Spivey

Here is a simple Oberon program:

MODULE Foo;

IMPORT Out;

PROCEDURE f(x: INTEGER): INTEGER;
  VAR y: INTEGER;
BEGIN
  y := x+1;
  RETURN y * y
END f;

BEGIN
  Out.String("f(7) = "); 
  Out.Int(f(7), 0);
  Out.Ln
END Foo.

And here is the Keiko code that is output by the Oberon compiler (version 2.7.2):

!! SYMFILE #Foo 0x00020702 #Foo.%main 1
!! END 0x464cf613
!! 
MODULE Foo 0x464cf613 0
IMPORT Out 0x35d7b86a
ENDHDR

PROC Foo.f 1 1 0x00020001
! PROCEDURE f(x: INTEGER): INTEGER;
!   y := x+1;
LDLW 12
INC
STLW -4
!   RETURN y * y
LDLW -4
LDLW -4
TIMES
RETURN
END

PROC Foo.%main 0 0 0x00020001
WORD Foo.%1
WORD Out.String
WORD Foo.f
WORD Out.Int
WORD Out.Ln
!   Out.String("f(7) = "); 
CONST 8
LDKW 0
LDKW 1
CALL 2
!   Out.Int(f(7), 0);
CONST 0
CONST 7
LDKW 2
CALLW 1
LDKW 3
CALL 2
!   Out.Ln
LDKW 4
CALL 0
RETURN
END

! String "f(7) = "
DEFINE Foo.%1
STRING 66283729203D2000

! End of file

Let's look at it line by line:

!! SYMFILE #Foo 0x00020702 #Foo.%main 1
!! END 0x464cf613

Lines 1–2 describe the interface of module Foo. They are treated as comments by the bytecode assembler, but would be read by the Oberon compiler when it compiled a module that imports this one. Since Foo does not export anything, the interface description is very short. It contains a version code 0x00020702 for the Oberon compiler, so that the compiler can check that it is not reading files left over from an earlier version of the compiler. There is also a checksum 0x464cf613 for the interface, used to ensure that all modules that import Foo are recompiled if the interface of Foo changes.
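As a rough illustration of the version check, the sketch below reads the SYMFILE line from the listing and splits the version code into bytes. The reading that the code packs the compiler version (2.7.2 here) one component per byte is an inference from this single listing, not a documented format, and the helper names decode_version and parse_symfile_header are hypothetical.

```python
def decode_version(code: int) -> tuple:
    """Split a 32-bit version code into (major, minor, patch).

    Assumption: the three low-order bytes hold the version components.
    """
    return ((code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF)


def parse_symfile_header(line: str) -> tuple:
    """Extract the module name and version code from a '!! SYMFILE' line."""
    fields = line.split()
    # Expected shape: ['!!', 'SYMFILE', '#Foo', '0x00020702', '#Foo.%main', '1']
    if fields[:2] != ["!!", "SYMFILE"]:
        raise ValueError("not a SYMFILE line")
    return fields[2].lstrip("#"), int(fields[3], 16)


name, code = parse_symfile_header("!! SYMFILE #Foo 0x00020702 #Foo.%main 1")
print(name, decode_version(code))
```

A compiler performing the check described above would compare the decoded code with its own and reject the file on a mismatch.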

MODULE Foo 0x464cf613 0
IMPORT Out 0x35d7b86a
ENDHDR

Lines 4–6 tell the linker that module Foo imports module Out, and give the interface checksums for both modules. This information allows the linker to check that the modules imported by each module are loaded before it, and that the interface of a module has not been changed without recompiling all modules that import it. Line 4 also gives the 'line count' of Foo as 0, indicating that it was not compiled with instructions for line-count profiling.
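The two checks just described can be sketched as follows. This is not the actual linker: the function check_link_order and its input format (a list of module headers in load order) are assumptions made for illustration only.

```python
def check_link_order(modules):
    """Check load order and interface checksums.

    modules: list of (name, checksum, imports) in load order, where imports
    is a list of (imported_name, expected_checksum) pairs.
    Returns a list of error messages (empty if all is well).
    """
    loaded = {}   # module name -> interface checksum of the loaded copy
    errors = []
    for name, checksum, imports in modules:
        for imp, expected in imports:
            if imp not in loaded:
                errors.append(f"{name}: import {imp} is not loaded before it")
            elif loaded[imp] != expected:
                errors.append(f"{name}: {imp} has changed; recompile {name}")
        loaded[name] = checksum
    return errors


# Headers taken from the listing: Out is loaded first, then Foo.
good = [
    ("Out", 0x35D7B86A, []),
    ("Foo", 0x464CF613, [("Out", 0x35D7B86A)]),
]
print(check_link_order(good))
```

Loading the same modules in the opposite order, or changing the checksum recorded for Out, makes the function report an error for Foo.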

PROC Foo.f 1 1 0x00020001
! PROCEDURE f(x: INTEGER): INTEGER;

Lines 8–19 are the code for procedure f; they include lines from the source code for f as comments. Line 8 is the beginning of a procedure named Foo.f that uses 4 bytes of space for local variables and has a 'frame map' of 0. The space for local variables is allocated in the stack frame for the procedure as part of the process of calling it. The frame map tells the garbage collector which words of the stack frame may contain pointers into the heap; in this case, none of them do.

!   y := x+1;
LDLW 12
INC
STLW -4

Lines 11–13 are the code for y := x+1: first the value of x is pushed on the stack by loading it from offset 12 in the stack frame, then this value is incremented, and finally it is stored as the value of y at offset -4.

!   RETURN y * y
LDLW -4
LDLW -4
TIMES
RETURN
END

Lines 15–18 are the code for RETURN y * y: the value of y is pushed on the stack twice, then these two values are multiplied together, and the result becomes the value that is returned to the caller of f.

Evidently, this procedure body could be simplified. For example, it is not necessary to store the intermediate result into the stack frame at all: the sequence STLW -4 / LDLW -4 / LDLW -4 / TIMES could be replaced by DUP / TIMES. On a machine with a fast stack, the shorter sequence would be much quicker. As bytecode, however, there is less advantage to be gained, as each of the instructions can be encoded as a single byte, and access to memory is just as fast as access to the (simulated) stack.
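To see that the two instruction sequences compute the same result, here is a minimal sketch of an interpreter for just the instructions used in f. It is a toy model, not the Keiko virtual machine: the frame is a dictionary from byte offsets to values, integers are unbounded rather than 32-bit, and the function name run is hypothetical.

```python
def run(code, frame):
    """Interpret a tiny subset of Keiko-like instructions.

    frame maps byte offsets in the stack frame to word values.
    Returns the value on top of the evaluation stack at RETURN.
    """
    stack = []
    for instr in code:
        op, *args = instr.split()
        if op == "LDLW":        # push the local word at the given offset
            stack.append(frame[int(args[0])])
        elif op == "STLW":      # pop a word and store it at the given offset
            frame[int(args[0])] = stack.pop()
        elif op == "INC":       # add one to the top of the stack
            stack.append(stack.pop() + 1)
        elif op == "DUP":       # duplicate the top of the stack
            stack.append(stack[-1])
        elif op == "TIMES":     # multiply the top two values
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction {instr!r}")
    raise ValueError("fell off the end without RETURN")


compiled = ["LDLW 12", "INC", "STLW -4", "LDLW -4", "LDLW -4", "TIMES", "RETURN"]
simplified = ["LDLW 12", "INC", "DUP", "TIMES", "RETURN"]

# The argument x = 7 lives at offset 12 in the frame.
print(run(compiled, {12: 7}))
print(run(simplified, {12: 7}))
```

Both calls print 64, the value of f(7).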

PROC Foo.%main 0 0 0x00020001
WORD Foo.%1
WORD Out.String
WORD Foo.f
WORD Out.Int
WORD Out.Ln

Lines 21–43 are the code for the main program, which is converted into a procedure named Foo.%main by the compiler; the linker arranges that all such procedures (corresponding to module bodies) are called in sequence when the program runs.

The body of this procedure consists of three calls to procedures exported by the module Out; one of these has an argument that is computed by calling f. In order to make these procedures accessible to the main program, they are put into the constant pool for Foo.%main by listing them immediately after the PROC directive, together with the address Foo.%1 of the string constant "f(7) = ".
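The correspondence between the WORD directives and the operands of LDKW can be sketched like this. The helper constant_pool is hypothetical and only models the ordering described above; it says nothing about how the assembler actually lays out the pool.

```python
def constant_pool(proc_lines):
    """Collect the WORD directives that directly follow a PROC line, in order."""
    pool = []
    for line in proc_lines[1:]:
        fields = line.split()
        if not fields or fields[0] != "WORD":
            break               # the pool ends at the first non-WORD line
        pool.append(fields[1])
    return pool


main = [
    "PROC Foo.%main 0 0 0x00020001",
    "WORD Foo.%1",
    "WORD Out.String",
    "WORD Foo.f",
    "WORD Out.Int",
    "WORD Out.Ln",
    "CONST 8",
    "LDKW 0",
]

pool = constant_pool(main)
for n, symbol in enumerate(pool):
    print(f"LDKW {n} pushes the value of {symbol}")
```

So LDKW 0 refers to the string constant Foo.%1, and LDKW 2 refers to Foo.f.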

!   Out.String("f(7) = "); 
CONST 8
LDKW 0
LDKW 1
CALL 2

Lines 28–31 are the code for Out.String("f(7) = "). When execution reaches the CALL instruction, the following values are on the stack:

- The address and length of the string "f(7) = ". The length is 8 (counting the terminating null), pushed by the instruction CONST 8, and the address is the value of the symbol Foo.%1, pushed onto the stack by the instruction LDKW 0.
- The code address for Out.String, pushed by the instruction LDKW 1; this is actually the address of the descriptor for the procedure.

After the procedure returns, all this information has disappeared from the stack.

!   Out.Int(f(7), 0);
CONST 0
CONST 7
LDKW 2
CALLW 1
LDKW 3
CALL 2

Lines 33–38 are the code for the call to Out.Int, and within them at lines 34–36 is the code to call procedure f. Thus the constant 0 pushed on line 33 is the second parameter for the call Out.Int(f(7), 0); this remains on the stack throughout the evaluation of f(7). Since f returns a result, its call on line 36 uses a CALLW instruction, causing a one-word result to be retrieved from the stack frame of the call and put onto the evaluation stack of the calling procedure.
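The nesting of the two calls can be traced with a small model of the evaluation stack. This is a sketch under simplifying assumptions: procedures are ordinary Python functions rather than descriptors, CALL n and CALLW n are modelled as popping the procedure and then n argument words (first argument on top), and the names execute, out_int and printed are hypothetical stand-ins.

```python
def execute(code, pool):
    """Run CONST/LDKW/CALL/CALLW; return the stack depth after each step."""
    stack, depths = [], []
    for instr in code:
        op, arg = instr.split()
        n = int(arg)
        if op == "CONST":
            stack.append(n)
        elif op == "LDKW":
            stack.append(pool[n])
        elif op in ("CALL", "CALLW"):
            proc = stack.pop()
            # The first argument was pushed last, so it is popped first.
            args = [stack.pop() for _ in range(n)]
            result = proc(*args)
            if op == "CALLW":   # only CALLW leaves a result on the stack
                stack.append(result)
        else:
            raise ValueError(f"unknown instruction {instr!r}")
        depths.append(len(stack))
    return depths


printed = []


def f(x):
    y = x + 1
    return y * y


def out_int(value, width):
    """Stand-in for Out.Int that records its output instead of printing."""
    printed.append(str(value).rjust(width))


# Only pool entries 2 (Foo.f) and 3 (Out.Int) are needed for this fragment.
pool = [None, None, f, out_int, None]
depths = execute(
    ["CONST 0", "CONST 7", "LDKW 2", "CALLW 1", "LDKW 3", "CALL 2"], pool
)
print(depths)
print(printed)
```

The depth trace shows the constant 0 staying on the stack while f(7) is evaluated, and the stack being empty again once Out.Int has been called.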

!   Out.Ln
LDKW 4
CALL 0
RETURN
END

The main program ends with a call to Out.Ln.

! String "f(7) = "
DEFINE Foo.%1
STRING 66283729203D2000

Lines 46–47 define the symbol Foo.%1 as the address of a string constant. The characters of this string are given in hexadecimal, so that neither the compiler nor the assembler needs to deal with the problem of escaping special characters in strings.
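The hexadecimal form is easy to check by hand or with a few lines of code. The sketch below decodes the STRING operand from the listing and encodes the original text again; the helper names are hypothetical, and treating each byte as a Latin-1 character is an assumption that happens to be harmless for this ASCII string.

```python
def decode_string(hex_digits: str) -> str:
    """Turn a STRING operand back into text, dropping the final null byte."""
    data = bytes.fromhex(hex_digits)
    if not data.endswith(b"\x00"):
        raise ValueError("string constant is not null-terminated")
    return data[:-1].decode("latin-1")


def encode_string(text: str) -> str:
    """Produce the hexadecimal form of a null-terminated string constant."""
    return (text.encode("latin-1") + b"\x00").hex().upper()


print(repr(decode_string("66283729203D2000")))
print(encode_string("f(7) = "))
```

The operand holds eight bytes, which matches the length 8 pushed by CONST 8 before the call to Out.String.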