Code for y := x.f on various machines (Compilers)

Copyright © 2024 J. M. Spivey

This program ptr.p is in picoPascal:

type ptr = pointer to record d, e, f, g: integer end;

var x: ptr; y: integer;

begin
  y := x^.f
end.

It declares x as a pointer to a record with four integer fields. Here is the Keiko code compiled for the assignment y := x^.f using the command ppc ptr.p, without optimisation:

GLOBAL _x
LOADW
CONST 8
OFFSET
LOADW
GLOBAL _y
STOREW

As we'll explore in coming weeks, the GLOBAL _x instruction fetches the address of global variable x, and the following LOADW fetches its value, the address of a heap-allocated record. Next, CONST 8 and OFFSET add the constant 8 to that address to get the address of the field x^.f. Another LOADW fetches the contents of that field, and the final GLOBAL _y and STOREW store the value into the global variable y. Note that the compiler has worked out the layout of the record and translated the reference to field f into the offset 8: the fields d and e that precede f occupy four bytes each.
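To make the stack discipline concrete, here is a minimal sketch in Python of a toy evaluation stack running the seven instructions. It is not the real Keiko interpreter: the addresses 1000, 1004 and 2000, the field values, and the helper names field_offset and run are all made up for illustration, and the layout rule (equal-sized fields in declaration order) is an assumption that happens to reproduce the offset 8.

```python
# Toy model of the unoptimised Keiko sequence for y := x^.f.
# All addresses and stored values below are invented for illustration.

WORD = 4  # assumed size in bytes of an integer field


def field_offset(fields, name, size=WORD):
    """Offset of a field when equal-sized fields are laid out in declaration order."""
    return fields.index(name) * size


def run(code, memory, labels):
    """Execute a list of (opcode, argument) pairs; return the final stack."""
    stack = []
    for op, arg in code:
        if op == "GLOBAL":      # push the address of a global label
            stack.append(labels[arg])
        elif op == "CONST":     # push a constant
            stack.append(arg)
        elif op == "OFFSET":    # pop an offset and a base address, push their sum
            off = stack.pop()
            base = stack.pop()
            stack.append(base + off)
        elif op == "LOADW":     # pop an address, push the word stored there
            stack.append(memory[stack.pop()])
        elif op == "STOREW":    # pop an address, then a value, and store the value
            addr = stack.pop()
            value = stack.pop()
            memory[addr] = value
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack


globals_ = {"_x": 1000, "_y": 1004}          # hypothetical addresses of x and y
memory = {1000: 2000, 1004: 0,               # x points to a record at 2000
          2000: 10, 2004: 20, 2008: 30, 2012: 40}  # fields d, e, f, g

code = [
    ("GLOBAL", "_x"), ("LOADW", None),
    ("CONST", field_offset(["d", "e", "f", "g"], "f")),
    ("OFFSET", None), ("LOADW", None),
    ("GLOBAL", "_y"), ("STOREW", None),
]

run(code, memory, globals_)
print(memory[globals_["_y"]])  # prints 30, the value stored in field f
```

The stack is empty again once the sequence has finished, as we would expect for the code of a complete statement.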

Turning on the peephole optimiser in the compiler with ppc -O ptr.p results in more compact code:

LDGW _x
LDNW 8
STGW _y

Here LDGW _x has the same effect as GLOBAL _x; LOADW, the instruction LDNW 8 is short for CONST 8; OFFSET; LOADW, and STGW _y stands for GLOBAL _y; STOREW. The bytecode interpreter can execute these instructions in three cycles rather than seven.
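The three equivalences just listed can be written down as rewriting rules. The sketch below applies them in a single left-to-right pass; it is only an illustration of the idea, not the way the peephole optimiser in ppc is actually organised, and the function name peephole and the tuple representation of instructions are assumptions of mine.

```python
def peephole(code):
    """Replace known instruction sequences by their abbreviations, left to right."""
    out = []
    i = 0
    while i < len(code):
        ops = [op for op, _ in code[i:i + 3]]   # opcodes of the next few instructions
        if ops[:3] == ["CONST", "OFFSET", "LOADW"]:
            out.append(("LDNW", code[i][1]))    # CONST n; OFFSET; LOADW -> LDNW n
            i += 3
        elif ops[:2] == ["GLOBAL", "LOADW"]:
            out.append(("LDGW", code[i][1]))    # GLOBAL x; LOADW -> LDGW x
            i += 2
        elif ops[:2] == ["GLOBAL", "STOREW"]:
            out.append(("STGW", code[i][1]))    # GLOBAL x; STOREW -> STGW x
            i += 2
        else:
            out.append(code[i])                 # no rule applies: copy unchanged
            i += 1
    return out


unoptimised = [
    ("GLOBAL", "_x"), ("LOADW", None),
    ("CONST", 8), ("OFFSET", None), ("LOADW", None),
    ("GLOBAL", "_y"), ("STOREW", None),
]

for op, arg in peephole(unoptimised):
    print(op, arg)
# prints:
# LDGW _x
# LDNW 8
# STGW _y
```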

We can translate the program using the compiler from Lab 4 and the command ppc -O2 ptr.p to get code for the ARM:

ldr r0, =_x        @ GLOBAL _x
ldr r0, [r0]       @ LOADW
ldr r0, [r0, #8]   @ CONST 8; OFFSET; LOADW
ldr r1, =_y        @ GLOBAL _y
str r0, [r1]       @ STOREW

I've shown the correspondence with Keiko instructions as comments. What we see here is that the ARM back end chooses instructions that cover combinations of one or more Keiko instructions, beginning with something like the unoptimised version where each individual action is specified separately. The two "ldr =" instructions each set a register to the value of a global label: they're implemented as load-register instructions that implicitly use addressing relative to the program counter to get round the fact that full 32-bit constants don't fit in a single ARM instruction.

Compiling a similar program for the JVM gives significantly different results. Here is the Java program rec.java:

class rec {
    public int d, e, f, g;

    public static rec x;
    public static int y;

    public void m() {
        y = x.f;
    }
}

Using javac to compile the program and javap -c to disassemble it, we get the following code for the method m:

getstatic x:Lrec;
getfield f:I
putstatic y:I

Because the JVM is a higher-level virtual machine than Keiko, the Java compiler has not replaced the reference to f with an offset, but leaves the name f explicit in the object code. It's for the interpreter or translator implementing the JVM to decide how it wants to lay out objects of class rec. That gives it more freedom but also more responsibility. It also makes the JVM less flexible: instead of any data structure that can be expressed in terms of address arithmetic, the JVM is restricted to the kinds of object that can be expressed in Java.
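As a rough illustration of what leaving the name explicit buys, the following Python sketch models two imaginary JVM implementations that choose different object layouts and resolve the same symbolic reference to f differently. Everything here is hypothetical: the class ClassInfo, the function resolve_getfield, the header sizes and the rule of laying out fields in declaration order are invented for the example and do not describe any real JVM.

```python
class ClassInfo:
    """Hypothetical run-time description of a loaded class with int fields only."""

    def __init__(self, name, int_fields, header_words):
        self.name = name
        # The runtime, not the compiler, picks the layout: here an assumed
        # object header followed by 4-byte fields in declaration order.
        self.offsets = {f: 4 * (header_words + i) for i, f in enumerate(int_fields)}


def resolve_getfield(cls, field):
    """Turn the symbolic reference in 'getfield f:I' into a byte offset."""
    return cls.offsets[field]


# The same class file, loaded by two imaginary implementations.
rec_a = ClassInfo("rec", ["d", "e", "f", "g"], header_words=2)
rec_b = ClassInfo("rec", ["d", "e", "f", "g"], header_words=1)

print(resolve_getfield(rec_a, "f"))  # prints 16
print(resolve_getfield(rec_b, "f"))  # prints 12
```

The bytecode for m is identical in both cases; only the implementation's private table of offsets differs, which is exactly the freedom that a fixed offset such as LDNW 8 would take away.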

Effect of the instruction ldr r0, =_x

In the assembly language program, _x is a constant, the address of a piece of storage that is allocated by the directive

.comm _x, 4, 4

The program is compiled on the assumption that this address can be an arbitrary 32-bit constant. The instruction

ldr r0, =_x

moves the value of this constant into register r0. For our purposes, we can take it for granted that it does this, but you may like to know that the ARM assembler gathers all the constants _x, _y, _z from each procedure in the program and makes them into a literal table that it then accesses using PC-relative addressing. The literal table for a procedure is placed at the .pool directive that the compiler inserts at the end of the procedure.
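For the curious, the arithmetic behind the literal table can be sketched as follows. The sketch assumes that every instruction is four bytes long, that the table starts immediately after the last instruction listed, and that each distinct constant gets one four-byte slot; the function build_pool and its representation of a procedure are my own invention. The one architectural fact it relies on is that, in ARM state, reading the program counter gives the address of the current instruction plus 8.

```python
def build_pool(instrs, base=0):
    """Assign literal-table slots and compute PC-relative offsets.

    instrs lists, for each 4-byte instruction in order, the symbol loaded by
    an 'ldr rd, =sym' instruction, or None for any other instruction.
    Returns (slots, loads): slots maps each symbol to the address of its
    table entry, and loads lists (instruction address, symbol, offset).
    """
    pool_start = base + 4 * len(instrs)      # table assumed to follow the code
    slots = {}
    for sym in instrs:
        if sym is not None and sym not in slots:
            slots[sym] = pool_start + 4 * len(slots)   # one slot per distinct symbol
    loads = []
    for i, sym in enumerate(instrs):
        if sym is not None:
            addr = base + 4 * i
            # The pc reads as addr + 8, so the load becomes ldr rd, [pc, #offset].
            loads.append((addr, sym, slots[sym] - (addr + 8)))
    return slots, loads


# The five-instruction ARM sequence for y := x^.f shown earlier.
slots, loads = build_pool(["_x", None, None, "_y", None])
print(slots)   # prints {'_x': 20, '_y': 24}
print(loads)   # prints [(0, '_x', 12), (12, '_y', 4)]
```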

In order to get the value of variable x, the ldr = instruction must be followed by another that uses the address to get the value:

ldr r0, =_x
ldr r1, [r0]

It seems wasteful to use two instructions for a simple access to a global variable, but global variable access is (or ought to be) fairly rare, and it often occurs in clusters. For example, the two references to y in y := y+1 can be compiled into four instructions by using the address twice:

ldr r0, =_y
ldr r1, [r0]
add r2, r1, #1
str r2, [r0]

Sharing values in this way will be a crucial part of our compiler back end.