FAQ archive (Compilers)

From Spivey's Corner

How would one go about implementing pass by value for structures (records)? And how can a structure be returned by value?

Like arrays passed by value, records can be passed by value by having the caller pretend they are being passed by reference, then making the subroutine prolog copy them into the stack frame. The Keiko machine has a FIXCOPY instruction designed especially for this purpose. Records, like arrays, cannot on typical machines be the direct result of a procedure, so compilers for languages that allow records as results typically transform the program so that the caller allocates space for the result, then passes the address of this space as an extra, hidden, parameter. The alternative, used in Java-like languages, is to say that values of array or record type are implicitly pointers to heap-allocated data, so that passing a record or array by value or returning it as a result simply passes the address of some storage with indefinite lifetime.
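The caller-allocates transformation can be pictured as a source-to-source rewrite on the abstract syntax tree. Here is a minimal sketch in OCaml, with a made-up AST and a hypothetical returns_record predicate — neither is from the lab code:

```ocaml
(* Hypothetical AST fragments -- not the lab's actual types *)
type expr =
  | Var of string
  | Addr of string                  (* address of a variable *)
  | Call of string * expr list

type stmt =
  | Assign of string * expr
  | ProcCall of string * expr list  (* call for effect only *)

(* Assumed predicate: does f return a record by value? *)
let returns_record f = (f = "midpoint")

(* r := f(args) becomes f(addr r, args): the caller passes the
   address of the result variable as a hidden first parameter *)
let rewrite s =
  match s with
  | Assign (r, Call (f, args)) when returns_record f ->
      ProcCall (f, Addr r :: args)
  | s -> s
```

A real compiler must also introduce a temporary when the result is used inside a larger expression, but the assignment case shows the essential idea.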

Why does the compiler generate non-short-circuit code for boolean expressions in statements like b := (i < max) & (a[i] = x)?

There's no deep reason, just that the implementation of gen_expr does nothing special with ANDs and ORs, so that the translation of the statement shown is

LDGW _i / LDGW _max / LT / (load a[i]) / LDGW _x / EQ / AND

where the LT, EQ, AND instructions, like PLUS, pop two values from the stack and push the result of a binary operation. The whole expression is evaluated whatever the truth value of the first comparison. More care is taken in the code generator of my Oberon compiler, which you can find on another page.
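The contrast can be sketched in OCaml, using strings for Keiko instructions. This is an illustration, not the lab's gen_expr: the Binop case treats And like any other binary operator, so both operands are always evaluated; the short-circuit variant assumes a JEQZ instruction that pops the stack top and jumps if it is zero.

```ocaml
type expr =
  | Const of int
  | Binop of string * expr * expr   (* "And", "Lt", "Plus", ... *)

(* Naive translation: And compiles to an AND instruction just as
   Plus compiles to PLUS -- no jumps, so no short circuit *)
let rec gen_expr e =
  match e with
  | Const n -> [Printf.sprintf "CONST %d" n]
  | Binop (op, e1, e2) ->
      gen_expr e1 @ gen_expr e2 @ [String.uppercase_ascii op]

let counter = ref 0
let fresh () = incr counter; Printf.sprintf "L%d" !counter

(* Short-circuit variant: jump round the second operand when the
   first is already false (JEQZ is an assumed instruction) *)
let rec gen_short e =
  match e with
  | Binop ("And", e1, e2) ->
      let lfalse = fresh () in
      let ldone = fresh () in
      gen_short e1 @ [ "JEQZ " ^ lfalse ]
      @ gen_short e2 @ [ "JEQZ " ^ lfalse ]
      @ [ "CONST 1"; "JUMP " ^ ldone;
          "LABEL " ^ lfalse; "CONST 0"; "LABEL " ^ ldone ]
  | _ -> gen_expr e
```

The short-circuit version is longer and needs fresh labels, which is exactly the extra care taken in the Oberon compiler's code generator.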

How long does the peephole optimiser take to terminate?

A preliminary question is, 'Does the peephole optimiser always terminate anyway?' Provided each rule replaces a group of instructions by a shorter group, the program shrinks at each step, and it cannot go on shrinking forever. In effect, the length of the program is a bound function for the optimiser loop.

The optimisation process is divided into scans, with each scan working forward through the program, applying rules as often as possible at each instruction before moving on to the next instruction. The only way a scan can leave possible rewrites that can be done in the next scan is if a rewrite at one point in the program enables another overlapping rewrite at a slightly earlier place in the program. For a given set of rules, we could find the length of the longest pattern, and make the peephole optimiser terminate in only one scan by moving back that far after each rule application. But it's simpler just to run another scan after any scan that makes changes, and in practice we see that the process terminates after two or three scans at most.
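The scan-and-repeat structure can be sketched like this. The two rules are made-up examples, not the lab's rule set, but each strictly shortens the program, which is what guarantees termination:

```ocaml
(* Two illustrative rules; a real optimiser has many more.
   Each replaces a group of instructions by a shorter group. *)
let rule = function
  | "CONST 0" :: "OFFSET" :: rest -> Some rest   (* adding offset 0 *)
  | "NOT" :: "NOT" :: rest -> Some rest          (* double negation *)
  | _ -> None

(* One scan: work forward, applying rules as often as possible at
   each position before moving on. Report whether anything changed,
   so the driver knows whether to run another scan. *)
let rec scan code =
  match rule code with
  | Some code' -> let (_, r) = scan code' in (true, r)
  | None ->
      (match code with
       | [] -> (false, [])
       | i :: rest -> let (c, r) = scan rest in (c, i :: r))

(* The length of the program is the bound function: it shrinks at
   every rewrite, so this loop must terminate. *)
let rec optimise code =
  let (changed, code') = scan code in
  if changed then optimise code' else code'
```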

The story is complicated a bit by jumps, because something like

JUMP L3 / LABEL L3 / LABEL L4

can be optimised by deleting the instruction LABEL L3 and making labels L3 and L4 equivalent. After that, it becomes apparent that the JUMP L3 instruction leads to the next instruction and can be deleted; but this only becomes apparent in the next scan.

What happens if two rules match in the peephole optimiser but yield different results?

It's possible that two left-hand sides match the same instruction sequence, in which case the conventions of OCaml determine that the first rule listed is the one that's applied. But it's also possible that two rules match non-overlapping portions of the instruction sequence, and the rule that applies depends on the wrapper that systematically tries all rules at all locations in the sequence of instructions. In this circumstance, we'd rather not have a situation where the result of the optimiser depends on the detailed scheduling of the rules.

This kind of thing can happen with the instruction sequence

LOCAL 24 / CONST 12 / OFFSET / LOADW

which loads a word from an address calculated by adding the offset 12 to address 24 in the current stack frame. This instruction sequence can be made more compact by using optimiser rules to substitute other instructions. The rule

LOCAL a / CONST n / OFFSET --> LOCAL (a+n)

folds the addition of the offset into the LOCAL instruction, giving

LOCAL 36 / LOADW
Then the rule

LOCAL a / LOADW --> LDLW a

can introduce an LDLW instruction as an abbreviation for the LOCAL / LOADW pair, giving the single instruction LDLW 36.

The same instruction sequence can be rewritten in another way, because there's a rule

CONST n / OFFSET / LOADW --> LDNW n

that introduces the compact LDNW instruction, often used to extract a field from a record, given the address of the record. This yields the sequence

LOCAL 24 / LDNW 12

that cannot be simplified further, unless we add the additional rule

LOCAL a / LDNW n --> LDLW (a+n)

to deal with exactly this situation.

The whole problem is similar to one that arises in automated theorem proving, where the slang term is 'hacking around Knuth--Bendix'. The name refers to a classic paper by Donald Knuth and Peter Bendix entitled 'Simple word problems in universal algebras' that is probably easiest to find in Knuth's volume, Selected Papers on Design of Algorithms.

How does the peephole optimiser work on Keiko trees, as opposed to the old way with sequences of Keiko instructions?

For optrees, the peephole optimiser naturally splits into two parts: a jump optimiser that looks at the jumps and labels appearing at the roots of the trees, and a separate simplifier that walks over each tree separately. The jump optimiser can look for the same patterns of jumps and labels as before, and implement rules like

JNEQ a / JUMP b / LABEL a --> JEQ b / LABEL a.

The simplifier can match patterns like

<BINOP Plus, t, <CONST 0>> --> t

(Look in modules Jumpopt and Simp in lab 4.)
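The tree-walking style can be sketched as a bottom-up rewrite. The optree type here is made up for the example, not the one from lab 4; children are simplified first, so the rules at the root see already-simplified subtrees:

```ocaml
type optree =
  | Const of int
  | Local of int
  | Binop of string * optree * optree

(* Simplify children first, then try rules at the root *)
let rec simp t =
  match t with
  | Binop (op, t1, t2) ->
      (match op, simp t1, simp t2 with
       | "Plus", u, Const 0 -> u    (* <BINOP Plus, t, <CONST 0>> --> t *)
       | "Plus", Const 0, u -> u
       | "Times", u, Const 1 -> u
       | _, u1, u2 -> Binop (op, u1, u2))
  | _ -> t
```

Because the walk is bottom-up, a rule firing at one node can enable another rule at its parent within the same pass, which replaces the repeated scans needed for flat instruction sequences.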

What is a scratch register?

A register that is used only within the translation of a small fragment of code, perhaps a single operation.

What is the base pointer (and what's the purpose of having it), if the stack pointer points to the bottom of the stack, and the frame pointer points to the current frame?

The base pointer is the same as the frame pointer – different machines use different names. The ARM has a register fp that is used for this purpose. I don't believe I used the term 'base pointer' in this year's materials.

ldrb loads a byte into a register. But ARM registers are 32 bits and hold 4 bytes; is there an extra implicit operation to splice the byte into a word? Could we optimise an array of characters by fetching 4 characters at a time?

The ldrb instruction puts the loaded byte in the least significant end of the destination register, and fills the rest with three zero bytes. Yes, you can fetch four characters at once, but then you need to shift and mask to access them individually. Several ldrb instructions will not be much slower, provided the word stays in the cache as it likely will. For well-aligned strings, you can copy them quickly by doing it a word at a time, but looking for the terminating zero byte in C style strings becomes a challenge.
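To make 'shift and mask' concrete, here is what extracting byte i of a little-endian 32-bit word looks like, together with the well-known bit trick for testing whether a word contains a zero byte. Both are standalone illustrations, not anything from the labs:

```ocaml
(* byte i (0 = least significant) of a 32-bit word *)
let byte w i = (w lsr (8 * i)) land 0xff

(* zero-byte test: (w - 0x01010101) sets the top bit of each byte
   where a borrow occurred (which includes every zero byte);
   ANDing with the complement of w removes the false positives *)
let has_zero_byte w =
  (w - 0x01010101) land (lnot w) land 0x80808080 <> 0
```

A word-at-a-time string copy would use has_zero_byte to decide when to drop back to byte-by-byte processing for the final word.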

Instructions are presumably stored in the RAM. But fetching from RAM takes potentially hundreds of cycles. How does the fetcher keep up with the processor's voracious appetite for instructions?

By using a cache for instructions as well as data: sometimes two separate caches, sometimes one cache that can support twice the data rate.

When we compile Keiko to ARM, is the only use of the evaluation stack (the space between fp and sp) for when we run out of registers?

No, space for local variables is allocated below fp and addressed at negative offsets from fp. Also, outgoing arguments beyond the first four are put in storage addressed at positive offsets from sp.

How on earth can a JIT compiler be faster than the Keiko virtual machine in the labs? A compiler takes a while to parse, build the abstract syntax tree, optimise, and convert to machine code before eventually running it, yet JIT compilers seem to be so much faster.

The JIT starts from the bytecode, which is effectively an IR, and the time to decode it for translation (once) can be compared with the time needed to decode it for execution (possibly many times). Different JITs have different degrees of ambition as regards the quality of native code they output. Mine is very simple, practically translating each bytecode individually, but the Java JITs can tune the amount of optimisation they do, sometimes retranslating code more carefully if it turns out to be heavily used.

I often get told that the stack is magically faster than the heap. Is stuff stored on the stack inherently faster to load/store than stuff on the heap? Or does it just happen to be allocated to registers more often?

Access to heap storage may involve an extra indirection compared with storage in the local stack frame; but more importantly, heap-allocated storage must, in the general case, be recycled (if it is recycled at all) by garbage collection, not by the natural contraction of the subroutine stack when a subroutine returns.

How are interfaces implemented in a compiler? In classes, the methods are always going to be at a fixed offset in the vtable. But for interfaces, how do we know the correct address of the method?

Do you mean interfaces in the Java sense? If so, then there are various methods of implementing dynamic dispatch. Some of them involve hash tables, with collisions often resolved by generating code on the fly. Interface method call will always be potentially a bit more expensive than calling a method in the ordinary way.


Could you recommend a document which summarises the ARM instructions we commonly use? (Akin to the "Specification Of Keiko" document)

This set of slides is not bad: http://simplemachines.it/doc/arm_inst.pdf – provided you realise that an ordinary program runs entirely in 'user mode' and the other modes are only interesting to those implementing an operating system. So as far as we are concerned, the ARM has 16 registers plus the status flags, not 37 or whatever. Also, we won't be exploiting conditional instructions like addeq – which performs an addition conditional on two values being equal – but only conditional branches like beq. So it's all a bit simpler than suggested in those slides.

A hastily put together guide to the subset we use can be found on another page here.

Are some of the diagrams of optrees in the notes mislabelled?

Yes, they are. In particular, the registers used in the code that is derived from a tiling may not always agree with the registers shown on the diagram. Too much done by hand to get the details right! Sorry.

How could we go about changing our compiler to instead be an interpreter?

There's a whole range of possibilities, since nearly every language implementation does some preprocessing of the program text, followed by run-time interpretation of the results of that preprocessing, so in a sense every implementation involves an interpreter to some extent.

  • For Keiko, we translate the program text into [Keiko assembly language, then that gets assembled into] bytecode, and there's a run-time interpreter for the bytecode. AFAIK, the standard implementation of Python (for example) is rather like this: the program text is internally translated into code for some kind of abstract machine that is then interpreted. The abstract machine code can be saved in a .pyc file to speed up subsequent runs.
  • For a machine language compiler, the program text is translated into [assembly language which is assembled into] binary machine code, and then there's an interpreter for machine code that's implemented in hardware.

Perhaps what's most obviously missing in this course is an example at the other end of the spectrum, where relatively little preprocessing is done, and most things are worked out at run-time. A good place to settle is with an interpreter where the run-time representation of the program is an abstract syntax tree. Happily, many examples of such interpreters are to hand, because the third year course on Principles of Programming Languages is based around a sequence of such 'source level interpreters' for different languages.
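For flavour, here is the smallest possible example of that 'source-level interpreter' style, with a made-up expression AST: evaluation walks the syntax tree directly at run time, looking variables up in an environment, with no preprocessing beyond parsing.

```ocaml
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr
  | If of expr * expr * expr        (* 0 is false, anything else true *)

(* the program's run-time representation IS the syntax tree *)
let rec eval env e =
  match e with
  | Num n -> n
  | Var x -> List.assoc x env
  | Add (a, b) -> eval env a + eval env b
  | If (c, t, f) -> if eval env c <> 0 then eval env t else eval env f
```

Everything a compiler would decide once — which variable a name denotes, which branch of the If is live — is re-decided on every evaluation, which is exactly the trade-off between preprocessing and run-time work described above.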

Which procedures can be called at each point in a (non-higher-order) program?

The procedures that are visible in the body of each procedure P are those that are in scope, i.e., those that are declared locally to P, those that are declared in one of the nest of procedures that enclose P, and those that are global. In a block-structured language, the scopes form a tree, where the parent of each scope is the scope that encloses it. The procedures callable from P are the children of its ancestors, including P's own children and the siblings of P.

What code can be used to 'follow static links n-m+1 times'?

Given a pointer to the base of a frame, we can fetch the static link from the frame with the sequence

CONST 12 / OFFSET / LOADW

(which can be abbreviated LDNW 12). To call a procedure that is local to the current procedure (i.e., when m = n+1), we pass as the static link the value of our own base pointer, which can be obtained with the instruction LOCAL 0. So the general picture is

LOCAL 0 / (n-m+1) times LDNW 12

As a slight optimisation, LOCAL 0 / LDNW 12 is equivalent to LDLW 12.
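The recipe 'LOCAL 0, then n−m+1 copies of LDNW 12' is easy to render as a little code-generating function. This is a sketch using strings for instructions (it assumes m ≤ n+1, since you can only call procedures that are in scope), with the slight LDLW optimisation applied at the end:

```ocaml
(* static link for a call from a procedure at level n to a procedure
   at level m: follow the chain of static links n-m+1 times *)
let gen_static_link n m =
  match "LOCAL 0" :: List.init (n - m + 1) (fun _ -> "LDNW 12") with
  | "LOCAL 0" :: "LDNW 12" :: rest ->
      "LDLW 12" :: rest               (* the slight optimisation *)
  | code -> code
```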


Why not generate code for the x86?

In previous iterations of this course, we used almost a sanitized subset of the x86 to good effect; and you may still find some references to this in the course materials. It was a 'subset' because there were many features of the x86 that we ignored, as most compilers do. I say 'almost' a subset because the x86 has a number of annoying restrictions that we just ignored. And I say 'sanitized' because we used a cleaned-up syntax for the assembly language. This year, as an experiment, I'm using the ARM instead. It's attractive because it's a simple, modern architecture with a bigger set of registers, all uniform, and unlike our invented and sanitized subset of x86 code, there is plenty of tutorial material to be found online.

What's the difference between the instructions CALL and PCALL?

Some other languages implemented on Keiko don't have nested procedures, or don't pass the static link in the same way that we do, and they use the CALL n instruction to call procedures; it expects the procedure address and n arguments on the stack. We want to pass the static link as if it were an extra argument, so to avoid confusion I introduced the PCALL instruction, defined so that PCALL n is a synonym for CALL n+1; similarly, PCALLW n is a synonym for CALLW n+1, an instruction that calls a procedure that expects n arguments and a static link and returns a one-word result. I hope I've managed to eradicate all mention of the CALL instruction from what you read, but some mentions might leak through.

The Keiko machine, by the way, decomposes these instructions into smaller pieces – an operation JPROC that jumps to the procedure, and an operation SLIDE that tidies up the stack after the procedure has returned.


What font do you use in the printed materials for the course?

The fonts all come from the Lucida family, which is well set up for use with TeX. The body font is Lucida Bright.



How did the cleaned-up version of the x86 differ from the real machine?

A good example of the cleaning up we did is that it's possible to access bits 0 to 7 of the EAX register of the x86 as an 8-bit quantity, as in the instruction mov al, x that stores these 8 bits in the location x. But it's not possible to do the same with the low-order 8 bits of some other registers, such as ESI, that are otherwise equivalent to EAX. We just ignored that problem.

If you ignore restrictions like that, doesn't it make it pure luck whether your object code will work?

If we simply ignored them, then yes. But what we did was to use a special assembler that plasters over the cracks. In the language of this assembler, you can write,

stb %0, x

to store the low-order byte of register %0 (equivalent to EAX), and you can write

stb %5, x

to do the same thing taking the value from register %5 (equivalent to ESI) instead. The assembler inserts some extra code to get round the restrictions in the processor, but we won't need to be concerned with that. Actually, the special assembler was made up of a translation pass in front of the ordinary assembler for the x86, and that made it possible to see the actual x86 code if you really wanted to.

Arising from lectures

How can we prove whether a grammar is LR(0) or SLR or LALR or LR(1)?

A grammar is LR(0) if each state of the LR(0) automaton that permits a reduction (i.e. has an item with the dot at the right) contains only that reduction – so that neither a shift nor any other reduction is possible.

A grammar is SLR if, in each state that permits one or more reductions, the decision whether to shift or reduce (or the decision which reduction to perform) can be made on the basis of FOLLOW sets. That is, if a state contains the item [S → α ·] then we compare FOLLOW(S) with the set of tokens that can be shifted in the state (and also with the FOLLOW sets of other possible reductions). If these sets are disjoint, then the grammar is SLR.

A grammar is LALR if a similar condition holds, based not on FOLLOW sets but on lookahead sets, which are always subsets of the FOLLOW sets. I'm not telling you how to compute lookahead sets, and I won't be asking you to classify grammars as LALR or not, unless they are either SLR or not LR(1).

A grammar is LR(1) if it has a correct LR(1) parsing table. A grammar is not LR(1) if it is ambiguous. And a grammar also fails to be LR(1) if there is any viable prefix and lookahead token where different completions of the sentence require different immediate actions. To prove that a grammar is not LR(1), find two sentences that share a common prefix, but require different actions at the end of that prefix, despite the stack contents and the lookahead token being the same. For example, if you can say, "When you see an identifier, you can't tell whether it's a continuation of this expression or the start of a new one," then that's a sign the grammar is not LR(1).

Are conditions evaluated with the short-circuit method even if they appear in parentheses?

Yes: we can be sure of that, because parentheses are used by the parser to determine the shape of the abstract syntax tree for an expression, but don't appear explicitly in the tree itself. So adding superfluous parentheses to an expression does not affect the abstract syntax tree that is built, and thus cannot affect the code our compiler generates for it.
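A tiny recursive-descent sketch (a made-up grammar, not the lab parser) shows where the parentheses go: the LPAR case simply returns the inner tree, so no trace of them survives into the AST.

```ocaml
type expr = Num of int | Add of expr * expr
type token = NUM of int | PLUS | LPAR | RPAR

let rec parse_primary = function
  | NUM n :: rest -> (Num n, rest)
  | LPAR :: rest ->
      let (e, rest') = parse_expr rest in
      (match rest' with
       | RPAR :: rest'' -> (e, rest'')   (* parentheses vanish here *)
       | _ -> failwith "expected )")
  | _ -> failwith "syntax error"

and parse_expr toks =
  let (e, rest) = parse_primary toks in
  match rest with
  | PLUS :: rest' ->
      let (e', rest'') = parse_expr rest' in (Add (e, e'), rest'')
  | _ -> (e, rest)
```

Parsing (1 + 2) and 1 + 2 yields the identical tree, so every later phase — including the code generator — is oblivious to the parentheses.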

Type metrics consist of a size and an alignment; what is the alignment for?

Typically, byte-addressed machines have restrictions about what addresses can be used for quantities that are larger than a byte. For example, there may be a rule that a 4-byte integer must be aligned so that its address is a multiple of 4. That would mean that an array of bytes could begin at any address, but an array of integers would have to be given an address that was a multiple of 4. So the type 'array 10 of integer' has size 40 and alignment 4, but the type 'array 40 of char' has the same size but alignment 1. When the compiler lays out local storage for a subroutine or lays out a record type, it must leave gaps in order to satisfy these alignment restrictions, so that a character followed by an integer would leave a gap of 3. Sometimes space could be saved by shuffling the fields so that they appear e.g. in decreasing order of alignment, but that doesn't really matter much.
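The gap-leaving layout rule is a rounding-up function plus a fold over the fields. Here is a sketch (fields given as size, alignment pairs; the total size is itself rounded up to the record's overall alignment so that arrays of the record stay aligned):

```ocaml
(* round off up to the next multiple of alignment a *)
let align a off = (off + a - 1) / a * a

(* lay out a list of (size, alignment) fields in declaration order;
   returns the field offsets and the padded total size *)
let layout fields =
  let step (offs, off, maxa) (size, a) =
    let o = align a off in            (* leave a gap if necessary *)
    (o :: offs, o + size, max maxa a)
  in
  let (offs, off, maxa) = List.fold_left step ([], 0, 1) fields in
  (List.rev offs, align maxa off)
```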

Recent x86 machines support unaligned loads and stores (of, say, a 4-byte quantity at an address that is not a multiple of 4) for backwards compatibility, but it's likely that they do so inefficiently, perhaps turning a load into two loads of adjacent words followed by some shifting and bitwise or-ing. So it's desirable to obey some alignment restrictions even when the architecture doesn't insist on them.

What's all that stuff about minus signs in integer constants?

If you don't make the minus sign in -2147483648 part of a single token with the digits, then the token 2147483648 is an integer constant whose value doesn't fit in 32 bits. In practice, this need not be a problem, because the result of - 2147483648 will be correct modulo 2^32 and therefore equal to -2147483648. But if in theory your language doesn't allow constants that don't fit in 32 bits, then you will be led to make a complicated exception for this case. Such an exception appears in Section 3.10.1 of the Java Language Specification. None of this is important, except as an example of the 1001 details that need to be got right in language and compiler design in order that ordinary programmers don't need to worry about them.

And what was that about typedef names in C?

In a C program, usually x * y will be interpreted by the compiler as multiplying two variables x and y. An expression can appear on its own as a statement: in this case writing

x * y;

as a statement all by itself is useless, but the rule is there to allow you to write a subroutine call as a statement, for example. So it's a fact that x * y; is a syntactically valid statement. But after the type definition

typedef int x;

what was syntactically a statement becomes a declaration: x * y; now declares y as having type pointer-to-integer. The upshot is that the parser must not treat all identifiers as interchangeable, but must rely on the lexer to classify each identifier as an ordinary identifier or as a typedef name that introduces a declaration. Because the set of typedef names can change, this requires an interaction between the parser and the lexer that is not envisaged in our simple data flow. The problem can be solved, but it's a bit messy; there's not really any reason to design a language to require this kind of messy trick in the compiler, and it's a symptom of the fact that C evolved over a lengthy period.


Why does this program in the language of Lab 2 loop forever?

repeat x := x+1 until true

Congratulations on winning the prize for Bug of the month!

Perhaps the word true appears in magenta in your editor, as if it were a language keyword. But that's just because the editor is assuming your program, stored in a file with extension .p, is written in Pascal. In our language, true is not a keyword but an ordinary variable name. In Lab 2's language, variables don't have to be declared, and they have an initial value of 0, which is interpreted as Boolean false. Thus your program loops until the variable true has a non-zero value, which it never will.

(Actually, true is not a reserved word in Pascal but a predefined identifier. That makes it legal for a Pascal program to contain the declaration const true = false if you want.)


Arising from lectures

How does a semantic analyser deal with programming languages that have generic types but ones that promise type-safety at compile time?

Sadly, there is no time in the bijou lecture courses we teach in Oxford to cover such things. The archetype for languages with generics is ML with its parametric polymorphism, or (if you want parametrized modules) Standard ML. Luca Cardelli's tutorial paper 'Basic Polymorphic Typechecking' (google it) is very helpful. – Mike

When implementing vtables, why do private methods need to be in the vtable? As far as I'm aware, they can't be overridden or called by subclasses and so it would be perfectly fine to just call them directly without looking them up.

Depending on the language, if you can prove that a method won't be overridden in a subclass, then calls to it can be compiled statically. That's the purpose of the 'final' label in Java and the way that methods are only optionally 'virtual' in C++. I guess that in Java private methods may as well be treated as final by the compiler, but I haven't thought about it. You could try compiling and disassembling some sample programs to find out what the Java compiler does; but I suspect it leaves the decision to the JIT, which knows more about the whole program than the compiler can. – Mike

Was the cost-vector method of instruction selection based on a technique from a book somewhere that I could look up? It bears some resemblance to the method presented in the dragon book, except that that method uses an array of length N and works out costs for all i from 0 to N based on using i of the N registers, whereas your cost-vector version seems to use costs based on registers, addresses and constants.

I don't have the book with me at the moment, but I believe you can see the same method used in Andrew Appel's book Modern Compiler Implementation. Actually, we covered both the Dragon Book algorithm for optimal evaluation of simple expressions with a limited number of registers, and also the tree grammar based instruction selection algorithm, where we assumed a plentiful supply of registers. Both use dynamic programming, but to solve different problems.


In your lectures, you say that LDGW _x is equivalent to CONST _x / LOADW, but the code for lab 3 seems to use GLOBAL _x / LOADW instead. What's going on?

I decided at some stage that it was simpler to have separate instructions, CONST for numeric constants and GLOBAL for pushing an absolute address, but I didn't maintain concentration for long enough to make the change consistently. There's no deep issue here, because both instructions have the effect of pushing a global address onto the stack. – Mike

Why do I get this error message when I try to build Keiko?

gcc -g -O2 -m32 -fno-strict-aliasing -Wall  -c gc0.c -o gc0.o
gc0.c:35:36: error: unknown type name ‘bool’
make: *** [gc0.o] Error 1

For a short time, there was an error in the source for gc0, with bool appearing in place of boolean. I've corrected that now in the Subversion database, so you can download a better version by giving the command svn up:

$ svn up
U keiko/Makefile
U keiko/gc0.c
$ cd keiko
$ make

And everything should be fine. You may see more files being updated than just Makefile and gc0.c, because I installed the latest version of Keiko, modified so that it is written in the common subset of C and C++.


Arising from lectures

Wouldn't a stack of hash tables be a reasonable implementation of a compiler's symbol table?

Yes, it would. Hash tables, though, are an inherently imperative data structure, because adding an element to a hash table changes the table, and the only feasible way round this is to copy the whole table first, losing whatever gains in efficiency were made by using a hash table in the first place. Having a stack of hash tables helps to make things a bit less imperative, because at least then there's a separate representation of each scope. But the operation of adding a definition remains one that works by side-effect.

I prefer to use an interface for the dictionary or symbol table (call it what you like) that is purely applicative, i.e., doesn't have operations that change a hidden state of an environment, because that allows the symbol table module to satisfy a simpler kind of specification, in which environments are more like values than like (updatable) objects.

Another confusing thing in typical compiler books is the conflation of a definition of a name with its existence in the symbol table. It helps immensely with clarity to say that each declaration creates a definition, and the semantic analyser annotates the tree with these definitions, and they continue to exist as annotations even after the semantic analysis is finished. If the compiler is written (as all modern compilers should be) in a language with dynamic storage allocation, there's no need to worry about allocating and managing storage for these definitions. (I almost wrote 'garbage collection' rather than 'dynamic allocation' there, but the truth is that many things in compilers work perfectly well without garbage collection: the compiler runs, it uses the storage it needs, and then all the storage is reclaimed at the end when compilation is over.)

This isn't a data structures course, so we don't need to discuss what would be the best data structure for representing these 'persistent' environments, except perhaps to satisfy ourselves that a reasonably efficient implementation exists. So it's enough to say 'use a binary search tree' and possibly 'use some kind of balanced tree'.

There's nothing wrong with using hash tables; it's just that doing so places a constraint on the design of the compiler, and makes us view semantic analysis more as a process and less as a calculation. That said, I'm hugely keen on using one big hash table (TM) to represent all identifiers by distinct integers inside the compiler, so that the only string comparisons are done in the lexer. That doesn't affect the structure of the rest of the compiler.
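For what it's worth, OCaml's Map module gives the applicative interface almost for free: Env.add returns a new environment and leaves the old one untouched, so entering and leaving scopes needs no explicit undo. The def record here is invented for the example:

```ocaml
(* a persistent environment, keyed by identifier *)
module Env = Map.Make (String)

(* a stand-in for whatever the semantic analyser records *)
type def = { d_kind : string; d_level : int }

let define x d env = Env.add x d env   (* returns a NEW environment *)
let lookup x env = Env.find x env
```

An inner scope that redefines a name shadows the outer definition without destroying it; the outer environment is still there, unchanged, when the inner scope is left.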

Is it true that in C you can't have more than two levels of pointers?

I can't find any trace of this rule. GCC with the -ansi flag happily compiles the program,

typedef int **********ptr;

and the grammar given in the back of K&R seems to allow it. (Though to be fair it allows many things that don't make any sense as C programs.)

On the other hand, recursive types are possible in C only via the use of structure tags. The magic spell I always use looks like

typedef struct node *ptr;

struct node {
     int data;
     ptr next;
};
and it's plain that C uses name equivalence for structures, inventing structure tags (like node here) if they are not written in the source code. So the answer to the name vs structural equivalence question is clear. – Mike

See also this StackOverflow answer – Callum Rogers


Should the program sumpow.p in Lab 4 print 1300 or 979? I'm confused!

So was I, and the version delivered with the lab materials is inconsistent with what I wrote in the lab manual: one prints 1300 = 1^5 + 2^5 + 3^5 + 4^5, and the other prints 979 = 1^4 + 2^4 + ... + 5^4. I'm about to correct this by changing the lab materials, and afterwards svn up will get you the corrected program; but whatever version you have, the correct answer is given in the comment embedded in the file sumpow.p, and that is the data used by make test.

According to the oracle,

1^4 + 2^4 + ... + n^4 = n(n+1)(2n+1)(3n^2+3n-1)/30
1^5 + 2^5 + ... + n^5 = n^2(n+1)^2(2n^2+2n-1)/12

So putting n = 5 in the first we get 5 * 6 * 11 * 89 / 30 = 979, and putting n = 4 in the second we get 16 * 25 * 39 / 12 = 1300.

If, despite the fact that the above error was corrected long ago, you're still getting 1300 as the answer, it's because you are pushing the parameters of sumpow in the wrong order.

I'm trying to build the Keiko bytecode interpreter for Windows using MinGW, and it doesn't know about the function posix_memalign that's used in gc0.c. Is it important, or would malloc do instead?

For the bytecode interpreter (no JIT), it doesn't matter. In the JIT, the function scratch_alloc is used to allocate pages of memory that will subsequently be made executable, and that requires the memory to be aligned on a page boundary.

In OBC itself, the semi-portable function posix_memalign isn't used; for unix machines, I use mmap to allocate memory, and on Windows I use VirtualAlloc. See the file gc.c in the OBC sources.


What's going on here? Defining the function g seems to change the type of r.

mike@staysail:~$ ocaml
        Objective Caml version 3.12.0

# let r = ref [];;
val r : '_a list ref = {contents = []}
# let g x = r := [x+1];;
val g : int -> unit = <fun>
# r;;
- : int list ref = {contents = []}

The type '_a is not a polymorphic type variable; it's an unknown type, and all uses of r must treat it consistently. Such types are necessary when type inference meets imperative constructs, otherwise it becomes possible to set a variable like r to a list of integers and then use it as if it held a list of booleans. Defining the function g imposes the constraint that r must hold lists of integers, and that's reflected in the type that's later displayed for r.


I like to use spaces in the names of directories, but when I do this, the compile script in the labs breaks: it refuses to run the linker. And even after I've fixed that problem, the compile script creates an a.out that won't run. Isn't this a serious bug?

It certainly is a show-stopper, especially for people who are used to My Documents, My Music (for mp3's downloaded from file-sharing sites), and My Pictures (don't ask), and perhaps want to add My Little Pony too. You can fix the problem by replacing the compile script with this one:


KEIKO=`cd ../keiko; pwd`

set -x

./ppc $1 >a.k \
    && "$KEIKO/pplink" -custom -nostdlib -i "/usr/bin/env $KEIKO/ppx" \
        "$KEIKO/lib.k" a.k -o a.out >/dev/null \
    && chmod +x a.out

I won't update the lab materials, because this script depends on an icky feature of #! lines on Linux – that the tail of the line is passed as a single argument – and I'd rather keep things clear and simple when I can. – Mike 18:42, 1 February 2011 (UTC)
