# Frequently asked questions (Programming Languages)

*Some questions from previous years have been moved to the FAQ archive.*

## General

**Q: Can I ask any kind of question here?**

A: Yes you can. Questions might be about the way the course is organised, or about the material we cover. I'll probably remove organisational questions as the answers become clear, or as they are dealt with elsewhere --Mike

**Q: Is Functional Programming really necessary as a prerequisite for this course?**

A: Yes, I'm afraid so. We will be using a number of things from the start of the course that are only covered towards the end of a beginning course on Functional Programming. There's no way to avoid this.

**Q: Why are monads in Haskell so hard to understand?**

A: Perhaps because three things come together: (i) monads as a way of organizing and structuring a program in a way that makes algebraic sense; (ii) monads as an interface to parts of Haskell that are implemented in a non-pure way; (iii) the *do* notation, which allows monadic code to look (superficially) as if it is written imperatively. --Mike

**Q: Is this really a "FP made difficult" course, or could we say it's "Haskell made difficult" instead?**

A: We will use next to none of the more obscure features of Haskell, because one of our goals is to explain the meaning of different features that we can add to the object language *Fun*, and the more features we assume in the meta-language *Haskell*, the less we can claim to have given a good explanation.

So the main things that are needed are:

- Algebraic data types,
- Recursive definitions,
- Higher order functions.

And those things are present in almost every functional programming language. We'll be using higher-order functions in a particular pattern called *monads*, but we won't be using the monad-related *features* of Haskell, and monads will in any case be fully explained, with plenty of examples. --Mike
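
The three ingredients can be seen together in one tiny program. This is my own illustrative example, not code from the course: an algebraic data type, a recursive definition, and a higher-order function.

```haskell
-- An algebraic data type for binary trees
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- A recursive, higher-order fold over trees
foldTree :: b -> (b -> a -> b -> b) -> Tree a -> b
foldTree leaf node Leaf         = leaf
foldTree leaf node (Node l x r) =
  node (foldTree leaf node l) x (foldTree leaf node r)

main :: IO ()
main = print (foldTree 0 (\l x r -> l + x + r)
                (Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)))
-- prints 6
```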

## Compiling the interpreters

**Q: How can I modify the Fun interpreter to print the abstract syntax of each expression?**

A: Find the equation that reads

```haskell
obey (Calculate exp) env = (print_value (eval exp env), env)
```

and change it to

```haskell
obey (Calculate exp) env = (show exp ++ "\n" ++ print_value (eval exp env), env)
```

--Mike

**Q: What are the flags -fpedantic-bottoms and -fno-state-hack that are used with recent versions of the interpreters?**

A: The interpreters create primitives by wrapping a lambda expression with a function that catches any failure of pattern matching and prints an appropriate error message: for example, the addition operator might be bound to the function

```haskell
primwrap "+" (\ [NumVal a, NumVal b] -> NumVal (a+b))
```

If we write `3+4` as an expression, then the lambda expression dutifully returns the number `7`, and all is well; but if we try `true + false`, then the lambda expression raises a pattern matching exception, which the wrapper `primwrap` catches and turns into an error message that says "bad arguments to primitive `+`". This mechanism is useful because it allows us to give a one-line definition to each primitive of the language, and avoids needing a separate error case for each primitive. To work, however, it depends crucially on the timing of the pattern matching exception, which must be raised at a time when `primwrap` can catch it, and not later when `primwrap` has returned. Sadly, the Glasgow Haskell Compiler contains program transformations that can affect this timing, especially in monadic interpreters where the monad is a function type or involves I/O, and the primitive has a monadic effect rather than being a pure primitive like `+`. In particular, where a `case` expression has a function type, the compiler will delay the pattern matching until all the arguments are supplied; this preserves the functional meaning of the program but alters the timing of exceptions. The compiler flag `-fpedantic-bottoms` (cute name!) disables this transformation, and the flag `-fno-state-hack` disables a similar transformation that affects the *IO* monad.
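
The wrapping idea can be sketched in plain Haskell using `Control.Exception`. This is my own sketch, not the course interpreter's actual code: `evaluate` forces the primitive's application now, so the pattern-matching failure is raised where we can still catch it (exactly the timing that the transformations disabled by the flags above could otherwise disturb).

```haskell
import Control.Exception

data Value = NumVal Integer | BoolVal Bool deriving Show

-- Force the primitive's result so a pattern-match failure happens here,
-- where it can be caught, and is turned into a friendly error message.
primwrap :: String -> ([Value] -> Value) -> [Value] -> IO Value
primwrap name prim args = evaluate (prim args) `catch` handler
  where
    handler :: PatternMatchFail -> IO Value
    handler _ = ioError (userError ("bad arguments to primitive " ++ name))

plus :: [Value] -> Value
plus [NumVal a, NumVal b] = NumVal (a+b)

main :: IO ()
main = do
  v <- primwrap "+" plus [NumVal 3, NumVal 4]
  print v                                          -- NumVal 7
  r <- try (primwrap "+" plus [BoolVal True, BoolVal False])
  case r :: Either IOException Value of
    Left err -> putStrLn ("caught: " ++ show err)
    Right v' -> print v'
```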

## Clarifications

**Q: Is the triangle function actually secretly the bind function?**

A: Yes, what I write as ▹ is the thing that's written `>>=` in Haskell. Except that I spell it `$>` in source code for reasons that will become obvious. --Mike

**Q: Your definitional interpreter for FunMem uses two explicit mappings: an environment that maps identifiers to values, with one kind of value being a location; and a memory that maps locations to storable values. Isn't it possible to combine these into a single mapping?**

A: In some restricted circumstances, yes it is. But doing so in a language that combines nested scopes with assignable variables requires some kind of dirty trick, as I explain elsewhere. – Mike

## New questions

**Q: In the library, the type `(→) a`, which is *a* → ?, is an instance of the Functor typeclass. I suppose that ? → *b* is also a Functor. If so, how can we implement it in Haskell?**

[I noticed this question long after it had been asked. I'm sorry about that, but will answer it anyway].

*a* → ? is a functor because if *x* is any type, then *a* → *x* is a type; what's more, if *f* :: *x* → *y*, then we can define *lift* *f* :: (*a* → *x*) → (*a* → *y*) by *lift* *f* *h* = *f* . *h*. This definition satisfies the laws *lift* *id* = *id* and *lift* (*g* . *f*) = *lift* *g* . *lift* *f*, as it should.

On the other hand, ? → *b* is a *contravariant* functor. For any type *x*, we can form the type *x* → *b*, but if *f* :: *x* → *y*, what we can define is *colift* *f* :: (*y* → *b*) → (*x* → *b*), by writing *colift* *f* *h* = *h* . *f*. This satisfies the properties *colift* *id* = *id* and *colift* (*g* . *f*) = *colift* *f* . *colift* *g*.
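
Here is how both directions might look in Haskell (my own sketch). The covariant direction is the built-in `Functor` instance for `((->) a)`, whose `fmap` is composition; the contravariant direction needs a `newtype`, because Haskell's `Functor` class covers only covariant functors.

```haskell
-- Functions into a fixed result type b, wrapped so we can write colift
newtype Op b x = Op { runOp :: x -> b }

-- colift f :: (y -> b) -> (x -> b), implemented by precomposition
colift :: (x -> y) -> Op b y -> Op b x
colift f (Op h) = Op (h . f)

main :: IO ()
main = do
  print (fmap (+1) (*2) 5)                             -- lift: ((+1) . (*2)) 5 = 11
  print (runOp (colift show (Op length)) (42 :: Int))  -- length (show 42) = 2
```

The standard library offers the same idea as `contramap` in `Data.Functor.Contravariant` (in `base` since 4.12).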

**Q: Are the lecture notes a draft of a new functional programming book? Will you publish it? Thanks.**

Yes, all being well. Finding time to tidy up all the details is very difficult.

**Q: How does the parser work? Will it match as long as possible? When I put the Loop expression after While it won't work; however, if I put it before While and If it works fine. I saw there is a More in the Result type of `Parsing.hs`; is that for looking ahead indefinitely, which would mean it is an LL(n) parser?**

A: Some of the details of the *Parsing* module are explained in Appendix B of the lecture notes. I'm not sure I understand what you're asking when you want to know if the parser will 'match as long as possible'. The parser matches inputs that conform to the grammar it's been given, and in the parser for `Fun`, things are arranged so that the body of an `if` or `while` must be enclosed in parentheses if it consists of more than one statement. What's not clear to me is whether you're asking about where you should insert the `loop` construct into your parser, or where `loop` constructs can appear in the input language. If you're trying to complete Lab 3, then perhaps the lab demonstrator will be able to help you.

The *More* *k* result from parsers is there so that the parser can request more input, so as to deal with the case where a complete phrase did not appear on one line of input, and the parser would like the top level to request another line and feed it to the parser. The value *k* is a continuation that is called when more input is available. This has nothing to do with the class of grammars that can be handled by the parser combinators, namely the set that can be recognised top down in a deterministic way. This includes the LL(1) grammars.

**Q: I've installed the GHC compiler and I'm trying to compile FunLab2 from the 2nd practical, but it keeps saying "Parsing.hs:17:8: Could not find module System.Posix". How do you get around that?**

A: I'm guessing that you're using a Windows machine, and you've downloaded the file `labs.tar.gz` or checked out the labs from the Subversion repository. Both these places contain the Unix version of `Parsing.hs`; what you need is the version of that file that's contained in the archive `labs.zip` on the laboratory exercises page. All the other files are the same in all versions, so it's enough to add a copy of the Windows version of `Parsing.hs` and recompile.

**Q: Why did you say that the pipeline for computing primes is not the sieve of Eratosthenes?**

A: See this fascinating article: http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

**Q: Your language FunPipe was implemented using continuations. But a process in that language is just a kind of stream function that maps an input stream to an output stream. Couldn't you forget the continuations and implement the language using lazy lists instead?**

For the purely functional variant of the language, yes – at least, provided you're prepared to ignore the unimportant detail of what value a process or pipeline yields if it terminates, and you compromise on the ability to accept input from the terminal. You can replace the monad type used in my interpreter with

```haskell
type M a = [Value] -> ([Value], [Value], a)
```

with the idea that if (*inp*_{1}, *out*, *x*) = *xm* *inp*, then *inp* represents the stream of values that process *xm* can read, *inp*_{1} is returned as the unconsumed part of the input, *out* is the output produced by *xm*, and *x* is the value it yields. All the operations we used in the lecture can be defined on this type, with the operation *pipe* *xm* *ym* connecting the output produced by *xm* to the input accepted by *ym*. The prime sieve program works nicely.
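
A runnable sketch of this monad (my own code, not the course interpreter; `Value` is taken to be `Integer`, and the operation names `result`, `bind`, `pipe`, `input`, `output` are my choices):

```haskell
type Value = Integer

-- A process consumes a prefix of its input and yields
-- (unconsumed input, output produced, result value).
type M a = [Value] -> ([Value], [Value], a)

result :: a -> M a
result x inp = (inp, [], x)

bind :: M a -> (a -> M b) -> M b
bind xm f inp =
  let (inp1, out1, x) = xm inp
      (inp2, out2, y) = f x inp1
  in  (inp2, out1 ++ out2, y)

-- pipe connects the output of xm to the input of ym
pipe :: M a -> M b -> M b
pipe xm ym inp =
  let (inp', out, _) = xm inp
      (_, out', y)   = ym out
  in  (inp', out', y)

input :: M Value            -- read one value from the input stream
input (v:inp) = (inp, [], v)

output :: Value -> M ()     -- emit one value on the output stream
output v inp = (inp, [v], ())

main :: IO ()
main =
  let source = output 1 `bind` \_ -> output 2 `bind` \_ -> output 3
      doubler :: M ()
      doubler = input `bind` \v -> output (2*v) `bind` \_ -> doubler
      (_, out, _) = pipe source doubler []
  in  print (take 3 out)    -- laziness lets the endless doubler stop here
```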

What you can do with continuations but not with streams is describe a language that combines these concurrent processes with, say, assignable variables. You can use assignment to detect the time at which various actions occur, and when implementing `pipe xm ym` in a monad like the one above, it is very hard to arrange that the actions of `xm` are interleaved with those of `ym`, or to incorporate memory in a way that does not destroy the laziness. With continuations, all this happens naturally: we just add a memory state as an argument of each continuation, and thread a single memory through the entire computation.

**Q: How are call by reference and call by name different, and what are legal parameters that could be passed to functions implemented in each of those ways? For example, if we have a function G(x) and x is passed by name, are we allowed to call G(y+z) for some variables y and z? That doesn't seem to make sense for call by reference.**

A: Parameters passed by reference make sense in a language where references are denotable but not (necessarily) expressible, like *Fungol*. There we can require that actual parameters are variables (or more generally, expressions with an l-value), and call the procedure by making the formal parameters denote the same references as the actual parameters. This form of parameter passing is commonly provided in languages in the Pascal family. Your example `G(y+z)` would not be allowed if the parameter is passed by reference.

Parameters passed by name may be any expression; the expression is not evaluated at the time of the call, but each time the parameter value is needed. This mechanism is common today in lazy pure functional languages, where purity allows the optimisation to parameters passed by need: the value is calculated the first time it is needed, then saved and reused if it is needed again. Lab 2 concerns a language that is not purely functional, but has explicit dereferencing, where references are expressible. In that language, we could write the call `G(!y+!z)` to ask that the contents of `y` and `z` are evaluated afresh each time the formal parameter `x` occurs in the body of `G`.

In the lab, the expression

```
sum(i, 0, 10, !i * !i)
```

is used; with a suitable definition of `sum`, this varies `i` from 0 to 9, summing the values taken by the expression `!i * !i` as it goes. This trick was known as Jensen's device in Algol 60. Because in our language references are values, the first occurrence of `i` passes a reference to `i`, while the two occurrences of `!i` denote the contents of that reference at the time the parameter is used in the function body. These parameters are, of course, evaluated in the environment of the function call, not the environment where the formal parameter occurs in the function body.
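
Jensen's device can be imitated in Haskell with a mutable reference and an action that is re-run for the by-name argument. This is my own sketch, not the lab's code, and the name `sumBy` is hypothetical:

```haskell
import Data.IORef

-- The "by name" parameter is modelled as an IO action that is
-- re-evaluated each time its value is needed.
sumBy :: IORef Int -> Int -> Int -> IO Int -> IO Int
sumBy i lo hi body = go lo 0
  where
    go k acc
      | k >= hi   = return acc
      | otherwise = do
          writeIORef i k          -- vary the reference i
          v <- body               -- re-run the by-name argument
          go (k + 1) (acc + v)

main :: IO ()
main = do
  i <- newIORef 0
  -- analogous to sum(i, 0, 10, !i * !i): the body dereferences i afresh
  s <- sumBy i 0 10 ((\v -> v * v) <$> readIORef i)
  print s   -- 0 + 1 + 4 + ... + 81 = 285
```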

[FORTRAN provided (only) a variant of parameters passed by reference where the actual parameter could be an arbitrary expression; if it had no l-value, then a location would be allocated to hold the value, and the formal parameter would denote that location in the body of the procedure. Algol 60 had both parameters passed by value and a variant of parameters passed by name such that if the actual parameter had an l-value, then the formal parameter would denote the same location (recalculated on each use); that gave the benefits of parameters passed by reference.

It's not clear to me that the designers of Algol 60 had a complete appreciation of what their parameter mechanism implied when they defined it.]

**Q: Why did we require that Rec definitions define only functions in the original Fun and FunMem?**

Recursive definitions of anything but functions commit us to some kind of laziness. For *Fun*, that's not necessarily a problem, but for *FunMem* it is: we don't want to deal with expressions like

```
let val g = new() in let rec x = g := x; 3 in !g
```

that ought to have the value 3, but won't do so in any sensible implementation. The same holds when we make a monadic interpreter, unless we assume that the monad itself supports a suitable fixpoint operator *fix* :: (α → *M* α) → *M* α.

**Q: The join operation in Exercise 2.10 can make some effects of the monad disappear. For example, in the list monad, join [[1,2,3],[4,5]] = [1,2,3,4,5], which means we lose the lengths of the sublists; and in the Maybe monad, join Nothing = join (Just Nothing) = Nothing, which also loses the information about how the computation fails.**

**(1) Are there some situations in which the monad will not lose any information, such as the identity Monad? (2) Is the State monad the same as the Memory monad? What is the join operation for it?**

If *join* :: *M* (*M* α) → *M* α is an injection, then it seems likely that *M* α and α are in one-one correspondence, so we are in the identity monad or something very much like it. Yes, the monads of Memory and State are very similar, except that Haskell's State monad is parametrised by the type of states, and we take the state space always to be memories *Mem*. The join operation is

```haskell
join :: (Mem -> (Mem -> (a, Mem), Mem)) -> Mem -> (a, Mem)
join xmm mem = let (xm, mem') = xmm mem in xm mem'
```

I don't really know how to describe that in helpful words.
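
Perhaps a concrete run helps. In this sketch of mine, the memory is just a single `Int` cell rather than the interpreter's *Mem* type: the outer computation transforms the memory and yields an inner computation, and *join* runs that inner computation in the memory the outer one left behind.

```haskell
type Mem = Int
type M a = Mem -> (a, Mem)

join :: M (M a) -> M a
join xmm mem = let (xm, mem') = xmm mem in xm mem'

main :: IO ()
main =
  -- outer: increment memory, yield an inner computation that doubles
  -- memory and returns its old value
  let xmm = \mem -> (\m -> (m, 2 * m), mem + 1)
  in  print (join xmm 5)   -- outer leaves mem' = 6; inner yields (6, 12)
```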

Note that 'losing information' here means ignoring the way an outcome was produced, and concentrating on the outcome itself, viewing programs extensionally in a sense that allows us to assert meaningful identities between programs.

**Q: In the lecture, you said that 3(square) is a syntactically valid Fun program, though one that has no value. What if we wanted to rule it out as a valid program, and say that it was a syntax error?**

A: You could (in principle) easily change the parser so that in a function application `f(x)`, the `f` had to be an identifier; and you could do this without changing the abstract syntax. But I understand you want to allow any expression except a number to appear as `f`, and you want this restriction reflected in the abstract syntax.

You can do this by introducing two mutually dependent types *Expr* and *Expr*_{2}, something like this:

```haskell
data Expr  = Number Integer | Nonnum Expr2
data Expr2 = Variable Ident | Apply Expr2 [Expr] | IfExpr Expr Expr Expr | ...
```

(Note that the first argument of *Apply* must be an *Expr*_{2}.) That's OK, but it will lead to a lot of clutter, and really it's best from a practical point of view to accept a more liberal abstract syntax.

Neither of these moves removes the need to deal during evaluation with the problem that a function call may refer to a non-function: consider the program

```
let val three = 3 in three(square)
```

In a typed version of Fun, both this and the original expression would contain a type error; in our untyped interpreter, both lead to the error "applying a non-function".
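
A minimal, self-contained version of the two-level syntax (my own sketch: `Ident` is just `String`, and only a few constructors are included):

```haskell
type Ident = String

data Expr  = Number Integer | Nonnum Expr2        deriving Show
data Expr2 = Variable Ident | Apply Expr2 [Expr]  deriving Show

main :: IO ()
main = print (Nonnum (Apply (Variable "square") [Number 3]))
-- square(3) is representable, but 3(square) no longer is:
-- Apply requires an Expr2, and Number 3 is not an Expr2
```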

**Q: Instead of writing**

```haskell
xm $> f = (\ mem -> let (x, mem') = xm mem in f x mem')
```

**wouldn't it be better to define** *xm* ▹ *f* = *uncurry* *f* . *xm* **?**

A: Perhaps so, because then the general law *uncurry* (λ *x* → *p* . *q* *x*) = *p* . *uncurry* *q* would allow us to calculate

```haskell
(xm $> f) $> g
  = uncurry g . uncurry f . xm
  = uncurry (\ x -> uncurry g . f x) . xm
  = xm $> (\ x -> f x $> g)
```

And what's more, *result* = *curry* *id*, so

```haskell
xm $> result = uncurry (curry id) . xm = id . xm = xm
```

and even

```haskell
result x $> f = uncurry f . curry id x = f x
```

Neat!
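
The calculated laws can also be checked by running the definition. A quick sanity check of mine, taking the state to be `Int` and adding a small `tick` operation for illustration:

```haskell
type M s a = s -> (a, s)

-- the proposed point-free definition of bind
($>) :: M s a -> (a -> M s b) -> M s b
xm $> f = uncurry f . xm

result :: a -> M s a
result = curry id

tick :: M Int Int       -- yield the current state and increment it
tick s = (s, s + 1)

main :: IO ()
main = do
  print ((tick $> \x -> tick $> \y -> result (x + y)) 0)  -- (1,2)
  print ((result 7 $> \x -> result (x * 2)) 0)            -- left unit: (14,0)
```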