CS 242: Assignment 1: Lambda Theory

Due: Wednesday, October 3 at 4:00pm
Submission cutoff: Saturday, October 6 at 4:00pm

We discussed in class the basic mechanics of the lambda calculus, both how it works intuitively and how to formally specify its semantics (see the lecture notes). In this assignment, you will do a deep dive into its models of computation and scope.

Setup

For each assignment, the writeup will be hosted on the course web site (as with this page), and any material such as starter code or LaTeX templates will be provided in the course repository: https://github.com/stanford-cs242/f18-assignments

To get started, install Git if you haven’t already, then clone the repository:

$ git clone https://github.com/stanford-cs242/f18-assignments
$ cd f18-assignments

For all future assignments, you’ll just need to run git pull once the assignment is released. Each assignment will have an appropriately named directory—this one is assign1.

This assignment has both a written portion (parts 1 and 3) and a programming portion (part 2). For the written portion, you must create a PDF file containing your answers. In assign1/written we have provided you a Latex template which you are encouraged to use. If you are unfamiliar and do not wish to learn Latex, then you may also hand-write your assignment. Your handwriting must be legible, and the scan must be clear.

1: Be the machine (10%)

Following the lambda calculus small-step semantics, reduce (by hand) the following expressions until they meet one of these conditions:

The expression becomes a value.
The expression enters a stuck state: no rule applies to the expression.
The expression reaches a “fixpoint”: each subsequent step of the expression steps to the same value.

More formally, for each expression $e$ , repeatedly apply small steps until one of:

($ \val{e} $)
($ \neg(\val{e}) \wedge \not\exists e’ . \steps{e}{e’} $)
($ \steps{e}{e} $)

For each expression, provide its final form and clearly identify which condition is met.

($ \fun{x}{x} $)
($ \fun{x}{y} $)
($ x $)
($ \app{(\fun{x}{\fun{y}{\app{x}{y}}})}{(\fun{y}{\app{y}{y}})} $)
($ \app{(\fun{x}{\app{x}{x}})}{(\fun{x}{\app{x}{x}})} $)
($ \app{\app{(\fun{x}{\fun{x}{x}})}{a}}{b} $)
($ \app{(\fun{x}{\fun{y}{x}})}{z} $)

Put your answers in part 1 of the written portion.

2: Church encoding (55%)

Note: for any problem in this section, if you find yourself spending more than 30 minutes working on it, come to office hours with questions!

Part of understanding a computational model is knowing its expressiveness: what range of programs can be written in the model? For example, consider the computational model of arithmetic, or expressions of numbers with addition, subtraction, multiplication, and division, e.g. ($1 * 2 - 3 / 4$). In arithmetic, we know that every program must evaluate to a value, i.e. it can never loop. While most normal programming languages permit partial functions (functions that could return a result or loop forever), we can prove that arithmetic is a language of total functions (they always return a result).

Intuitively, this means that arithmetic is less expressive than more general programming languages. The set of programs expressible in arithmetic is a strict subset of those expressible in, say, Python or Java. What about the lambda calculus? It seems like such a simple language, but can we formally reason about its expressiveness? It turns out the lambda calculus is equivalent in expressiveness to a Turing machine—this result is called the Church-Turing thesis. Both of these computational models can express computable functions. Informally, a computable function is “a function on the natural numbers computable by a human being following an algorithm, ignoring resource limitations.”

In the context of the lambda calculus, this idea will seem strange. If we only have variables and functions, how are supposed to express something as simple as ($2+2$)? The key idea is that we have to provide a mapping from constructs in the lambda calculus to natural numbers, which allows us to interpret the result of a lambda calculus program as a number. This is called the Church encoding. In the remainder of this section, we’ll work through a few exercises to understand how this encoding works.

2.1: Numbers (20%)

First, we’ll look at how to interpret a lambda calculus program as a natural number. This construction derives from Peano numerals, which is essentially the “tally marks” approach to representing numbers. Here, ($Z$) is the number ($0$), and $S(n)$ is the number ($n + 1$). For example:

($Z = 0$)
($S(S(Z)) = 2$)
($S(S(S(S(S(Z))))) = 5$)

A Church numeral for ($n$) is a lambda calculus function that takes two arguments ($f$) and ($x$), and applies ($f$) to ($x$) a total of ($n$) times. For example:

($\fun{f}{\fun{x}{x}} = Z = 0$)
($\fun{f}{\fun{x}{\app{f}{(\app{f}{x})}}} = S(S(Z)) = 2$)
($\fun{f}{\fun{x}{\app{f}{(\app{f}{(\app{f}{(\app{f}{(\app{f}{x})})})})}}} = S(S(S(S(S(Z))))) = 5$)

Key idea: a function can be a number, and what’s in the function defines exactly what number. It’s all a matter of interpretation.

We can then define lambda calculus functions to emulate functions on numbers. For example, the successor operation adds one to the input number ($n$):

$S = \fun{n}{\fun{f}{\fun{x}{\app{f}{(\app{n}{\app{f}{x}})}}}}$

Why does this work? Let’s walk through an example. If we had ($S(2)$), this would look like:

$\begin{align*} &\app{(\fun{n}{\fun{f}{\fun{x}{\app{f}{(\app{n}{\app{f}{x}})}}}})}{(\fun{f}{\fun{x}{\app{f}{(\app{f}{x})}}})} \\ \mapsto & \fun{f}{\fun{x}{\app{f}{(\app{(\fun{f}{\fun{x}{\app{f}{(\app{f}{x})}}})}{\app{f}{x}})}}}\tag*{[Substitute the 2 for $n$]} \end{align*}$

If this expression was the church numeral for 3, then it would apply the function ($f$) to ($x$) three times. Is that the case? Even though it’s not technically part of the semantics, we can simplify inside parts of the number to answer this question.

$\begin{align*} &\fun{f}{\fun{x}{\app{f}{(\app{(\fun{f}{\fun{x}{\app{f}{(\app{f}{x})}}})}{\app{f}{x}})}}} \\ \mapsto &\fun{f}{\fun{x}{\app{f}{(\app{f}{(\app{f}{x})})}}} \tag*{[Substitute inner $f$ and $x$]} \end{align*}$

From this, we can see that the resultant function does apply ($f$) three times to ($x$), hence this is the church numeral for three, and our successor function appears to work.

Now let’s mechanize this. In the assign1/program directory, we have provided a binary lci_{platform}, or lambda calculus interpreter, that executes an lambda calculus program and interprets it as a church numeral. Try this out:

$ ./lci_macos
> fun f -> fun x -> x
0
> z = fun f -> fun x -> x
> s = fun n -> fun f -> fun x -> f (n f x)
> (s (s z))
2

Note: this is a slightly modified version of the lambda calculus with persistent identifiers.

You can also use the interpreter to execute a standalone file.

$ cat test.lam
z = fun f -> fun x -> x
s = fun n -> fun f -> fun x -> f (n f x)
(s (s z))

$ ./lci_macos test.lam
Expression on line 3 evaluated to 2

Your first task is to define a multiply function. As per the Church encoding Wiki page, the addition and multiplication functions are usually defined as follows:

$\begin{align*} \msf{plus} &= \fun{m}{\fun{n}{\fun{f}{\fun{x}{\app{\app{m}{f}}{(\app{\app{n}{f}}{x})}}}}} \\ \msf{mult} &= \fun{m}{\fun{n}{\fun{f}{\fun{x}{\app{\app{m}{(\app{n}{f})}}{x}}}}} \end{align*}$

Instead of this definition, your job is to provide an alternate implementation of ($\msf{mult}$) that only uses ($\{m, n, \msf{plus}, Z\}$), as well as parentheses. A few hints.

Fill in the template: ($\msf{mult} = \fun{m}{\fun{n}{\underline{\hspace{3cm}}}}$)
Remember that although ($m$) and ($n$) represent numbers, they are still functions that you can use however you like.
Use the interpreter to help you experiment!

Put your answer in the assign1/program/mult.lam file.

2.2: Booleans (15%)

To introduce conditional logic, i.e. ifs/thens/comparisons, we need to use the Church encoding for booleans. Here, the idea is that true and false are functions which select a particular branch of two possibilities.

$\begin{align*} \msf{true} &= \fun{a}{\fun{b}{a}} \\ \msf{false} &= \fun{a}{\fun{b}{b}} \end{align*}$

For example, $\app{\app{\msf{true}}{Z}}{(\app{S}{Z})} \mapsto Z$ . From this, we can define our essential conditional operators:

$\begin{align*} \msf{not} &= \fun{p}{\fun{a}{\fun{b}{\app{\app{p}{b}}{a}}}} \\ \msf{if} &= \fun{p}{\fun{a}{\fun{b}{\app{\app{p}{a}}{b}}}} \end{align*}$

Again, see the Wiki page for more explanation and examples. In this section, your challenge is to define a function ($\msf{even}$) that takes as input a number, and returns true if it’s even, and false otherwise. You can only use the input number and the ($\{\msf{true}, \msf{false}, \msf{not}, \msf{if}\}$) constructs (no recursion!). Some hints:

Fill in the template: ($\msf{even} = \fun{n}{\underline{\hspace{3cm}}}$)
We can inductively define the $\msf{even}$ function over the natural numbers as: ($\app{\msf{even}}{Z} = \msf{true} \mid \app{\msf{even}}{S(n)} = \app{\msf{not}}{(\app{\msf{even}}{n})}$)

Put your answer in the assign1/program/even.lam file.

2.3: Recursion (20%)

A last critical piece missing from our computational framework is the idea of recursion. In the basic formulation given, you can’t trivially recurse within a function. When declaring a function, you can’t get access to the identifier binding the function. For example, when defining a recursive sum in Javascript, we can usually write:

function sum(n) {
  if (n == 0) { return 0; }
  else { return sum(n - 1) + n; }
}

Where we are allowed to reference sum inside of its own definition. But in the lambda calculus, if we try this idea:

$\msf{sum} = \fun{n}{\app{\app{\app{\msf{if}}{(\app{\msf{iszero}}{n})}}{Z}}{(\app{\app{\msf{plus}}{n}}{(\app{\msf{sum}}{(\app{\msf{pred}}{n})})})}}$

Then this is an invalid expression, because $\msf{sum}$ isn’t bound in the function. Instead, we can use the amazing magics of the fixpoint combinator (or more famously, the “Y combinator”). The fixpoint combinator looks like this:

$\msf{fix} = \fun{f}{\app{(\fun{x}{\app{f}{(\app{x}{x})}})}{(\fun{x}{\app{f}{(\app{x}{x})}})}}$

The $\msf{fix}$ function has the curious property of a fixpoint (which you hopefully discovered in part 1 of this assignment), which is that $\app{\msf{fix}}{f} = \app{f}{(\app{\msf{fix}}{f})}$ for all $f$ . See an extended example of this on the Wiki page.

Intuitively, the way we can use the fixpoint combinator is to provide a function with a handle to itself–this is the idea of self-reference. For example, we can define an infinite sum:

$\begin{align*} &\app{\msf{fix}}{(\fun{n}{n + 1})} \\ \mapsto& \app{(\fun{x}{\app{(\fun{n}{n + 1})}{(\app{x}{x})}})}{(\fun{x}{\app{(\fun{n}{n + 1})}{(\app{x}{x})}})} \\ \mapsto& \app{(\fun{n}{n + 1})}{(\app{(\fun{x}{\app{(\fun{n}{n + 1})}{(\app{x}{x})}})}{(\fun{x}{\app{(\fun{n}{n + 1})}{(\app{x}{x})}})})} \\ =& \app{(\fun{n}{n + 1})}{(\app{\msf{fix}}{(\fun{n}{n + 1})})} \end{align*}$

More practically, we can define recursive functions that terminate using conditions like in the Javascript example above. Your challenge in this section is to use the fix combinator to create a recursive sum function that adds the numbers from 1 to $n$ . You can use any of the constructs defined in the above sections or on the linked Wikipedia pages (except don’t use the closed form!).

Put your answer in assign1/program/sum.lam.

3: Dynamic scoping (35%)

Credit to Stephan Boyer for the inspiration.

Lexical vs. dynamic scoping

Recall from class that scope is the central idea in programming languages of: given a variable, what binding does it refer to? For example, consider this program (in Javascript):

let x = 1;
function g(z) { return x + z; }
function f(y) {
  let x = y + 1;
  return g(y * x);
}
console.log(f(3));

When executing the inner g function, it has the following call stack:

Take a second to walk through the code and verify this diagram matches your expectations. When executing g, the name resolution question here is: in the expression x + z what does x refer to? There are two options: the x defined in the top-level scope, and the x defined in the function f.

The first option, picking x = 1, is called lexical scoping. The intuition behind lexical scoping is that the program structure defines a series of “scopes” that we can infer just by reading the program, without actually running the code.

// The {N, ...} syntax indicates which scopes are "active" a given
// point in the program.

// BEGIN SCOPE 1 -- {1}
let x = 1;
function g(z) {
    // BEGIN SCOPE 2 -- {1, 2}
    return x + z;
    // END SCOPE 2
}
function f(y) {
  // BEGIN SCOPE -- {1, 3}
  let x = y + 1;
  return g(y * x);
  // END SCOPE
}
console.log(f(3));
// END SCOPE 1

The nested structure of our program implies that at any line, we have a stack of scopes as annotated in the program above. To resolve a name, we walk the stack and search for the closest declaration of our desired variable. In the example, because scope 3 is not in the stack of scope 2, x + z resolves to the binding of x in scope 1, not scope 3.

Again, we call this “lexical” scoping because the scope stack is determined statically from the program source code (i.e. before the program is executed). The alternative approach, the one that resolves to x = 4, is called dynamic scoping. Recall our call stack above:

As you may have inferred at this point, dynamic scoping looks at the runtime call stack to find the most recent declaration of x and uses that value. Here, the binding x = 4 in f is more recent than the binding x = 1 in the top-level scope.

In both cases, dynamic or lexical, the core algorithm is the same: given a stack of scopes, walk up the stack to find the most recent name declaration. The only difference is whether the the scope stack is determined before or during the program’s execution.

Scoping in the lambda calculus

Let’s now contextualize these ideas in the lambda calculus. The classical presentation of the lambda calculus is always lexically scoped, i.e. a variable always refers to the closest lambda with the same argument variable in the syntax tree. However, it’s possible to tweak the evaluation rules and implement dynamic scoping in the lambda calculus. For example, consider the expression:

$\app{(\fun{x}{\app{\app{(\fun{x}{\fun{\_}{x}})}{L}}{*}})}{D}$

Conventions: as a shorthand, we’re using the underscore for a variable that doesn’t matter, and the asterisk for a value that doesn’t matter. They could be anything and the example does not change. $L$ and $D$ are both values.

Under normal lexical scoping rules, this evaluates to:

$\begin{align*} &\app{(\fun{x}{\app{\app{(\fun{x}{\fun{\_}{x}})}{L}}{*}})}{D} \\ \mapsto &\app{\app{(\fun{x}{\fun{\_}{x}})}{L}}{*}\tag*{[Inner $x$ shadows outer $x$]} \\ \mapsto &\app{(\fun{\_}{L})}{*}\tag*{[Substitute $L$ for innermost $x$]} \\ \mapsto &L\tag*{[Throw away dummy value]} \end{align*}$

However, under a dynamic scoping system, the exact same example would evaluate to a different answer. Here’s the trace under such a scoping system:

$\begin{align*} &\app{(\fun{x}{\app{\app{(\fun{x}{\fun{\_}{x}})}{L}}{*}})}{D} \\ \mapsto& \app{(\fun{x}{\app{(\fun{\_}{x})}{*}})}{D} \tag*{[Step 1]}\\ \mapsto& \app{(\fun{x}{\app{(\fun{\_}{D})}{*}})}{D} \tag*{[Step 2]}\\ \mapsto& \app{(\fun{x}{D})}{D} \tag*{[Step 3]}\\ \mapsto& D\tag*{[Step 4]} \end{align*}$

This execution trace should look strange to you. We’re evaluating the inside of a function without substituting the variable! Let’s look at the operational semantics to better understand this idea.

$$ \ir{D-Lam}{ \ }{\val{\fun{x}{e}}} \s \ir{D-Var} {x \rightarrow e \in \ctx} {\dynJC{\steps{x}{e}}} \s \ir{D-App$_1$} {\dynJC{\steps{e_1}{e_1'}}} {\dynJC{\steps{\app{e_1}{e_2}}{\app{e_1'}{e_2}}}} \nl \ir{D-App$_2$} {\dynJ{\ctx, x \rightarrow e_2}{\steps{e_1}{e_1'}}} {\dynJC{\steps{\app{(\fun{x}{e_1})}{e_2}}{\app{(\fun{x}{e_1'})}{e_2}}}} \s \ir{D-App$_3$} {\val{e_1}} {\dynJC{\steps{\app{(\fun{x}{e_1})}{e_2}}{e_1}}} $$

These rules introduce a new concept in our PL metalanguage, the idea of a proof context. Here, $\ctx$ (uppercase gamma) is a variable representing our proof context, and $\dynJC{\steps{e}{e'}}$ means “given this proof context $\ctx$ , it is provable that $\steps{e}{e'}$ .” You can think about a proof context as a set of facts potentially necessary to prove a particular transition should occur.

Here, the proof context is a set of variable mappings, similar to our runtime stack in the Javascript example. The syntax $\ctx, x \rightarrow e_2$ means “add $x \rightarrow e_2$ to the context $\ctx$ ”. Like a normal dictionary, a new mapping can overwrite a previous one, for example:

$\begin{align*} \{y \rightarrow e_1\}, x \rightarrow e_2 &= \{y \rightarrow e_1, x \rightarrow e_2\}\\ \{x \rightarrow e_1\}, x \rightarrow e_2 &= \{x \rightarrow e_2\} \end{align*}$

The crux of the construction is in the D-App-2 rule. To evaluate a function, we step the body $e_1$ while placing $x \rightarrow e_2$ into the proof context. This allows any expression within $e_1$ to use $x$ , even if it wasn’t in its lexical scope. Leaving $e_2$ in the syntax while we step the function body is essentially using the syntax to maintain a runtime stack, which we reconstruct in our proof context on every small step.

For another clarifying example, here’s a valid expression:

$\app{(\fun{f}{\app{(\fun{x}{\app{f}{\_}})}{D}})}{(\fun{\_}{x})}$

Even though the $x$ is not in the lexical scope of the function $f$ , this will still evaluate to $D$ because $x$ is bound while $f$ is being executed.

Stop. Re-read the above two sections and ensure that you understand what dynamic scoping means, and why these semantics implement it. Most of your time in this part of the assignment will likely be spent understanding the background, not performing the task.

3.1: Evaluation proof (20%)

Using the above semantics, we can produce a formal proof to justify each step in the provided example trace. Below, we have provided you with the first step in the proof—your task is to fill in the proof for the remaining three steps.

$$ \text{Step 1:}\qquad \ir{D-App$_2$} {\ir{D-App$_1$} {\ir{D-App$_3$} {\ir{D-Lam}{ \ }{\val{\fun{\_}{x}}}} {\dynJ{\{x \rightarrow D\}}{\steps {\app{(\fun{x}{\fun{\_}{x}})}{L}} {\fun{\_}{x}}}}} {\dynJ{\{x \rightarrow D\}}{\steps {\app{\app{(\fun{x}{\fun{\_}{x}})}{L}}{*}} {\app{(\fun{\_}{x})}{*}}}}} {\dynJ{\varnothing}{\steps {\app{(\fun{x}{\app{\app{(\fun{x}{\fun{\_}{x}})}{L}}{*}})}{D}} {\app{(\fun{x}{\app{(\fun{\_}{x})}{*}})}{D}}}} \nl $$

Put your answer in part 3 of the written solution.

3.2: New semantics (15%)

Your final challenge is to extend the dynamically scoped lambda calculus with a new construct, the let statement. Let statements are like functions in that they bind variables for use in an expression. For example:

$(\letv{x}{1}{x + x}) \mapsto 1 + 1 \mapsto 2$

We could rewrite the initial example as:

$\letv{x}{D}{\app{(\letv{x}{L}{\fun{\_}{x}})}{*}} \evalto D$

Note: the double right arrow $\evalto$ means “evaluates to”, i.e. step any number of times until you reach a value as opposed to stepping once.

More generally, we extend our grammar with the following construct:

$\begin{alignat*}{2} e ::= \qamp \ldots \\ \qamp \letv{x}{e_1}{e_2} \end{alignat*}$

Provide the operational semantics for the let statement that preserves dynamic scoping. Your answer should be two inference rules. Put your answer in part 3 of the written solution.

Submission

To submit your work, we have two Gradescope assignments for the written and programming sections. For the written section, upload your assign1.pdf file. For the programming section, run ./make_submission.sh and upload the newly created assign1.zip file.

Sections