Type systems
Recall from last lecture that we had a big issue with our lambda calculus. We were able to construct programs with undefined behavior, i.e. ones that would evaluate to a stuck state, by having free variables in our expressions.
Moreover, we had no way to easily enforce higher-level constraints about our functions. For example, let’s say we had a function that would apply an argument twice to a function.
We could accidentally give this a function that only takes one argument^{1}, e.g.
Ideally, we could somehow restrict the allowable values of this argument to the set of functions with two arguments.
Invariants
The desired properties above are all examples of invariants, or program properties that should always hold true. Invariants are things like:
 In a division function $\textit{div}(m, n)$, $n$ should be a number and $n \neq 0$.
 In my ATM program, customers should not withdraw more money than they have in their account.
 In my TCP implementation, neither party should exchange data until the initial handshake is complete.
 In the driver for my mouse, the output coordinates for the mouse should always be within the bounds of my screen.
There are three main considerations in the design of invariants:
 Structure. What is the “language” of invariants? How can we write down a particular invariant?
 Inference. Which invariants can be inferred from the program, and which need to be provided by the programmer?
 Time of check. When, in the course of a program’s execution, is an invariant checked? Before the program is run?
For example, consider the humble assert
statement. This is usually a built-in function that takes as input a boolean expression from the host language, and raises an error if the expression evaluates to false. In Python:
def div(m, n):
assert(type(m) == int and type(n) == int)
assert(n != 0)
return m / n
For these kinds of asserts, the language of invariants is the same as the host language, i.e. a Python expression. This is quite powerful! You can do arbitrary computations while checking your invariants. These invariants are never inferred—you have to write the assert statements yourself^{2}. And lastly, assert statements are checked at runtime, when the interpreter reaches the assert. Nothing guarantees that an assert is checked before code relying on its invariant is executed (e.g. accidentally dividing and then asserting in the case above).
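To make the time-of-check pitfall concrete, here is a small hypothetical variant of div where the division runs before the assert, so the invariant is checked too late to do any good:

```python
def div_late(m, n):
    result = m / n   # runs first: raises ZeroDivisionError when n == 0...
    assert n != 0    # ...so this invariant check never gets a chance to fire
    return result

try:
    div_late(1, 0)
except ZeroDivisionError:
    print("crashed before the assert ran")
```

Nothing in the language forces the assert to precede the code that relies on it; ordering is entirely the programmer's responsibility.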
By contrast, now consider the traditional notion of a “type system” as you know it from today’s popular programming languages. For most type systems, the language of invariants is quite restricted—types specify that a variable is a “kind” of thing, e.g. n
is an int
, but cannot specify further that n != 0
. Most stone-age programming languages require the programmer to explicitly provide type annotations (e.g. int n = 0
), but modern languages increasingly use type inference to deduce types automatically (e.g. let n = 0
). Lastly, types can be checked either ahead of time (“statically”, e.g. C, Java) or during program execution (“dynamically”, e.g. Python, JavaScript)^{3}.
Key idea: type systems and runtime assertions derive from the same conceptual framework of enforcing invariants, just with different decisions on when and how to do the checks.
In this course, our focus is going to be on static analysis: what invariants can we describe, infer, and enforce before ever executing the program? And by enforcement, I mean iron law. We don’t want our type systems to waffle around with “well, you know, I bet this n
is going to be an integer, but I’m only like, 75% sure.” We expect Robocop type systems that tell us: I HAVE PROVED TO 100% MATHEMATICAL CERTAINTY THAT IN THE INFINITE METAVERSE OF BOUNDLESS POSSIBILITIES, THIS “n” IS ALWAYS AN INTEGER.
This is the core impetus behind most modern research in PL theory. Advances in refinement types, dependent types, generalized algebraic data types, module systems, effect systems, traits, concurrency models, and theorem provers have pushed the boundaries of static program analysis. Today, we can prove more complex invariants than ever before. While cutting-edge PL research is mostly beyond the scope of this course, you will be equipped with the necessary fundamentals to continue exploring this space.
Typed lambda calculus
To understand the formal concept of a type system, we’re going to extend our lambda calculus from last week (henceforth the “untyped” lambda calculus) with a notion of types (the “simply typed” lambda calculus). Here are the essentials of the language:
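In conventional BNF notation, one standard presentation of the syntax looks like this (with $\mathsf{int}$ as the base type and $\oplus$ ranging over the binary operators):

```latex
\begin{aligned}
\text{Types} \quad \tau &::= \mathsf{int} \mid \tau_1 \rightarrow \tau_2 \\
\text{Expressions} \quad e &::= n \mid e_1 \oplus e_2 \mid x \mid \lambda \, (x : \tau) . \; e \mid e_1 \; e_2
\end{aligned}
```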
First, we introduce a language of types, indicated by the variable tau ($\tau$). A type is either a number $\mathsf{int}$, or a function from an input type $\tau_1$ to an output type $\tau_2$. Then we extend our untyped lambda calculus with the same arithmetic language from the first lecture (numbers and binary operators)^{4}. We drop the hat syntax for simplicity. Usage of the language looks similar to before:
Indeed, our operational semantics are just the lambda calculus plus arithmetic. Zero change from before.
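To make this concrete, here is a minimal interpreter sketch in Python. The AST constructors and helper names are invented for illustration; it uses substitution-based evaluation and ignores capture-avoidance for brevity:

```python
from dataclasses import dataclass

@dataclass
class Num:
    n: int

@dataclass
class Binop:
    op: str        # "+", "-", or "*"
    left: object
    right: object

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    param: str
    ptype: object  # annotated argument type (unused at runtime)
    body: object

@dataclass
class App:
    fn: object
    arg: object

def subst(e, x, v):
    """Replace free occurrences of variable x in e with v."""
    if isinstance(e, Num):
        return e
    if isinstance(e, Var):
        return v if e.name == x else e
    if isinstance(e, Binop):
        return Binop(e.op, subst(e.left, x, v), subst(e.right, x, v))
    if isinstance(e, Lam):
        # stop if x is shadowed by the binder
        return e if e.param == x else Lam(e.param, e.ptype, subst(e.body, x, v))
    if isinstance(e, App):
        return App(subst(e.fn, x, v), subst(e.arg, x, v))

def eval_expr(e):
    if isinstance(e, (Num, Lam)):
        return e                          # numbers and functions are values
    if isinstance(e, Binop):
        l, r = eval_expr(e.left), eval_expr(e.right)
        ops = {"+": l.n + r.n, "-": l.n - r.n, "*": l.n * r.n}
        return Num(ops[e.op])
    if isinstance(e, App):
        f = eval_expr(e.fn)               # must be a Lam in a well-typed program
        a = eval_expr(e.arg)
        return eval_expr(subst(f.body, f.param, a))
    raise RuntimeError("stuck: " + repr(e))  # e.g. a free variable

# (fun (x : int) -> x + 1) 2  evaluates to  3
print(eval_expr(App(Lam("x", "int", Binop("+", Var("x"), Num(1))), Num(2))).n)
```

Note that a free variable hits the final `raise`: exactly the stuck state the type system is about to rule out.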
An interpreter for free
A brief aside: the main reason we’re using OCaml in this course (as opposed to, say, Haskell or Scala) is that it feels quite similar to the typed lambda calculus. In fact, if we change a few keywords, we can use OCaml to execute exactly the language described above. (See the OCaml setup guide to follow along). If we wanted to transcribe the two examples above:
$ ocaml
# (fun (x : int) -> x + 1) 2 ;;
- : int = 3
# (fun (f : int -> int) -> fun (x : int) -> f (x + 1)) (fun y -> y * 2) 5 ;;
- : int = 12
Of course, OCaml can do much more than this—it has strings, exceptions, if statements, modules, and so on. We’ll get there, all in due time. I point this out to show you that by learning the lambda calculus, you are actually learning the principles of real programming languages, not just highfalutin theory. When you go to assignment 3 and start on your first OCaml program, the language will feel more familiar than you may expect!
Type system goals
Before we dive into the type system, it’s worth asking the motivational question: what invariants of our language do we want to statically check? One way to answer this is by thinking of edge cases we want to avoid.
 Adding a number and a function: $1 + (\lambda \, (x : \mathsf{int}) . \; x)$
 Calling a function with the wrong type: $(\lambda \, (x : \mathsf{int}) . \; x + 1) \; (\lambda \, (y : \mathsf{int}) . \; y)$
 Incorrectly using a function argument: $(\lambda \, (x : \mathsf{int}) . \; x \; 1) \; 2$
This is an important exercise, since it gives us an intuition for where errors might arise. However, even if we had a method for completely eliminating the edge cases we thought of, how can we know we caught all the cases? What if we just didn’t think of a possible error?
Remember that all of these issues fundamentally boil down to stuck states, or undefined behavior. We specified our operational semantics over “well-defined” programs, but that doesn’t prevent us from writing invalid programs. As before, the goal is to take a program and step it to a value. This leads us to a safety goal: if a program is well-defined, it should never enter a stuck state at any step of its execution. If we can formally prove that this safety goal holds for our language, then that means there are no missing edge cases!
The goal of a type system, then, is to provide a definition of “well-defined” such that we can prove whether a given program is well-defined without executing it. Formally, we need a new judgment (binary relation) “$e$ has type $\tau$”, written as $e : \tau$. In the language above, it should be the case that $2 : \mathsf{int}$ and $(\lambda \, (x : \mathsf{int}) . \; x + 1) : \mathsf{int} \rightarrow \mathsf{int}$. To say an expression has a type is to say it is “well-defined” (or “well-typed”).
We can try and formally write type safety as the following theorem:
Type safety (strict): for all expressions $e$, if $e : \tau$, then there exists a value $v$ such that $e \mapsto^{*} v$ and $v : \tau$.
This definition is a little too strict because it mixes two notions: totality and stuck-state avoidance. This theorem is fine for total languages like arithmetic and the simply typed lambda calculus, but we would also like to reason about type safety in languages that involve infinite loops and recursion, like the ones we’ll see tomorrow. So instead of the theorem above, we decompose the intuition of “avoiding stuck states” into the following two theorems:
 Progress: if $e : \tau$ then either $e~\mathsf{val}$ or there exists $e'$ such that $e \mapsto e'$.
 Preservation: if $e : \tau$ and $e \mapsto e'$, then $e' : \tau$.
Intuitively, progress says: if an expression is well-typed and is not a value, then we should be able to step the expression (it is not in a stuck state). However, this isn’t enough to prove our safety goal, since we also need preservation: if an expression is well-typed, when it steps, its type is preserved. For example, if we have an expression of number type, it shouldn’t turn into a function after being stepped. These two theorems are sufficient to prove that a well-typed expression will never enter a stuck state at any point during its execution.
Static semantics
In this conceptual framework of type systems, the first thing we need to do is define how we determine the type of an expression. In our grammar, we defined a type language, but now we need a type semantics (or “static semantics”). First, we’ll define the judgments for numbers:
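In standard inference-rule notation, the two number judgments would look something like the following (a conventional rendering; the rule names match those used later in the proofs):

```latex
\frac{\;}{\Gamma \vdash n : \mathsf{int}}~\text{(TNum)}
\qquad
\frac{\Gamma \vdash e_1 : \mathsf{int} \qquad \Gamma \vdash e_2 : \mathsf{int}}
     {\Gamma \vdash e_1 \oplus e_2 : \mathsf{int}}~\text{(TBinop)}
```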
As you can see, these are defined quite similarly to how we defined our operational semantics (or “dynamic semantics”). Each rule defines a different way to determine whether a particular expression has a particular type. Just like DNum, the rule TNum is axiomatic—a numeric constant has type $\mathsf{int}$ under all conditions. TBinop says: if the two subexpressions are both numbers, then the binary operation on those subexpressions also has type $\mathsf{int}$. From these two rules, we can construct a proof that an expression like $1 + 2$ has type $\mathsf{int}$:
Ok, but what is this “$\Gamma$” business? Or “$\vdash$”? To typecheck expressions with variables, we need to introduce a “typing context” that maps variables to their types. Intuitively, when typechecking $\lambda \, (x : \tau) . \; e$, we want to remember that a usage of $x$ inside $e$ can assume $x : \tau$. Formally, we write this as:
The notation $\Gamma \vdash e : \tau$ means that “given the knowledge of $\Gamma$, it is provable that $e : \tau$.” The left hand side of the turnstile is a proof context (something we know), and the right hand side is a proof term (the thing we want to prove). $\Gamma$ specifically represents our type context. It is a mapping from variables to types. We can add new mappings to our context^{5}, indicated by $\Gamma, x : \tau$.
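The remaining three rules, again in a conventional rendering:

```latex
\frac{\Gamma(x) = \tau}{\Gamma \vdash x : \tau}~\text{(TVar)}
\qquad
\frac{\Gamma, x : \tau_1 \vdash e : \tau_2}
     {\Gamma \vdash \lambda \, (x : \tau_1) . \; e : \tau_1 \rightarrow \tau_2}~\text{(TLam)}
\qquad
\frac{\Gamma \vdash e_1 : \tau_1 \rightarrow \tau_2 \qquad \Gamma \vdash e_2 : \tau_1}
     {\Gamma \vdash e_1 \; e_2 : \tau_2}~\text{(TApp)}
```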
Let’s read through the rules again. TVar says that if our context says $x : \tau$, then $x$ has type $\tau$. TLam is the most complex: it says that to typecheck a function, we want to typecheck the body of the function assuming that $x : \tau_1$, where $\tau_1$ is the type provided in the program syntax. Then, assuming our body typechecks to another type $\tau_2$, this becomes the return type of the function, so its entire type is $\tau_1 \rightarrow \tau_2$.
Note a subtlety here: $\tau_1$ is given to us from the program, while we have to compute $\tau_2$. Our typed lambda calculus mixes types that are explicitly annotated and implicitly inferred.
Lastly, the TApp rule says: when calling a function, the function expression should have a function type $\tau_1 \rightarrow \tau_2$, and the argument expression should be of the appropriate argument type $\tau_1$. Then, the result of applying the function is the result of the function, type $\tau_2$.
As an example of these rules, here is the proof that the first example above is well-typed.
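The five typing rules translate almost directly into a recursive checker. Here is a minimal sketch in Python, using an invented tuple representation for expressions ("int" is the base type; function types are ("->", t1, t2) tuples):

```python
def typecheck(ctx, e):
    """ctx maps variable names to types; returns e's type or raises TypeError."""
    tag = e[0]
    if tag == "num":                 # TNum: numeric constants have type int
        return "int"
    if tag == "binop":               # TBinop: both operands must be int
        _, op, left, right = e
        if typecheck(ctx, left) == "int" and typecheck(ctx, right) == "int":
            return "int"
        raise TypeError("binop applied to a non-number")
    if tag == "var":                 # TVar: look the variable up in the context
        _, x = e
        if x in ctx:
            return ctx[x]
        raise TypeError(f"free variable {x}")
    if tag == "lam":                 # TLam: check the body under ctx extended with x : t1
        _, x, t1, body = e
        t2 = typecheck({**ctx, x: t1}, body)
        return ("->", t1, t2)
    if tag == "app":                 # TApp: function and argument types must line up
        _, fn, arg = e
        tf = typecheck(ctx, fn)
        if tf[0] == "->" and typecheck(ctx, arg) == tf[1]:
            return tf[2]
        raise TypeError("ill-typed application")
    raise TypeError("unknown expression")

# (fun (x : int) -> x + 1) 2 : the whole application has type int
prog = ("app",
        ("lam", "x", "int", ("binop", "+", ("var", "x"), ("num", 1))),
        ("num", 2))
print(typecheck({}, prog))
```

Calling `typecheck` with an empty dictionary as the context corresponds to the empty $\Gamma$ used for closed terms in the proofs below.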
Proving type safety
At this point, you should understand the mechanics of our type system: how we define our typing rules, and how they can be used to construct proofs about the types of expressions. But it’s not sufficient just to have a type system; we need a good type system! Remember, we want to demonstrate that if a program is well-typed, then it will never enter a stuck state. To do that, we have to prove the progress and preservation theorems.
Rule induction
Last time, we talked about structural induction, and how an induction principle could be derived from the structure of a grammar. For example, if numbers are defined as:
Then our induction principle becomes:
Another way to see the mapping from the grammar to the induction principle is to rewrite the grammar in terms of inference rules.
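For instance, if numbers were given by the unary grammar $n ::= \mathsf{zero} \mid \mathsf{succ}(n)$ (a representative choice standing in for last lecture’s grammar), the corresponding inference rules would be:

```latex
\frac{\;}{\mathsf{zero}~\textsf{num}}
\qquad
\frac{n~\textsf{num}}{\mathsf{succ}(n)~\textsf{num}}
```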
The BNF notation is essentially an alternative way of saying the same thing. We define a judgment that says whether something has the number syntax. From this notation, we want to derive an induction principle for universal properties of the form “for all $n$, $P(n)$”. To make such a principle, we can ask: what are all the ways that $n$ could have been derived? Then for each possible derivation, we have an induction hypothesis on the assumptions and a desired proof on the conclusion.
Here, because $n$ can be derived in two ways, each way contributes a case to the induction: a base case and an inductive case. This is the same as before, but rewritten so as to see how an inductive principle can be derived from inference rules as opposed to a grammar.
More generally, this idea is called rule induction: making inductive principles from inference rules. And we can apply the same idea to our typing judgment. If our theorem says “if $\Gamma \vdash e : \tau$ then $P(\Gamma, e, \tau)$”, we can prove it by showing it inductively true for all derivations of $\Gamma \vdash e : \tau$. Specifically:
Induction principle for static semantics: for all $\Gamma, e, \tau$ such that $\Gamma \vdash e : \tau$, $P(\Gamma, e, \tau)$ holds if:
 TNum: Show $P(\Gamma, n, \mathsf{int})$.
 TBinop: Assume $P(\Gamma, e_1, \mathsf{int})$ and $P(\Gamma, e_2, \mathsf{int})$. Show $P(\Gamma, e_1 \oplus e_2, \mathsf{int})$.
 TVar: Assume $\Gamma(x) = \tau$. Show $P(\Gamma, x, \tau)$.
 TLam: Assume $\Gamma, x : \tau_1 \vdash e : \tau_2$ and $P((\Gamma, x : \tau_1), e, \tau_2)$. Then show that $P(\Gamma, \lambda \, (x : \tau_1) . \; e, \tau_1 \rightarrow \tau_2)$.
 TApp: Assume $P(\Gamma, e_1, \tau_1 \rightarrow \tau_2)$ and $P(\Gamma, e_2, \tau_1)$. Show $P(\Gamma, e_1 \; e_2, \tau_2)$.
Preservation proof
First we will prove preservation by rule induction on the static semantics:
Preservation: if $\vdash e : \tau$ and $e \mapsto e'$ then $\vdash e' : \tau$.
Note that the progress and preservation theorems are defined with respect to “closed terms”, i.e. expressions which don’t need a type context at the top level to prove their type. Or put another way, expressions with no free variables. We indicate that by requiring the fact $\vdash e : \tau$ to be proved with an empty context $\Gamma$.
Proof. By rule induction on the static semantics.

TVar: if $\vdash x : \tau$ and $x \mapsto e'$ then $\vdash e' : \tau$.
This is vacuously true, since TVar requires $\Gamma(x) = \tau$, so the typing context cannot be empty. Our theorem says the typing context must be empty, so this condition is never met.

TNum: if $\vdash n : \mathsf{int}$ and $n \mapsto e'$ then $\vdash e' : \mathsf{int}$.
This is vacuously true, since there is no rule to step a number $n$.

TBinop: if $\vdash e_1 \oplus e_2 : \mathsf{int}$ and $e_1 \oplus e_2 \mapsto e'$ then $\vdash e' : \mathsf{int}$.
First, by the premises of the rule, we know $\vdash e_1 : \mathsf{int}$ and $\vdash e_2 : \mathsf{int}$.
Second, by the inductive hypothesis (IH), we get to assume preservation holds true for $e_1$ and $e_2$. For example, if $e_1 \mapsto e_1'$ then we know $\vdash e_1' : \mathsf{int}$ (and likewise for $e_2$).
Third, we case on the three ways in which a binary operation can step:

DBinop1: assume $e_1 \mapsto e_1'$, so $e_1 \oplus e_2 \mapsto e_1' \oplus e_2$. By the IH, $\vdash e_1' : \mathsf{int}$, so by TBinop we have that $\vdash e_1' \oplus e_2 : \mathsf{int}$.

DBinop2: assume $e_1~\mathsf{val}$ and $e_2 \mapsto e_2'$, so $e_1 \oplus e_2 \mapsto e_1 \oplus e_2'$. By the IH, $\vdash e_2' : \mathsf{int}$, so by TBinop we have that $\vdash e_1 \oplus e_2' : \mathsf{int}$.

DBinop3: assume $e_1 = n_1$ and $e_2 = n_2$ and $n = n_1 \oplus n_2$, so $e_1 \oplus e_2 \mapsto n$. By TNum we have that $\vdash n : \mathsf{int}$.
Hence, in every case, we have shown that $\vdash e' : \mathsf{int}$ for all possible $e'$, and the preservation theorem holds for TBinop.


TLam: if $\vdash \lambda \, (x : \tau_1) . \; e : \tau_1 \rightarrow \tau_2$ and $\lambda \, (x : \tau_1) . \; e \mapsto e'$ then $\vdash e' : \tau_1 \rightarrow \tau_2$.
This is vacuously true, since there is no rule to step a function value.

TApp: if $\vdash e_1 \; e_2 : \tau_2$ and $e_1 \; e_2 \mapsto e'$ then $\vdash e' : \tau_2$.
First, by the premises of TApp, we know $\vdash e_1 : \tau_1 \rightarrow \tau_2$ and $\vdash e_2 : \tau_1$.
Second, by the IH, we know that preservation holds for $e_1$ and $e_2$.
Third, we case on the two ways an application can step:

DAppStep: assume $e_1 \mapsto e_1'$, so $e_1 \; e_2 \mapsto e_1' \; e_2$. By the IH, $\vdash e_1' : \tau_1 \rightarrow \tau_2$, so by TApp, we know $\vdash e_1' \; e_2 : \tau_2$.

DAppSub: assume $e_1 = \lambda \, (x : \tau_1) . \; e$, so $e_1 \; e_2 \mapsto [x \to e_2] \, e$.
By the inversion of TLam, we know $x : \tau_1 \vdash e : \tau_2$.
By the substitution typing lemma^{6}, $\vdash [x \to e_2] \, e : \tau_2$.
Hence, preservation holds in either case.

Since preservation holds for all typing rules, then it holds for the entire language.
Progress proof
Finally, we will prove progress:
Progress: if $\vdash e : \tau$ then either $e~\mathsf{val}$ or there exists $e'$ such that $e \mapsto e'$.
Proof. By rule induction on the static semantics.

TVar: if $\vdash x : \tau$ then either $x~\mathsf{val}$ or there exists $e'$ such that $x \mapsto e'$.
This is vacuously true, since a variable cannot be given a type under an empty typing context $\Gamma$.

TNum: if $\vdash n : \mathsf{int}$ then either $n~\mathsf{val}$ or there exists $e'$ such that $n \mapsto e'$.
By DNum, $n~\mathsf{val}$.

TBinop: if $\vdash e_1 \oplus e_2 : \mathsf{int}$ then either $e_1 \oplus e_2~\mathsf{val}$ or there exists $e'$ such that $e_1 \oplus e_2 \mapsto e'$.
First, by the premises of the rule, we know $\vdash e_1 : \mathsf{int}$ and $\vdash e_2 : \mathsf{int}$.
Second, by the inductive hypothesis (IH), we get to assume progress holds true for $e_1$ and $e_2$. For example, either $e_1~\mathsf{val}$ or there exists $e_1'$ such that $e_1 \mapsto e_1'$.
Third, we case on the different possible states of $e_1$ and $e_2$ derived from the IH:

$e_1 \mapsto e_1'$: then by DBinopL, $e_1 \oplus e_2 \mapsto e_1' \oplus e_2$.

$e_1~\mathsf{val}$ and $e_2 \mapsto e_2'$: then by DBinopR, $e_1 \oplus e_2 \mapsto e_1 \oplus e_2'$.

$e_1~\mathsf{val}$ and $e_2~\mathsf{val}$: because $\vdash e_1 : \mathsf{int}$ and $\vdash e_2 : \mathsf{int}$, then by inversion on DNum we know $e_1 = n_1$ and $e_2 = n_2$. Therefore by DBinopOp, $n_1 \oplus n_2 \mapsto n$ for $n = n_1 \oplus n_2$.
In each case, the expression steps, so progress holds.


TLam: if $\vdash \lambda \, (x : \tau_1) . \; e : \tau_1 \rightarrow \tau_2$ then either $\lambda \, (x : \tau_1) . \; e~\mathsf{val}$ or there exists $e'$ such that $\lambda \, (x : \tau_1) . \; e \mapsto e'$.
By DLam, $\lambda \, (x : \tau_1) . \; e~\mathsf{val}$.

TApp: if $\vdash e_1 \; e_2 : \tau_2$ then either $e_1 \; e_2~\mathsf{val}$ or there exists $e'$ such that $e_1 \; e_2 \mapsto e'$.
First, by the premises of TApp, we know $\vdash e_1 : \tau_1 \rightarrow \tau_2$ and $\vdash e_2 : \tau_1$.
Second, by the IH, we know that progress holds for $e_1$ and $e_2$.
Third, we case on the different possible states of $e_1$ derived from the IH:

$e_1 \mapsto e_1'$: then by DAppStep, $e_1 \; e_2 \mapsto e_1' \; e_2$.

$e_1~\mathsf{val}$: then by inversion on DLam, $e_1 = \lambda \, (x : \tau_1) . \; e$. By DAppSub, $e_1 \; e_2 \mapsto [x \to e_2] \, e$.
In each case, the expression steps, so progress holds.

Since progress holds for all typing rules, then it holds for the entire language.

In the lambda calculus, all functions technically take one argument, so when I say “a function that takes one argument”, I mean as opposed to a function that returns another function. ↩

Except where built into the language, of course. In the
div
example, both of the asserted invariants (int types and nonzero) will be checked by the division operator in the language runtime. ↩ 
The options provided do not strictly form a dichotomy. “Gradual” or “hybrid” invariant enforcement that mixes static/dynamic checks is an active area of research, e.g. gradual typing. ↩

Why is the arithmetic necessary? Can’t we just keep our functions-only approach? Unfortunately, no. Imagine I gave you the identity function $\lambda \, x . \; x$ and asked you: what is the type of this function? At some point, you have to have a “base type”, since a type language of just $\tau ::= \tau_1 \rightarrow \tau_2$ is infinitely recursive, and you cannot construct an actual type. ↩

You can think about $\Gamma$ as a “purely functional” dictionary. Adding a new mapping like $\Gamma, x : \tau$ will overwrite a previous mapping for $x$. ↩

You can assume that if $x : \tau_1 \vdash e : \tau_2$ and $\vdash e' : \tau_1$, then $\vdash [x \to e'] \, e : \tau_2$. ↩