So far in this course, we have largely avoided questions of control flow: how do you know which part of the program to execute next? Partially this is because an expression-oriented computation model doesn’t need a notion of instructions and sequencing—there’s only the current expression to reduce. This is also because we simply haven’t had time to discuss many exciting control flow features like exceptions and continuations, only if-expressions. Today, we will look at two kinds of control flow common to imperative languages: loops and function call stacks.

Branches

Recall from the static semantics discussion last time that control flow naturally has difficult interactions with a type system. For example, the lambda calculus if-expression type rule:
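Roughly, in standard notation:

$$ \frac{\Gamma \vdash e_1 : \msf{bool} \qquad \Gamma \vdash e_2 : \tau \qquad \Gamma \vdash e_3 : \tau}{\Gamma \vdash \msf{if}~e_1~\msf{then}~e_2~\msf{else}~e_3 : \tau} $$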

This rule must conservatively dictate that both the “then” and “else” branches always return a value of the same type. Because we cannot know which branch will be taken, downstream code that relies on the value of the if-expression must work in either case.

Similarly for WASM, if we allow arbitrary jumps to different points of the program like normal assembly, there is no way to ensure that the value stack is properly formatted for an operation. Consider an imaginary (br i) operation that jumps i instructions forward or back. We could write:

(i32.const 1)
(i32.const 2)
(i32.add)
(br -1)

This program would compute the add, put 3 on the value stack, then attempt to compute add again. This time, the stack has only one value on it, not two, which puts the program into a stuck state.

Examples

Conceptually, the core issue with statically analyzing control flow is that you need to be able to guarantee that every instruction will be executed in an appropriate context. Control flow constructs like if, while, and even GOTO (with a fixed destination) are all potentially type-checkable because we can build a graph of control flow at compile-time. We can immediately rule out data-dependent (or “indirect”) jumps, because you can’t statically know where the jump will go (and that’s how you get ROP attacks).

Even for static constructs like the (br -1) in the previous example, we have to be careful to avoid jumping into an incorrect context. In WebAssembly, we will solve this issue by carefully designing a language and typing discipline of branch points that can be statically verified. Before diving into the formalities, I’ll give you an intuition for how this works. Consider a toy C program that infinitely increments a variable.

void inf_incr() {
  int x = 0;
  while (true) {
    x += 1;
  }
}

We can write this program equivalently in the following WASM code:

(func $inf_incr (local $x i32)
  ;; x = 0
  (i32.const 0)
  (set_local $x)

  (loop $incr_loop
    ;; x += 1
    (get_local $x)
    (i32.const 1)
    (i32.add)
    (set_local $x)

    (br $incr_loop)))

Here, the loop construct defines a new label (or branch point) $incr_loop, which we can unconditionally branch to with the br instruction. The label $incr_loop is at the top of the loop, so jumping to it restarts the loop body. Unlike normal assembly languages, which are completely sequential, the loop construct is parameterized by a sub-list of instructions. This is because the loop defines a label scope, i.e. a range within which the $incr_loop label can be accessed. You can’t, for example, br $incr_loop from outside the loop.
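For instance, this (hypothetical) fragment is rejected, since the label’s scope ends with the loop:

(loop $incr_loop)   ;; empty loop body
(br $incr_loop)     ;; invalid: $incr_loop is not in scope outside the loop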

The $inf_incr function also introduces function-local variables. For now, just know that get_local reads a variable and set_local assigns to it. We’ll give a formal semantics for locals later in the lecture.

Extending the example, let’s say we want to break out of the loop once x == 10. In C, one way to implement this is with a break:

void inf_incr() {
  int x = 0;
  while (true) {
    x += 1;
    if (x == 10) { break; }
  }
}

In WASM, we write the break as:

(func $inf_incr (local $x i32)
  ;; x = 0
  (i32.const 0)
  (set_local $x)

  (block $incr_loop_break
    (loop $incr_loop
      ;; x += 1
      (get_local $x)
      (i32.const 1)
      (i32.add)
      (set_local $x)

      ;; if (x == 10) { break; }
      (get_local $x)
      (i32.const 10)
      (i32.eq)
      (br_if $incr_loop_break)

      (br $incr_loop))))

Here, we use two distinct labels: one to define an exit point ($incr_loop_break), and one to define a loop point ($incr_loop). Then we conditionally branch to the exit point using br_if if x is 10. The intuition behind block vs. loop is that a block defines a label at the end of the block, meaning br $incr_loop_break jumps out of the loop to the end of the block. By contrast, loop defines a label at the beginning of its body, so br $incr_loop goes to the first instruction within the (loop) construct.

One subtlety with WASM labels is the interaction with the value stack. Within the scope of a given block or loop, instructions are executed in the context of a completely fresh value stack that is thrown away at the end of the scope. The only way a block/loop can affect the program is through stores to locals or memory. For example, this program is invalid:

(i32.const 0)
(i32.const 1)
(block $b
  (i32.add))

This is invalid because the 0 and 1 are only on the value stack outside the block, not inside it, so the i32.add has nothing to consume. By contrast, this is a valid program:

(i32.const 0)       ;; outer stack: [0]
(block $b
  (i32.const 2)     ;; inner stack: [2]
  (set_local $x))   ;; $x = 2 (assumes $x is a declared i32 local)
(get_local $x)      ;; outer stack: [0, 2]
(i32.add)           ;; outer stack: [2]

This is a simplification of the real WASM semantics, but is close to how actual WASM implementations work today. As of 2019, most browser WASM engines permit blocks or loops to return a single value.
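For example, in full WebAssembly (beyond the simplified rule above), a block can declare a result type and leave a single value behind for the enclosing stack:

(block (result i32)
  (i32.const 1)
  (i32.const 2)
  (i32.add))   ;; the block itself leaves a single i32 (3) on the outer stack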

Nameless WASM

Before we formalize these ideas, we need to briefly revisit the idea of de Bruijn indices. In practice, writing WASM with names for locals and branches is good software engineering. But it complicates our formalism, so for now we’re going to pervasively adopt a nameless representation of locals and branches. Here are two equivalent functions that demonstrate the concept:

(func $foo1 (param $x i32) (param $y i32) (local $i i32)
  (block $loop_end
    (loop $loop_start
      (get_local $x)
      (get_local $y)
      (i32.add)
      (set_local $i)

      (get_local $i)
      (br_if $loop_end)

      (br $loop_start))))

(func $foo2 (param i32) (param i32) (local i32)
  (block
    (loop
      (get_local 0)
      (get_local 1)
      (i32.add)
      (set_local 2)

      (get_local 2)
      (br_if 1)

      (br 0))))

The second is actually valid WebAssembly, because once WASM is compiled out of human-readable form, all names are replaced by de Bruijn indices.

In this snippet, the reference to set_local $i becomes set_local 2 because the function parameters and local variables are numbered left-to-right starting at 0. The br_if $loop_end becomes br_if 1 because there is one label ($loop_start) between the use of $loop_end and the block that declares it.
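To make the counting concrete, here is a small nested example (the local $c is an assumed i32 local used as a condition):

(block             ;; from the branches below: label index 1
  (block           ;; label index 0
    (get_local $c)
    (br_if 0)      ;; index 0: branch to the end of the inner block
    (br 1)))       ;; index 1: branch to the end of the outer block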

Static semantics

Now to formalize branching control flow, we’re going to introduce the following instructions into our WASM grammar:

$$ \begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \wblock{\star{e}} \qqamp \text{label at end} \\ \mid \qamp \wloop{\star{e}} \qqamp \text{label at start} \\ \mid \qamp \wbr{i} \qqamp \text{branch} \\ \mid \qamp \wbrif{i} \qqamp \text{conditional branch} \\ \end{alignat*} $$

Here, i quantifies a number just like n; it is just used notationally to indicate the presence of an index instead of an arbitrary number.

Remember that our type system defines an instruction in terms of its effects on the value stack: inputs and outputs. Because blocks and loops define new local value stacks, they have no effect on their containing context. Moreover, we expect that the instructions in the block and loop start with no values on the stack, and end with no values on the stack. Formally, we write these conditions as:
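In sketch form (writing ε for the empty stack type):

$$ \frac{\star{e} : \epsilon \rightarrow \epsilon}{\wblock{\star{e}} : \epsilon \rightarrow \epsilon} \qquad \frac{\star{e} : \epsilon \rightarrow \epsilon}{\wloop{\star{e}} : \epsilon \rightarrow \epsilon} $$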

Note: we will extend these rules below to interact with branches correctly. This isn’t their final form!

Next, we have to define types for branches. In the lambda calculus, an if expression had a value corresponding to either its then or else branch. In WASM, as in imperative languages, a branch simply changes the instruction pointer; it doesn’t correspond to an actual value.

Rather than define the static semantics for branches right now, let’s defer this discussion until we better understand their runtime behavior. Then we can revisit the typing question.

Dynamic semantics

The main concept of loop and block is defining a label, which can be branched to with a br or br_if. The difference is what happens after you branch—block exits nested scope, and loop enters the top of the nested scope. To represent this idea at runtime, we will introduce a label instruction. In WASM terminology, label is an “administrative” instruction, meaning one that is only used by the language runtime and not intended to be written by the user.

$$ \begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \wlabel{\star{e_{\msf{cont}}}}{(\star{n};~\star{e_{\msf{body}}})} \qqamp \text{branch point} \end{alignat*} $$

A label defines a local stack, a value stack only accessible to code executing inside the scope of the label. It comes with two sequences of instructions: the body, which is the current set of instructions being executed, and the “continuation” of the label, which is the sequence of instructions that will be run if the label is jumped to. Our dynamic semantic rules for block and loop simply translate them into this common label form:
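Roughly, ignoring the rest of the configuration (and writing ε for an empty sequence), the two translations have the shape:

$$ \wblock{\star{e}} ~\leadsto~ \wlabel{\epsilon}{(\epsilon;~\star{e})} \qquad \wloop{\star{e}} ~\leadsto~ \wlabel{\wloop{\star{e}}}{(\epsilon;~\star{e})} $$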

A block is a label with an empty continuation, meaning after branching to a block, nothing else happens. A loop is a label with a loop continuation, meaning after branching to a loop, the loop starts over. (The loop semantics should look suspiciously close to a fixpoint definition!)

A label’s semantics are to execute its body in the context of its local stack until completion. If a branch occurs to the label, then the label should enter the continuation.

The D-Label-Step rule is our first example of a nested step in WASM. The intuition behind the rule is that to represent the nested evaluation of a label, we take our current configuration and construct a new one where the instructions and stack are replaced by the ones inside the label. Once this inner configuration steps, we propagate the changes back to the label. If the label changes mutable state, e.g. memory, we preserve those changes as well. The “with” notation is the same as OCaml’s record update syntax, meaning “the original record, except with the following fields changed”.

In the simplest case, when a label reaches the end of its nested instructions, the label is thrown away as in D-Label-Done. Let’s see a short example of these instructions in action (without branches):

I abbreviate the rule names because the browser LaTeX renderer goes wild with longer equation line tags. Sorry for the inconvenience.

Finally, we return to branches. A branch transfers control to the target label and then executes the label’s continuation. Recall that branches are parameterized by the de Bruijn index of the label. If we are branching to the current (innermost) label, the rule is:

If the current instruction in the current label is a branch with index 0, then throw away the label and replace it with the continuation. If the index is not 0, then we want to continue propagating the branch up the label stack.

Finally, we have a conditional branch. Not having booleans in the language, we use 0 to represent false and any nonzero number to represent true.

In BrIf-True, we have to explicitly remove all the remaining instructions in the current sequence. They would never get executed anyway due to how br works, but this step ensures that the subsequent configuration can still type-check (i.e. maintain preservation).
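For example, in a fragment like this (with $x an assumed i32 local), the instructions after the br_if are exactly the ones BrIf-True removes when the branch fires:

(block
  (i32.const 1)     ;; nonzero condition: the branch will fire
  (br_if 0)
  (i32.const 42)    ;; removed by BrIf-True, never executed
  (set_local $x))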

Typing branches

Now that we’ve established the dynamic semantics of branches, let’s return to the question of static semantics. How can we type-check these instructions? First, an unconditional branch br. This instruction effectively throws away whatever is currently on the value stack, and jumps to a label continuation, which always expects as input. Hence, we can write its type as:

One consequence of this rule is that it prevents the sequencing rule from being applied around a branch. For example:

(block
  (i32.const 2)
  (i32.const 3)
  (br 0)
  (i32.add))

This does not type check, since br 0 consumes the 2 and 3 on the value stack. By contrast, let’s consider conditional branching with br_if. Extending the example:

(block
  (i32.const 2)
  (i32.const 3)
  (i32.const 0)
  (br_if 0)
  (i32.add))

This example does type-check, because if the conditional branch fails, then the remaining instruction i32.add will still be executed. So our br_if rule looks more like a normal operator that only removes a single number (the boolean) from the stack:

One question remains. What happens if we branch to a label that doesn’t exist? For example, a naked br 0 without a surrounding block or loop has no defined semantics. This is like using an undefined variable. To avoid this case, we finally make use of our proof context to track how many labels are in scope, and then check that a branch is valid. Formally, we represent the proof context as:

And then update all previous rules to use this context:

The loop and block rules increment the label counter when type-checking the nested instruction sequence, and the branch rules check that the requested label is in scope.
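For instance, this fragment fails to type-check, because the branch index exceeds the number of labels in scope:

(block
  (br 1))   ;; invalid: only one label (index 0) is in scope here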

Functions and locals

At last, we can complete our formalization of WASM by adding functions, local variables, and modules. At the top-level, a complete WASM program is a module containing a series of functions. For example:

(module
  ;; return x/y if y != 0, return 0 otherwise
  (func $div (param $x i32) (param $y i32) (result i32) (local $z i32)
    (block $earlyreturn
      (get_local $y)
      (i32.const 0)
      (i32.ne)             ;; y != 0
      (br_if $earlyreturn) ;; if y != 0, skip the early return
      (i32.const 0)
      (return))            ;; otherwise early return 0

    (get_local $x)
    (get_local $y)
    (i32.div_s)              ;; return x / y
    )

  (func $main
    (i32.const 10)

    (i32.const 2)
    (i32.const 3)
    (call $div)   ;; div(x=2, y=3)

    (i32.store)   ;; mem[10] = div(x=2, y=3)
    ))

Functions have zero or more parameters and zero or one outputs, as indicated by the param and result indicators. A function internally has zero or more local variables, which are named mutable slots that live for the duration of the function call. A function implicitly returns the value at the top of the stack if it exits normally. The return instruction causes an early return from the function using the value at the top of the stack.
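As a small illustration of the two styles (hypothetical function names), both of these return 42:

(func $implicit_ret (result i32)
  (i32.const 42))   ;; falls off the end with 42 on the stack

(func $explicit_ret (result i32)
  (i32.const 42)
  (return))         ;; explicitly returns the 42 at the top of the stack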

We can add each concept to our grammar as follows:

$$ \begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \wgetlocal{i} \qqamp \text{get frame local} \\ \mid \qamp \wsetlocal{i} \qqamp \text{set frame local} \\ \mid \qamp \wcall{i} \qqamp \text{function call} \\ \mid \qamp \wreturn \qqamp \text{function return} \\[1.2em] \msf{Function}~f ::= \qamp \wfunc{\star{\tau_\msf{param}}}{\star{\tau_\msf{local}}}{\tau_\msf{ret}}{\star{e}} \\[1.2em] \msf{Module}~m ::= \qamp \wmodule{\star{f}} \end{alignat*} $$

Like for labels and branches, both local variables and functions are referenced by a number rather than by name. Variables are numbered left to right starting at 0. A function definition contains the types of parameters and local variables along with the function body. A module is then a sequence of functions.

We make the simplifying (and functional!) assumption that a function always has a return value/type.

Static semantics

Now that our program consists of a module of functions (rather than a single instruction sequence), we need to restructure our typing discipline. How can we talk about the type of a function or module? How can we safely reference functions and locals inside an instruction sequence (e.g. get_local or call)?

At last, we will make use of the proof context. We need two pieces of information: what are the type signatures of all the functions in the module, and what is the type signature of the current function? We’ll write this as:

$$ \begin{alignat*}{2} \msf{Context}~\ctx ::= \qamp \{\wclab{:}~n;~\msf{funcs}{:}~\star{f};~\msf{curfunc}{:}~f^?\} \end{alignat*} $$

The ? superscript indicates that curfunc is an option type, meaning a current function is not guaranteed to be present in an arbitrary context. In the rules below, if we reference curfunc unconditionally, there is an implicit requirement that the context contains a corresponding current function.

To construct a proof context, we need typing rules for modules and functions. Because modules are not being used as first-class objects (and we’re not dealing with export/import semantics), we don’t need to define an explicit type for them, just validate that all their components are well-typed. We write this as a validity judgment:

These rules are wordy, but the basic point is: a module is valid if all its functions are valid. When checking a module, we build the initial context containing all the function definitions, then check each function individually. To check a function, we check that the function’s body type-checks starting with no arguments and ultimately producing the declared return type. When type-checking the body, we set curfunc to the function being checked.

With the context in place, we can now define rules for locals, call, and return. Locals are fairly simple:

Calling get_local i takes no parameters (because i is embedded in the instruction, not on the value stack) and returns the type of the local, which we look up in the current function context. Conversely, setting a local variable takes an argument of the local’s type off the stack, and returns nothing.

Call and return are more interesting:

If we call a function, then the call instruction has a type equal to the function’s type signature. For example, for the $div function above, call $div has type i32, i32 -> i32, meaning the arguments are removed from the value stack, and the return value is pushed onto the stack.

Unlike br, the return instruction has a return type: the result type of the current function. Whereas blocks and loops all had an empty type (no inputs or outputs), a function with an early return still needs to prove that its body produces the declared return type, for example:

(func $foo (result i32)
  (i32.const 0)
  (return))

If return produced nothing (as br does), then it would be impossible to prove that the body of foo produces an i32.

Dynamic semantics

To implement the dynamic semantics for functions, we need to add two pieces of runtime state: function locals, as well as a reference to the containing module so that function calls can look up their targets. Our new configuration becomes:

$$ \begin{align*} \msf{Config}~C ::= \qamp \wconfig{m}{\star{n_{\msf{mem}}}}{\star{n_{\msf{locals}}}}{\star{n_{\msf{stack}}}}{\star{e}} \end{align*} $$

At runtime, we have a module (set of functions), a memory (sequence of numbers), some locals (also a sequence of numbers), a value stack, and lastly the current instruction sequence. Our semantics all revolve around the question: what happens when you call a function? How can we encode a call stack of functions in our semantics?

In the same way we introduced a label construct to represent a stack of labels, we are also going to introduce an administrative instruction for representing a stack of functions.

$$ \begin{alignat*}{2} \msf{Expression}~e ::= \qamp \ldots \\ \mid \qamp \wframe{i}{C} \qqamp \text{Function call frame} \end{alignat*} $$

A function frame represents a function currently being executed along with its local state. Each frame contains a complete configuration describing how to execute the function, as well as the index of the corresponding function. The call instruction creates a new function frame:

In English, this rule says: when we want to call the function at index i, look up the corresponding function definition in the module. Then create a new configuration with the same module and memory, but with a fresh set of locals: the parameters taken from the stack, and 0-values for the local variables. Start with an empty value stack in this new configuration, and the instruction sequence taken from the function definition. Lastly, place this configuration into a call frame instruction.

As with labels, we then define rules to recursively execute instructions within a function frame:

These rules say, respectively: when we’re executing a function, if the function body still has work to do (it can step), then step its configuration. Once the function has no more work to do, i.e. its instruction sequence is empty, take the return value at the top of the inner configuration’s stack and put it on the stack of the outer configuration. Also copy over any changes to memory (but not locals!).
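To see the difference, consider this hypothetical module (following the lecture’s convention of an implicit memory): the callee’s write to its local disappears with its frame, while its write to memory survives the call.

(module
  (func $scribble (local $t i32)
    (i32.const 99)
    (set_local $t)    ;; lives only in this frame's locals
    (i32.const 0)
    (i32.const 7)
    (i32.store))      ;; mem[0] = 7 persists after the frame is popped

  (func $main (result i32)
    (call $scribble)
    (i32.const 0)
    (i32.load)))      ;; reads back 7; $scribble's local is gone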

Lastly, we have semantics for early returns:

Return grabs the number at the top of the stack and propagates it across any labels until it reaches the closest function frame. Once the return reaches a frame, it removes all remaining instructions, ensuring that D-Frame-Done applies in turn.

Progress and preservation

Now that our WASM subset is completely defined, we return to the question: at this point, what does type safety look like? Recall from last lecture that we defined progress and preservation with respect to a judgment on configurations, meaning a configuration produces values of particular types on the runtime stack.

  • Progress (old): if C type-checks at some stack type, then either C has no instructions left to execute or there is a C' such that C steps to C'.
  • Preservation (old): if C type-checks at some stack type and C steps to C', then C' type-checks at the same stack type.

There are two issues introduced in this lecture that conflict with these definitions:

  1. This time around, we’re executing programs in the context of a module, and a type-check on a configuration will need access to the module to e.g. type-check any call instructions.
  2. We added administrative instructions which, although not part of the user’s program, will at some point appear in a configuration’s instruction stream, so those must be type-checkable.

To solve the first problem, we will reformulate our P&P theorems in the context of a module. Formally:

Let m be a module such that m is valid, and let the context contain the functions of m. Then:

  • Progress (new): if C type-checks at some stack type under this context, then either C has no instructions left to execute or there is a C' such that C steps to C'.
  • Preservation (new): if C type-checks at some stack type under this context and C steps to C', then C' type-checks at the same stack type under this context.

Unlike the lambda calculus, we do not start with a closed term (i.e. the P&P theorems do not require an empty context) because the module always defines an execution context. We reuse the same T-Config rule to permit type-checks on configurations:

To solve the second problem, we add typing judgments for the label and frame administrative instructions.

These rules are a little tricky, because they mutually recursively invoke the T-Config rule in their premises. The basic issue is that label and frame are the only constructs that fundamentally change the structure of the runtime, e.g. by changing the value stack or the locals. Hence we have to use our T-Config judgment to check that the instructions nested in a label/frame are valid in the presence of the corresponding value stacks.

T-Label says: a label has no arguments or returns. Its body should type check under the current label value stack, and its continuation should also type check given an empty stack. Both the body and the continuation should return nothing.

T-Frame says: a frame for a call to function i returns a value of that function’s return type. We check the inner configuration assuming that the current function is i, so that any check of an instruction like get_local will look up a variable of the appropriate type.

Ideally at this point, we would walk through a complete proof of progress and preservation for the entire language. Just like the authors of the original WebAssembly paper, I haven’t had time to put it together. And just like the original paper, it’s possible I did something totally wrong! In fact, the mechanized proof of P&P identified multiple points of incorrect or undefined behavior in the original paper. If you can identify a violation of progress and preservation in the WASM subset I’ve created, I’ll give you +5 points on your next assignment.

Difference from specification

At this point, we’ve covered enough semantics to implement all of the core of WebAssembly. That’s no small feat! However, in the interest of time and to focus on concepts over details, the semantics presented both exclude some features and present others in a different way than the standard specification. Features we excluded:

In terms of alternate presentation of the semantics, ours differs in two main ways.

  • The value stack. The WebAssembly spec actually just uses the instruction sequence as the value stack. While this is maximally simple (fewer runtime structures needed), I think it’s a little confusing, and it’s simpler from a pedagogical standpoint to have an explicit value stack.
  • Labels. The spec uses the formalism of a block context to express the reduction of branches. This reduces the total ink in the spec, but I think it is less clear than incremental branch reductions (breaking one label at a time). The semantics presented in this lecture were more inspired by the official WebAssembly interpreter (written in OCaml!).