Rust is memory safe and data race free. These are wonderful in themselves, but it turns out Rust’s unique combination of type system properties1 lets us guarantee even more about our software. In particular, by leveraging these properties, we can design sofware abstractions that enable functional correctness guarantees, and in many cases, be assured of these guarantees at compile-time.

Pushed to the limit, this idea would mean that if your program compiles, it is formally2 guaranteed to be correct: it does everything you expect, as you expect it, and nothing you don’t. While there exist type systems that actually deliver on this ideal3, Rust’s isn’t one of them. This is a good thing because, generally speaking, these type systems tend to be too complex to be seen as useful by most practitioners. By contrast, Rust has seen incredible traction4, and there are no signs of it slowing down. Because of this practicality, the ability to guarantee functional correctness properties is especially attractive. It thus behooves us to discover mechanisms and methods by which to accomplish this.

I’ve spent a large part of my last four years of research attempting to discover these mechanisms and methods, and I’d like to share with you what I’ve learned. I’ll do this by introducing three systems that I’ve built or am building towards these goals: Oxide, a programming language for writing formally correct smart contracts with static termination guarantees, Rocket, a web framework that prevents correctness and web security bugs at compile-time, and Osmium, a new operating system that automatically verifies the functional correctness of its device drivers.

You’ll find many commonalities between these systems. They’re all written in and heavily leverage Rust and its compiler. They all make use of Rust’s ownership, traits, and compiler extensions. And, most importantly, they all make writing correct, secure software the norm.

Overview

  • Oxide

    Provable termination and statically determinable upper-bound computation costs. Using types as proof witnesses.

  • Rocket

    Compile-time web security. Path directory traversal and access control violation prevention. Type category division using traits, enforcement using code generation, and witnesses through abstraction.

  • Osmium

    Automatic verification of device drivers. Perfect encoding of state machines: states as types. Type division using traits, witnesses through ownership and abstraction, enforcement and verification using code generation.

Oxide

Oxide is an in-development programming language for writing formally correct smart contracts. It is an embedded language: it merely extends the Rust type system, but otherwise, is Rust. Its primary goals are:

  • To defend against common pitfalls in smart-contract programming through good language design.
  • To enable formal verification of smart contracts by compiling to Coq.
  • To allow for the static computation of upper-bound computation costs.

For the purposes of this lecture, we’ll only focus on the latter-most goal, and then only briefly. The purpose of introducing Oxide is to demonstrate that types can serve as proofs, the idea being that the presence of a given type (otherwise known as a witness) directly allows formal statements about the surrounding program. In Oxide, we’ll enjoin that certain values have certain types. By ensuring that they do, we can then prove that the program terminates, and that it does so after a known bounded amount of computation.

Computation Cost

Given an arbitrary program, can you compute a reasonable5 upper-bound on the actual cost (say, in instructions or cycles) to run the program, independent of its inputs? In general, the answer is no, thanks in part to the halting problem. More generally, for any Turing-complete program (think backward branches and comparisons), it is unequivocally impossible to precompute the cost to execute the program.

As a brief thought experiment, what can you say about the time it takes to run the following function?

fn last(list: &LinkedList<usize>) -> Option<usize> {
    let mut last = None;
    for &val in list {
        last = Some(val);
    }

    last
}

Clearly, last is Θ(n = len(list)), but this cost is dependent on the input, so it tells us nothing about the actual cost to run the program for any input. What we’d like is a statement of the form: “This function will never execute more than K instructions.” for some constant K. That is, we want a fixed, upper-bounded, input-independent cost. Unfortunately, for this and many other Rust programs, it is impossible to make such a statement.

Being unable to make such statements is a big problem when the execution time of your program directly impacts your finances. It’s especially a problem when exceeding a certain computational threshold results in unbounded losses of money. This is exactly the case in smart contracts: if your contract executes more than the allowed number of instructions6 it is terminated, and you are responsible for paying for the wasted time. In the worst case, this results in “locking” money in a contract: because the contract cannot execute, it cannot release funds that it holds.

Typed Termination

All Oxide programs are guaranteed to terminate. This guarantee is provided by the type-system as opposed to the language. What we’ll see is that if we can control the type of all iterators, and if we control the implementation of those types, than we can guarantee that programs terminate and statically compute an upper-bound on their computation cost.

One way to rewrite the last function in Oxide is as follows:

static fuel F = 100;

fn last(list: &LinkedList<usize>) -> Option<usize> {
    let mut last = None;
    for &val in list using F {
        last = Some(val);
    } else {
        return None;
    }

    last
}

This introduces a few core Oxide concepts:

  1. fuel: a statically known constant
  2. using: all for loops must use a fuel; one unit of fuel is used per iteration
  3. else: all for loops must have an else branch which is executed when the used fuel is exhausted

The for loop in the last function iterates over at-most F values of list. Should there be more than F values, the else branch is executed. F is mutated internally by the iteration. On a subsequent call to last, the value of F is max(0, F - len(list)).

The program above uses for; using; else syntax sugar. In reality, it is compiled to the following Rust code:

static F: Fuel = Fuel(100);

fn last(list: &LinkedList<usize>) -> Option<usize> {
    let mut last = None;
    for &val in F.iter(list) {
        last = Some(val);
    }

    if F.just_exhausted() {
        return None;
    }

    last
}

Oxide instruments the Rust compiler, asking it to check that all iterators have the type FuelIter, the return type of F.iter(). The (pseudocode) implementation of Iterator for FuelIter is as follows:

impl<I: Iterator> Iterator for FuelIter<I> {
    type Item = I::Item;

    /// Produce a value using the internal iterator _as long as_ there is fuel
    /// remaining. If fuel is exhausted, always return `None`, terminating
    /// the loop.
    fn next(&mut self) -> Option<Self::Item> {
        if self.value > 0 {
            self.value -= 1;
            self.inner.next()
        } else {
            None
        }
    }
}

Putting the pieces together, Oxide ensures that:

  1. All values of type Fuel are static constant integers.
  2. Values of type FuelIter can only be constructed via a Fuel.
  3. Iterators of type FuelIter terminate after .value iterations.
  4. Critically, loop iterators only type-check if they have the type FuelIter.

As a result, all loops are guaranteed to terminate, and they will execute no more times than the initial value of the fuel being used. Thus, a reasonable (and sound) upper-bound on the execution cost of a loop bounded by fuel is simply fuel * {instructions in loop block}.

Takeaway

Oxide’s statically computable computation bounds are made possible by type-checking loop iterators. In particular, Oxide ensures that all iteration is done over values of the type FuelIter. Because its implementation of Iterator is known to terminate, we can thus say that all loops are known to terminate. Said another way, the type FuelIter acts as a witness to the fact that a loop terminates, allowing us to prove that all Oxide programs themselves terminate, and additionally to compute an upper-bound on the cost to execute the program.

You’ll find the idea of a witness pervasive in both Rocket and Osmium.

Rocket

Rocket is a web framework for Rust that prevents correctness and web security bugs at compile-time. Rocket is an open source project, counting tens of thousands of users, among them hundreds of companies.

Rocket leverages almost every property of Rust to achieve its goals: it is impossible to write a Rocket clone with the same correctness and security guarantees in any other practical language. For the purposes of this lecture, we’ll focus on Rocket’s use of traits, ownership, borrowing and lifetimes, and code generation.

Overview

A web framework is a software library for writing web applications. You’d use a web framework to write the back-end of web applications like Facebook, Instagram, Reddit, and so on. Web applications have the singular role of responding to web (HTTP) requests. A simplified7 HTTP request is made up of two components:

  1. A method: one of GET, PUT, POST, DELETE, and so on.
  2. A URI, which looks like /user/10/profile/friends.

When you visit a page like https://yourface.com/user/10/profile/friends, your browser makes a GET request to the server at https://yourface.com with a URI of /user/10/profile/friends. The server (the web application) is responsible for generating a suitable HTTP response and returning it to the client, in this case, the browser. Similarly, when you submit a form, such as when posting a status update, the browser sends a POST request to the server.

Routes

In Rocket, functions are annotated with route attributes to declare which requests the function should handle. The attribute and the annotated function are collectively known a route. As an example, to write a function that handles GET requests to /user/10/profile/friends, you’d write:

#[get("/user/<id>/profile/friends")]
fn friends(id: usize) -> Friends { /* .. */ }

Rocket handles the rest for you. In particular, Rocket automatically coverts dynamic URI components, those in <brackets>, into native values (here, id: usize), and it also converts arbitrary types into HTTP responses (here, Friends).

Guards

Whenever Rocket converts any input in the request into route input, it must first validate the input. For instance, in the friends route above, Rocket must ensure that the <id> in the request’s URI is a valid usize. Any type that can be used to validate and convert input is known as a guard.

In fact, Rocket requires, through compile-time checking, that all input passed to a handler is validated. That is, the only way to retrieve request input in a route is through guards. This is a powerful property as it ensures that all input undergoes validation. Historically, input validation has been the cause of expensive vulnerabilities.

To accomplish this, Rocket leverages Rust’s traits and code generation. Every attribute of the form #[get(..)] invokes a Rocket procedural macro, a piece of code that runs at compile-time, inspects the text of the annotated item, and generates new text to replace the annotated item.

Among other things, Rocket’s route attribute generates a validation routine for every guard. The validation routine invokes a trait method, requiring that each guard type implement a trait. Which trait a guard type must implement depends on the position of the guard in the route. Thus far, we’ve only seen one kind of guard, a paramater guard, which appears in the URI and has the syntax <param>. A second type of guard is known as a segments guard. These also appears in the URI but have the syntax <param..>. Unlike a parameter guard, segments guards match on more than one component in a URI.

As an example, consider the following route, which should return an arbitrary wiki page for a given user, assuming it exists:

#[get("/user/<id>/pages/<path..>")]
fn page(id: usize, path: PathBuf) -> Option<File> { /* .. */ }

Here, id: usize is a parameter guard and matches exactly one component of the URI, while path: PathBuf is a segments guard and matches all components after the 3rd. As such, a GET request to /user/20/pages/this/page would match with id as 20 and path as this/page.

For this route, Rocket generates code similar to the following:

fn generated_friends(request: &Request) -> Result<Response> {
    let id = usize::from_param(request.get_param(1)?)?;
    let path = PathBuf::from_segments(request.get_segments(3)?)?;
    Response::from(request, friends(id, path))
}

The get_param and get_segments methods on Request return the raw, unvalidated input from the incoming request. These inputs are validated using the trait methods FromParam::from_param() (for parameter guards) and FromSegments::from_segments() (for segments guards). This means that parameter guard types must implement the FromParam trait, while segments guards types must implement the FromSegments trait. If validation succeeds, and only if validation succeeds, the handler is called with the validated inputs.

Categories

Why does Rocket use different traits for different guards?

To illustrate why, assume that there is a single trait for guards, and consider the following route:

#[get("/user/<id>/pages/<path..>")]
fn page(id: String, path: String) -> Option<File> {
    File::open(format!("/var/data/user/{}/{}", id, path)).ok()
}

Ideally, this page route opens a file inside of a user’s /var/data directory. For security and privacy reasons, the route should not allow any other user’s files to be read. In other words, no files outside of /var/data/user/{id} should be readable by this route. Unfortunately, this route allows all files on the system to be read. Fortunately, Rocket will refuse to compile this route.

What goes wrong? The issue is that we’ve conflated two categories of inputs: the first, parameters, and the second, segments. A String parameter guard decodes8 the input as UTF-8 text. This is necessary and sufficient validation for single URI segments, but it turns out to be insufficient for validating multiple URI segments. In particular, without additional validation, the page route above is vulnerable to directory traversal attacks, which allow an attacker to access any file on the machine. An attacker can simply make requests of the form9 /user/0/pages/../../../etc/shadow, and the route will happily return the hashed passwords of all users on the system.

In turn, any validation of String suitable for segments is unlikely to be suitable for parameters. By requiring implementations of different traits for each unique context, we are able to create categories of types in the type system, leveraging the distinct trait implementations to differentiate between validation contexts.

In practice, Rust, given Rocket’s code generation, rejects the route because String does not implement FromSegments, making directory traversal attacks extremely unlikely10. The semantics of the String type don’t inform a particular implementation for FromSegments, so none exists, and Rust’s coherence rules make it impossible for a user to implement it themselves. What’s more, the implementation of FromSegments for PathBuf automatically sanitizes paths, making the PathBuf version of the route safe and correct.

Guard Transparency

Before we move on from Rocket, there is one more type of guard to discuss: the request guard. Unlike other guards we’ve seen, request guards don’t validate any direct input. Instead, they validate the entire request against some arbitrary policy.

To instruct Rocket that a route requires a request guard to validate, the request guard type need simply be included in the route’s parameters. For example, if a web application allows only administrators to delete user accounts, we can implement an Admin request guard that authorizes an administrative user and use it in a delete_user_account route. To implement the route itself, an underlying data access to actually delete the user will occur. Typically, the function implementing that data access will look something like:

fn delete_user(id: usize) -> Result {
    sql!("DELETE FROM users WHERE id={id}")
}

In other words, it will simply issue a query to a database that delete the user. The route then simply calls this function, ensuring that the Admin request guard appears in the handler:

#[delete("/user/<id>")]
fn delete_user_account(admin: Admin, id: usize) -> Result {
    delete_user(id)
}

Clearly, the delete_user function is extremely sensitive and should only be called after proper validation has happened. By using the Admin request guard in delete_user_account, we’ve ensured that this is the case.

But we can go a step further. If this was the only call to delete_user, our application would be correct. But what if we wanted to add a second route for deleting users?

#[delete("/users/<id>")]
fn delete_user_account_2(id: usize) -> Result { delete_user(id) }

If we forget to add the Admin guard to the route, we’ve now made it possible to delete a user without having administrative credentials! Our request guard isn’t very useful unless it’s being used. To resolve this issue, we simply need to guard the data access itself:

fn delete_user(admin: &Admin, id: usize) -> Result {
    sql!("DELETE FROM users WHERE id={id}")
}

As a result of this simple change, access control violations for user deletion are now guaranteed to be prevented at compile-time. This is because the &Admin parameter of delete_user is acting as a witness to a proof that an administrative user is currently authorized. This statement is true as long as:

  1. The delete_user function requires an &Admin type.
  2. The only constructor for an Admin type is FromRequest, the request guard trait.
  3. Only Rocket can construct via FromRequest.

Rocket guarantees the third point by leveraging abstraction (visibility rules). In particular, by making constructors to Request private, a value of which is required to call FromRequest, only Rocket can construct via FromRequest. The implementation must guarantee the first two points.

If all of these hold, then we know that if delete_user is called, there must be an active request authorizing an Admin. Note that this check is done at compile-time by the type-system. In other words, we’ve leveraged the type system (traits, abstraction) and code generation (request guards) to get a compile-time proof that our application is free from access control violations.

Osmium

Osmium is an in-development operating system that can automatically verify the functional correctness of its device drivers. Device drivers in Osmium are written and automatically verified following the formula below:

  1. Encode the device’s state machine using annotated types and traits.
  2. Provide introspection methods.
  3. Connect the target machine to a monitor.
  4. Press play.

Background: GPIO

We’ll use GPIO as a running example. GPIO stands for General Purpose Input/Output. As the name implies, GPIO is a general mechanism for transmitting data/signals into and out of some device through electrical pins, known as GPIO pins.

A GPIO pin can act as either an output or input. When a GPIO pin is acting as an output, it can either be set on or off. When on, the pin is driven at 3.3v. When the GPIO pin is off, no current flows through the pin. When a GPIO pin is acting as an input, it can be read to determine whether the pin is being externally driven at 3.3v or not.

State Machine Encoding

All hardware devices are state machines: they begin at a fixed, known state and transition to different states based on explicit or implicit inputs. The device exposes different functionality depending on which state it is in. In other words, only some transitions are valid in some states. Importantly, this implies that some transitions are invalid when the device is in a given state.

Most programming languages make it impossible to faithfully encode the semantics of a state machine in hardware, but not Rust! Rust lets us perfectly encode state machine semantics.

Below is the state diagram for a subset of a GPIO pin’s state machine:

Our goal is to encode this state machine in Rust. Let’s start by interpreting the diagram:

  • The GPIO starts in the START state.
  • From the START state it can transition to one of three states:
    1. ALT - no transitions are possible from this state
    2. OUTPUT - two “self” transitions are possible: SET and CLEAR
    3. INPUT - one “self” transition is possible: LEVEL

We’ll use Rust’s type system to ensure that a pin can only be SET and CLEARed if it has been transitioned to the OUTPUT state and the LEVEL read if it is in the INPUT state.

The Gpio structure below represents a GPIO pin in any given State.

pub struct Gpio<State> {
    pin: u8,
    registers: &'static mut Registers,
    _state: PhantomData<State>
}

The structure has one generic argument, State. Except for PhantomData, nothing actually uses this argument. This is what PhantomData is there for: to convince Rust that the structure somehow uses the generic even though it otherwise wouldn’t. We’re going to use the State generic to encode which state the Gpio device is in. Unlike other generics, we must control this parameter and ensure that a user can never fabricate it. This is identical in nature to how Rocket prevented construction of a Request type to ensure the integrity of request guards.

The pin and registers fields are internal implementation details used to implement the low-level components of the driver. They’re here for posterity, but otherwise will be largely ignored by our discussion.

Osmium exposes a state! macro that generates types (enums without variants) for representing device states. For Gpio, we use:

states! {
    Uninitialized, Input, Output, Alt
}

// Each parameter expands to an `enum` that looks like:
#[state] enum Input { }

This is also weird; why would we create an enum with no variants? Because we want to guarantee the integrity of our state machine encoding, it is imperative that any consumer of our encoding is unable to fabricate states. To this end, enums with no variants have a nice property: they can never be instantiated; they are type-level markers. A user can never pass a value of type Input (or Output, or …) because such a value can never be constructed. They exist purely at the type-level.

We can then implement methods corresponding to valid transitions given that a Gpio is in a certain state:

impl<S> Gpio<S> {
    /// Create a new `Gpio` pin in the initial `START` state.
    pub fn new(pin: u8) -> Gpio<Start> { .. }
}

impl Gpio<Start> {
    /// Transitions the pin from `START` to `OUTPUT`.
    pub fn into_output(self) -> Gpio<Output> { ... }

    /// Transitions the pin from `START` to `INPUT`.
    pub fn into_input(self) -> Gpio<Input> { ... }
}

impl Gpio<Output> {
    /// Sets (turns on) the pin.
    pub fn set(&mut self) { ... }

    /// Clears (turns off) the pin.
    pub fn clear(&mut self) { ... }
}

impl Gpio<Input> {
    /// Reads the pin's value.
    pub fn level(&mut self) -> bool { ... }
}

A Gpio is always constructed in the Start state; we again leverage abstraction to ensure that this is the case: there are no other public constructors. Furthermore, to obtain a Gpio in a different state, a user must call one of the transition methods. Finally, a Gpio can only be set and cleared when it is a Gpio<Output> and its level read when it is a Gpio<Input>. Perfect! But how do we actually transition between states, changing the type-level state generic, in these methods?

impl<T> Gpio<T> {
    fn transition<S>(self) -> Gpio<S> {
        Gpio {
            pin: self.pin,
            registers: self.registers,
            _state: PhantomData
        }
    }
}

This method lets us transition a Gpio from any state T to any other state S. Given a Gpio in state T (Gpio<T>), this method returns a Gpio in state S (Gpio<s>). Note that it works for all S and T. We must be very careful when calling this method, and we must again leverage abstraction to ensure this method cannot be called by a user. When called, we encode the specification of a transition in the state diagram. If we get the specification or encoding wrong, our driver is wrong. Soon, however, we’ll see that we can automatically check whether our implementation is correct or not.

To use the transition() method, we need to tell Rust which type we want as an output S in Gpio<S>. We do this by giving Rust enough information so that it can infer the S type. For instance, consider the implementation of the into_output method:

impl Gpio<Start> {
    pub fn into_output(self) -> Gpio<Output> {
        self.make_output(); // implementation detail
        self.transition()
    }
}

This method requires its return type to be Gpio<Output>. When the Rust type system inspects the call to transition(), it will search for a Gpio::transition() method that returns a Gpio<Output> to satisfy the requirement. Since our transition method returns Gpio<S> for any S, Rust will replace S with Output and use that method. The result is that we’ve transformed our Gpio<Start> into a Gpio<Output>.

Notice that the into_ transition methods take a Gpio by move. This means that once a Gpio is transitioned into a another state, it can never be accessed in the previous state unless it undergoes another valid transition. Without this property, we would not be encoding a state machine. Rust’s move semantics make this possible. As long as a type doesn’t implement Clone, Copy, or some other means of duplication, there is no coming back from a transition. No other language, not even C++, affords us this guarantee at compile-time.

Verification

For Osmium to automatically verify11 the driver’s correctness, we must inform it of the states and transitions for the given driver. We do this by annotating states with the #[state] attribute and transitions with the #[transitions] attribute. The state! macro already does this for states. We manually add the attribute to impls for transition methods:

#[transitions(start)]
impl<S> Gpio<S> { .. }

#[transitions]
impl Gpio<Start> { .. }

#[transitions]
impl Gpio<Output> { .. }

#[transitions]
impl Gpio<Input> { .. }

The special #[transitions(start)] attribute informs Osmium about the initial starting point of the state machine. From there, Osmium collects all of the states and transition methods, internally constructs a state machine diagram, and generates a verifier for the state machine. The generated verification code requires that the device’s type implement a special Introspection trait with the following signature:

trait Introspective {
    fn introspect(&self) -> impl State;
}

To verify the driver’s correctness, the verification code traverses the state machine. Starting with the initial state, after asserting that the device is properly in the initial state, the verifier:

  1. For every transition, forks an isolated process and runs the transition method.
  2. Ensures that introspect() asserts the device is in the transitioned-to state.

This process continues until there are no further unique transitions to make. Upon terminating, all reachable states and transitions will have been checked. As such, the software is incapable of driving the hardware incorrectly12, and the device driver is verified correct.

Conclusion

Building correct software is hard. Really hard. By leveraging type systems like Rust’s, we can pass the buck to the type system by encoding correctness properties with types. Projects like Oxide, Rocket, and Osmium leverage Rust’s properties - ownership, borrowing, code generation, linters, and abstraction, among others - to encode correctness properties in types in such a way that programs fail to compile if they violate these properties. While not perfect, these projects demonstrate the ability to leverage a type system for more than just memory safety and race safety. What other opportunities for software correctness are latent in strong type systems?

  1. Namely: abstraction, ownership, borrowing, lifetimes, traits, procedural macros, linters, in addition to memory safety and data race freedom. 

  2. That is, you can mathematically prove its correctness, and the proof itselfcan be mechanically checked for soundness. 

  3. See Coq, Idris, Agda, dafny, and F*, among others. 

  4. See Friends of Rust for a subset of companies using Rust in production. 

  5. We say reasonable because ∞ is always a correct though useless upper-bound. 

  6. In reality, blockchains like Ethereum measure and bound “gas” usage, which is a function of the instructions executed. 

  7. In reality, an HTTP request also consists of headers and a body. 

  8. In reality, it percent-decodes the input. 

  9. You’d need to use %2f instead of / in the attacking path. 

  10. Unfortunately, a user can always use the raw parameter interface, or worse, implement a custom and incorrect segments guard. 

  11. To verify software means to prove its correctness. As such, verified software is the subset of software that has been mathemetically proven to be correct. A device driver is correct when it correctly implements a subset of a device’s state machine: it drives the device from any valid state to any other valid state in the subset. 

  12. We’ve omitted an important discussion about verifying externally observable actions that are not reflected by state transitions. For instance, set() does not transition the device to a different state. We would like to verify, however, that the GPIO is being driven at 3.3v after set() is called. If it is not, the driver is buggy and incorrect. As presented, Osmium does not verify this. In reality, Osimum does (and must) verify this through a process known as full-loopback verification.