CS 242: Final: NFSTAR

Overview

In this project you are tasked with creating a secure, verified Network File System (NFS) implementation in the F* functional programming language. F* is a new language from Microsoft Research aimed towards program verification, or formally proving that a program adheres to a high-level specification. For example, Project Everest is a full HTTPS networking stack implemented in F*.

You can think about program verification as kind of like abstract testing. On Assignment 8, you probably found the compile-fail testing fairly tedious—to rigorously test a state machine, there’s a huge combinatorial explosion of ways one can get it wrong. Ideally, we could write a “model checker” that would take a description of the shopping cart state machine, and the Rust code implementing it, and provide a formal proof as to whether the Rust code correctly implements the state machine. Rust does not have a sufficiently expressive type system to easily do this, but languages like F* and Lean do! Hence, the main learning goal of this project is to learn a verification-oriented language to implement an interesting piece of systems software, a networked file system.

There are three primary components to the project:

Identify the security policies that should be enforced for each piece of the application (given our high-level policies).
Write the security policies in F*.
Build an implementation (in F*) that adheres to the policies.

Note that an implementation that compiles is only guaranteed to adhere to the security policies you’ve written, it does not guarantee it implements the functionality of the NFS client/server. Therefore, in addition to the above, you will need to write a test suite to verify the functionality of the NFS client/server. The test suite should include unit tests for each function exposed through the public interface, in addition to more complex multi-user scenariors (more on that below).

Getting started

This guide has been written for Linux users, however it may be possible that some of the steps apply to other OSes as well. F* is a Microsoft project, so you should not have any issues getting up and running on Windows (although some steps may differ).

In particular, I used a Stanford Rice machine (Ubuntu 16.04) when installing this software, so if you cannot get it working on your own system you are always welcome to use the Rice machines where it will be guaranteed to work.

Installing OCaml

If you do not already have OCaml installed on your machine from the earlier parts of this course, you can follow the lab to get setup. This is a required step to getting F* running.

Install the dependencies required by F*:

opam install ocamlbuild ocamlfind batteries stdint zarith yojson fileutils pprint menhir ulex ppx_deriving ppx_deriving_yojson process

Installing F*

Go to the F* site and download the appropriate tar.gz. For Linux users you can use the following:

wget https://github.com/FStarLang/FStar/releases/download/V0.9.7.0-alpha1/fstar_0.9.7.0-alpha1_Linux_x86_64.tar.gz

Unpack the archive:

tar -xvzf fstar_0.9.7.0-alpha1_Linux_x86_64.tar.gz

Set FSTAR_HOME and add the fstar bin directory to the path in ~/.bashrc:

export FSTAR_HOME="<PATH_TO_FSTAR_ROOT>"
export PATH="$PATH:$FSTAR_HOME/bin"

# EXAMPLE:
# export FSTAR_HOME="$HOME/CS242_FINAL/fstar"
# export PATH="$PATH:$FSTAR_HOME/bin"

# Source ~/.bashrc to apply changes to the current shell:
source ~/.bashrc

cd into the fstar directory, and build ulib (this will take a while!):
```
make -C ulib install-fstarlib
```
At this point you should be good to go! Try building/running hello world:
```
cd examples/hello
make hello
./hello.exe
```

Editor

We highly encourage the use of an F* supported text editor for this project. Currently only Emacs and VS Code are actively supported. You can find setup information here.

F* tutorial

Before continuing further, I highly encourage you to walk through the F* tutorial. The first section describes how to implement a simple access control policy, and section 10.1 describes how to implement a more complex stateful access control system.

Documentation

F* is a minimally documented language, so most of the information about the interfaces for basic data structures will come from reading the source code. Googling fstarlang <FEATURE> should yield a link to the implementation file for the feature in the GitHub repo (assuming it exists). A big part of this project is learning to read the F* source files (it’s very similar to OCaml).

Here are links to the implementation files for List, OrdSet (total order on elements), and OrdMap (total order on keys):

https://github.com/FStarLang/FStar/blob/master/ulib/FStar.List.fst
https://github.com/FStarLang/FStar/blob/master/ulib/FStar.OrdSet.fst
https://github.com/FStarLang/FStar/blob/master/ulib/FStar.OrdMap.fst

Functionality

NFSv4 defined in RFC 7530 is the reference protocol for this project. The RFC defines the interface for the NFS client and the NFS server. You will be implementing a subset of this protocol, focusing primarily on basic file operations and access control. We have defined the minimal set of operations you should support below, organized by operation type. Note that for each of these operations the RFC covers many more cases than you need to handle, it is up to you to determine which parts are relevant for a viable minimal implementation.

In order to support access control, every operation should be run under a user context. The easiest way to accomplish this is to explicitly pass in a user identity to every function. You can assume a fixed set of users in the system, and that both the clients and servers know about all users and user groups (no need to handle user registration/authentication). At the start of a server session, all users and user groups will be provided to the server. When a client (i.e. user) connects to the server a unique (per-user) root file handle should be returned, which the client will use for subsequent calls to the server.

File operations

OPEN: Create/open a file (client specifies whether they are creating a new file). The file handle passed refers to the parent directory and a fresh file handle will be returned. The server should maintain state on the opened file, requiring an explicit CLOSE to cleanup this state. OPEN could either be given a user ID or group ID.
CREATE: Given a directory file handle, create a new directory under it and return the new file handle to the directory. CREATE could either be given a user ID or group ID.
CLOSE: Close the file (or directory) associated with the file handle.
READ: Takes in a file handle, offset into the file, and number of bytes to read and returns the corresponding content to the client.
WRITE: Given a file handle, an offset, and data, server should write the data to the file.
REMOVE: Remove the given object (e.g. file/dir) from the system. You do not need to worry about reference counting and symlinks/hardlinks.
LOOKUP: Given a directory file handle and a name of an object within that directory, update the file handle to point to the contained object (if it exists). This can be used for traversals.
LOOKUPP: Given a file handle, set the file handle to be the parent of the current object.
READDIR: Given a file handle to a directory, return a list of the names of all objects inside the directory.

Access control operations

ACCESS: Given an object, return the permissions the user has on the object.
SETATTR: Given an object (i.e. handle), update the object attributes according to the request (e.g. give another user/group read and/or write access to the file/directory, or revoke access). This deviates slightly from the RFC, as we want to define a rich multi-user access control system. Server should verify that the user/group has permission to modify the attributes of this object. At a minimum you should be able to grant users/groups read access, write access, and attribute modify access. By default any file/directory created by a user/group is owned by them (they have all permissions: “open”/”read”/”write”/”attribute modify”). SETATTR could either be given a user ID or group ID.

F* tips

Here are a few tips when working with F*:

Type annotate as many expressions/functions as possible. If you hit a compiler error, the first thing you should check is that the expression/function is type annoated.
Use private liberally on internal functions and types that should not be exposed to the client/tests.
Use OrdMap and OrdSet NOT Map and Set (the latter are not concrete data structures). Note that OrdMap and OrdSet can only operate on totally ordered keys/elements so it’s best to use integer identifiers instead of strings or other types. The starter code provides examples of their usage.

High-level security policies

The main security model we will be exploring in this project is Access Control. We will only consider adversaries that are users in the system attempting to violate the secrecy of data they do not have permission to view, or the integrity of data they do not have permission to modify. You will not need to protect against external threats to the system.

At a high-level your system should ensure the following:

Server should verify that all received operations for an active file handle are from the user that the file handle belongs to.
Server should confirm that any operation on an object adheres to the permissions for the given user.
- Each object should have a set of user IDs and group IDs which have permissions to perform the given operation.
- It is up to you how you design your particular implementation of group IDs as long as they fall within the specification.
- For the sake of simplicity, you can assume that groups are flat, i.e. a group cannot belong to another group. Only users can be members of groups, and a user can belong to multiple groups at once.

F* specifies effects and verifies operations and results through pre-conditions and post-conditions. These are crucial to F’s type inference system. From the F tutorial:

Type inference in F* is based on a weakest pre-condition calculus for higher order programs. For every term e, F* computes WP e, which captures the semantics of e as a predicate transformer. If you want to prove some property P about the result of e (i.e., a post-condition), it suffices to prove WP e P of the initial configuration in which e is executed (i.e., a pre-condition).

Take a look at Section 9 of the F* tutorial for more details.

You will use pre-conditions to statically verify function parameters adhere to policies (e.g. the user has permission to read the file). See the section above that covers a guided walkthrough on how to use pre-conditions to enforce a policy.

Defining and using a security policy

The starter code provided defines two policies: one is for a user having a permission on a file, and the other is for a file existing. In the starter code the former policy is used on OPEN to ensure the user has open permission on the file, and the latter is used when creating a file to ensure it does not already exist. The policies are statically enforced using pre-conditions on the openInternal and createFile' functions, requiring that the given inputs adhere to the policy (more on that below). As these are statically verified, if the code compiles it means that it is impossible for openInternal to be given a user without open permission on the file, and for createFile' to be called on a file path that already exists.

Note that what will be discussed below closely follows the example given in section 10.1 of the F* tutorial, so refer there for more details.

Policy on file creation

Let’s look at the file creation function and policy from the starter code in more depth to get a feel for how this all works (full code can be found in NFS.Server.fst).

First, the file permission policy itself:

(* Policy that file exists. *)
private let fileExists (s : system_state) (f : file_path) =
  match getFileId' s f with
  | Some file_id -> (
    OrdMap.contains file_id s.file_paths &&
    OrdMap.contains file_id s.file_contents &&
    OrdMap.contains file_id s.file_permissions)
  | None -> false
private type fileExists_t f h =
  fileExists (Ref.sel h system) f == true

This code (in the let) takes in a system state object and a file path, and simply returns true if the file exists in the system and false if it does not. Simple enough! As you can see, defining the policy is quite intuitive.

The fileExists_t type takes in a file path and the program heap, and then calls fileExists with the system state selected from the heap using Ref.sel h system along with the file path. system is a reference (like a pointer) to a system_state object on the heap, so normally to dereference it you can use !system, however in this particular case we will be proving conditions abstractly on a generic heap (not the current program heap) so we have to explicitly select it from the heap that was passed in. The == true (different than = true!) is used to convert the fileExists boolean return value (true or false) to the type True or False (see section 3.5 of the tutorial for more details).

Now let’s look at the file creation function:

(* Create file.
 *
 * Requires that file does not already exist.
 *)
private val createFile' : (f : file_path) -> is_dir -> ST unit
  (requires (fun h -> ~(fileExists_t f h)))
  (ensures (fun h x h' -> True))
private let createFile' file_path is_dir =
  let file_id = getFreshFileId () in
  let s = !system in
  system := { s with
    file_paths = OrdMap.update file_id (file_path, is_dir) s.file_paths;
    file_contents = OrdMap.update file_id "" s.file_contents;
    file_permissions = OrdMap.update file_id OrdMap.empty s.file_permissions;
  }

Ignoring the type annotation (private val ...) and focusing on the definition, we see that it gets a new file identifier, gets the system state (by dereferencing the system ref), and then sets the system ref to a new system state with updated file paths, file contents, and file permissions for the new file.

Now for the interesting part, let’s break the type annotation down line by line:

private val createFile' : (f : file_path) -> is_dir -> ST unit

Here we define a private function called createFile' that takes in an input file_path called f, an is_dir (just a bool that’s true if it’s a directory), and returns a unit. All functions have an associated effect that indicates what side effects can be caused by the function, in this case it’s the ST effect which means it can manipulate the heap (in this case we are reading/writing the system ref whose backing object lives on the heap). See section 2.2 in the tutorial for more details on effects.

  (requires (fun h -> ~(fileExists_t f h)))

This is what actually enforces the policy. requires is the pre-condition for the function, meaning that the condition must be true when the function is called. It takes a function which takes a heap as input, and then returns a True or False type based on whether the condition is true given this heap. In this case we are ensuring the file f does NOT exist under the heap h (hence the negation of the condition with ~). ~ is a type negation on True/False, whereas not is the regular boolean negation.

  (ensures (fun h x h' -> True))

This is the post-condition, which takes in the old heap, an x value, and the new heap, and allows enforcement that some modification occurred on the heap from executing the function. Here we just set it to always be True.

Due to the precondition, it is impossible for createFile' to be called with a file path that already exists! However, this means that if there are any calls to createFile' where the compiler cannot statically determine that the file path is not already created in the system, it will cause a compile time error.

To ensure the compiler knows that createFile' will only be called when this condition is met, we write a wrapper createFile that ensures this condition, throwing an exception if it is not met.

(* Create file if it does not exist. *)
private val createFile : file_path -> user_id -> is_dir -> All unit
  (requires (fun h -> True)) (ensures (fun h x h' -> True))
private let createFile file_path user_id is_dir =
  (* Create the file if it does not already exist. *)
  if fileExists !system file_path
  then failwith "file already exists"
  else createFile' file_path is_dir;
  ...

Here we explicitly call and check fileExists with the dereferenced system state, which ensures we only call createFile' if the condition will be satisfied, thus the compiler can statically verify the pre-condition is always met (as we never call createFile' elsewhere).

Note the function effect, All, which is used when the function can both cause an exception (failwith) or change the heap (createFile'). For both ST and All you are always required to include a pre-condition and post-condition, but you can just set them to True in cases like this.

Startup

We have provided you with starter code in nfstar/program/. NFS.Server.fst contains the code for the NFS implementation. NFS.Server.Test.fst contains code for testing your implementation. We have implemented MOUNT and OPEN for you already. However, the current functionality does not include permission checking for user groups. Your job is to implement the rest of the functions as well as design a specification for enforcing user group permissions.

To run the initial starter code, compile your program by running make in the nfstar/program directory. Then run the compiled executable nfs_tests.exe to run the tests. All given tests should pass with the initial starter code.

Test suite

Each function should have a corresponding set of unit tests that exercise the basic functionality and test edge case inputs. Additionally, you should add tests to verify the following scenarios execute properly:

Alice creates file and writes to it, Bob attempts to read from it (fails), Alice grants Bob read permission, Bob attempts to read (succeeds). (also write equivalent tests for write and attribute modify as well).

Alice creates the following filesystem directory structure, and subsequently traverses it using LOOKUP/LOOKUPP/READDIR.

ROOT (virtual)
|-> Foo
    |-> Foo_File_1
    |-> Foo_File_2
    |-> Foo_Dir
        |-> (empty)
|-> Bar
    |-> Bar_Dir_Level_1
        |-> Bar_Dir_Level_2
            |-> Bar_Dir_Level_3
                |-> Bar_Level_3_File_1
            |-> Bar_Level_2_File_1

The starter code test file, NFS.Server.Test.fst, provides examples of using runtime assertions on boolean checks (rassert), as well as asserting that exceptions are raised when functions are called with bad inputs (rassert_exn).

Going above and beyond

There are four main logical extensions to the above baseline requirements that we encourage you to explore (listed by difficulty, easiest first). You may receive up to 15% extra credit for completing an extension.

Expand the functional domain of your implementation to include more features from the NFS v4 protocol spec.
Use the cryptographic primitives provided by Project Everest to encrypt the data on the server at rest (even better if each user’s data is encrypted with their own key).
Implement a FUSE interface for your NFS client.
Build a true multi-node NFS implementation that uses the Project Everest reference TLS implementation for encrypted communication between the client and server.

Grading

Write-up (15%): We will be looking for detailed responses to the questions below.
Test suite (30%): We will evaluate the test suite based on the breadth of features and uses cases covered, as well as the depth for each of the functions tested (make sure to cover the edge cases!).
NFS Server implementation (55%): We will be looking for complete implementations of each of the above functions, including fully fleshed out security policies applied to functions such that the functions are statically verified to comply with the policies.

Write-up and submission

Please submit a project write-up which includes the following:

Instructions on how to run your code and test suites. Instructions shouldn’t be more than one or two commands to run, so please use test harnesses and wrapper scripts as needed.
Outline your design and the policies that you enforced for each of the functions you implemented. Particularly, how did you enforce user permissions for each function? How did you design your user group functionality?
Describe the unit tests which you have written and what aspects of your functionality they test.
What challenges did you face writing code in a verification oriented language? At a high-level, did you learn anything new that will influence the way you write code in a more standard language? Specify when you fell back on imperative paradigms to fill in any knowledge gaps you might have had.
What are the positives of implementing NFS in a language such as F* which verifies functional correctness and security? What additional precautions must you take when your language cannot verify correctness and security?
(optional) If you chose any of the components from “Going above and beyond”, please describe what you implemented.

Submit your writeup PDF and source code on Gradescope.

Sections