Due: Wednesday, October 24 at 4:00pm
Submission cutoff: Saturday, October 27 at 4:00pm

Having absorbed critical mass of blockchain hype, you’ve finally decided it’s time to pad your resume (and your bank account) by implementing a cryptocurrency. Knowing that it will only work at web scale, you’ve decided to “Monetize Your Business With Your Users’ CPU Power” (not kidding) and implement the performance-critical portions in WebAssembly. Through this assignment, you will become primarily familiar with the essentials of implementing small programs by hand in WebAssembly.

Setup

Since WebAssembly is a web language, and because the best debug tools for it are in the browser, all code will execute in your browser. We strongly recommend Chrome for it’s debugging tools, but the assignment should run in any modern browser. To run this assignment, you first need to install Node.js, which can be downloaded from https://nodejs.org/en/.

In the f18-assignments repository, run git pull to get the assignment materials, located in the assign4 directory.

From the assign4/program directory, run npm install. Once that’s complete (it might take a while), run npm start. This will open up a local web server that automatically updates any time you edit your WebAssembly files. For this assignment, you only need to edit js/hash.wat.

Editor setup

You can find packages for WebAssembly syntax highlighting below:

Debugging WebAssembly

  1. Open Developer Tools (View → Developer → Developer Tools).
  2. Navigate to the Sources tab.
  3. Refresh.
  4. Find the code under wasm in the side bar. Each function is separated into its own source unit with a name like wasm-42020a46-0. The numbers at the end are the indices of the functions in hash.wat, e.g. index 0 is $adler32.
  5. Like in GDB, you can set breakpoints that flag the debugger to suspend execution when hitting a certain instruction. To set a breakpoint, click the line number of the instruction.
  6. Refresh to run the code and stop at that breakpoint.

    You should see a traditional debugging view with source on the left. The source view shows the stack based syntax with variable names stripped. On the right, you can see the current value of the global memory, the local variables, and the stack.

  7. Using the buttons in the top right, you can continue execution, step into the next function call, and so on.

Firefox

Firefox doesn’t seem to show either the stack values or memory, but you can still use the debugger to step through and watch local variables change.

  1. Open Developer Tools.
  2. Navigate to the Debugger tab.
  3. Refresh.
  4. Find the WASM code under wasm:// in the side bar.
  5. Set a breakpoint.
  6. Refresh to run the code and stop at that breakpoint.

1. Adler-32 hash (20%)

The core of any cryptocurrency is a good hash function. In classic Silicon Valley fashion, you’ve decided not only to hand-roll it yourself, but to pick something horrendously insecure: the Adler-32 hash. You should read through the linked page for a brief description of the hash, and more importantly for a reference C implementation.

In js/hash.wat, your task is to implement the $adler32 function:

(func $adler32 (param $address i32) (param $len i32) (result i32))

Note: in all functions on this assignment, you are free to add any number of local variables to the function definition, but you cannot add/remove parameters or results.

You are going to implement the function twice. First, using the linear/sequential stack-oriented syntax (in $adler32), and second using the nested tree-like syntax (in $adler32v2). Recall the linear syntax looks like:

(get_local $i)
(get_local $address)
(i32.add)
(i32.load8_u)

And the equivalent tree syntax looks like:

(i32.load8_u
  (i32.add (get_local $i) (get_local $address)))

See the Setup and Debugging sections for more information on executing the code, running tests, and analyzing the output. A few other notes:

  • To load a single byte from memory into a 32-bit integer, you can use i32.load8_u.
  • You will find the i32.shl (bit shift left), i32.rem_u (remainder unsigned), and i32.or (bitwise or) operations useful.

2. Memory allocator (60%)

Now that your hash function is rock-solid, you turn next to memory allocation. Billions of strings per second won’t allocate themselves, so you need a blazing fast memory allocator to keep up. For this task, you will implement a dynamic memory allocator with an implicit free list strategy.

If you aren’t already getting flashbacks to your computer systems intro course, here’s a brief reminder of how implicit free lists work.

A chunk of memory is allocated with a header, containing a length (how long is the pursuant block?) and a flag (is it free or used?). These blocks implicitly form a linked list, as the lengths allow you to skip from one block to the next. In the beginning, memory is a single, unallocated, contiguous block.

In js/hash.wat, we have provided you with two functions:

(func $alloc_init)
(func $free (param $address i32))

$alloc_init is guaranteed to be called once before usage of the memory allocator. It turns the initial memory page into a single free data block of size 65528 (65536 byte page minus 4 bytes for the block size and 4 bytes for the free flag). $free takes a user’s pointer to memory and frees it by setting the allocated flag to 1. Your task is to implement the alloc function:

(func $alloc (param $len i32) (result i32))

$alloc is called with a length to allocate memory and return a “pointer” to the user (note that a pointer is just a number, an index in memory), and $free is called on a “pointer” to return the memory block to the allocator.

A couple notes on WebAssembly semantics and expected invariants.

  1. Memory in WebAssembly is allocated in chunks of the page size, or 64 KiB (or 65536 bytes). You can assume you will always have exactly one page of memory allocated initially. You don’t need to handle memory growth.

  2. For the header format, you should use an i32 for the length and an i32 for the allocated flag (in that order), for a total of 8 bytes in the header. See $alloc_init for reference.

  3. $alloc should walk the free list until it finds a large enough block that is free, split it in two, and return the first part. Make sure both the returned block and the newly created block have their headers updated accordingly. If there isn’t enough space to split the block, then the block size should be preserved. See tests segmentation edge case and segmentation edge case 2.

  4. If the current memory has no room for a requested allocation, $alloc should raise a critical failure (i.e. unreachable in WebAssembly).

  5. You do not need to implement block coalescing, or any advanced free list strategies to reduce fragmentation (but you can’t just implement a bump allocator!).

If you’re not sure how to get started, here’s an (almost complete) reference implementation in C that you can use as a baseline. Try filling out the block splitting code, and then translating the C into WebAssembly.

#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <assert.h>

#define u8 uint8_t
#define i32 int32_t
#define PAGE_SIZE (16384 * sizeof(i32))

u8 MEM[PAGE_SIZE];
i32 load(i32 address) { return *(i32*)(MEM + address); }
void store(i32 address, i32 value) { *(i32*)(MEM + address) = value; }

void alloc2_init() {
  store(0, PAGE_SIZE - 8);
  store(4, 1);
}

void free2(int addr) {
  store(addr - 4, 1);
}

int alloc2(int len) {
  int addr = 0;
  int cur_len, nextblock;

  // Check each possible block
  while (1) {
    // Critical failure if we're out of memory
    if (addr >= PAGE_SIZE) {
      exit(1);
    }

    cur_len = load(addr);

    // Use the current block if it's free and log enough
    if (load(addr + 4) == 1 && cur_len >= len) {
      store(addr + 4, 0);

      // Split the block if it has enough space
      // ... implement this yourself!

      return addr + 8;
    } else {
      // Skip to the next block otherwise
      addr = addr + 8 + cur_len;
    }
  }
}

int main() {
  alloc2_init();
  int p1 = alloc2(12);
  int p2 = alloc2(36);
  free2(p1);
  int p3 = alloc2(12);
  assert(p1 == p3);
  return 0;
}

3. Formal exceptions (20%)

Frustrated with all the bugs in your code, you’ve decided it’s time to petition the WebAssembly committee to add exceptions to WebAssembly in order to simplify error handling. After significant thought, you came up with this novel syntax:

For example, this could be used like:

(func $div (param $m i32) (param $n i32) (result i32)
  (if (i32.eq (get_local $n (i32.const 0)))
    (then (raise (i32.const -1)))
    (else (i32.div_s (get_local $m) (get_local $n))))

(func $main (result i32)
  (try
    (i32.div 5 0)
    (catch
      (i32.const -1))))

To implement exceptions, we are also going to need to add an administrative instruction “raising” that contains the value being raised.

Your task is to provide a semantics for the above constructs that fully define the behavior of raise with respect to the set of features we formalized in class. In English, the semantics should be:

  1. When a raise is encountered, the top value on the stack should be inserted into a raising instruction (1 rule, provided below).
  2. For try/catch blocks:
    1. A try/catch block should attempt to step its body when possible. The try/catch block does not define a new stack—the inner body should step with the outer stack (1 rule).
    2. If the body executes without error, then the try/catch exits without incident (akin to labels) (1 rule).
    3. If the body raises an exception, then the try/catch should enter the catch block with the value of the exception on the top of the current stack (1 rule).
    4. Breaks, returns, and traps should propagate across try/catch blocks (3 rules).
  3. Exceptions should propagate across labels and frames (2 rules, 1 provided below).

For example, here’s two freebie rules that define how raise turns into raising and how raise interacts with labels:

Since we didn’t cover static semantics, you only have to define the dynamics, not the statics, nor do you have to provide a proof of correctness.

Submission

To submit your work for the programming section, upload src/hash.wat to Gradescope.