CS 242: Smart pointers

Now that we’ve covered the two core unique features of Rust, borrow checking and traits, the next two weeks are going to look at the many ways in which these features interact to produce new styles of type-safe systems programming. Today, we’re going to look at how we can bypass the borrow checker to implement different styles of memory management.

Raw pointers

Although not frequently used, Rust has primitive types for raw pointers written as *const T and *mut T. These are exactly like pointers in C, in that they are literally just memory addresses with no checks on memory safety. For most Rust users, these types are useful when writing interfaces out to non-Rust code, but for us they will serve as the building blocks for customized memory management.

For starters, you can turn any reference (compiler-managed pointer) into a raw pointer (unmanaged pointer) by coercing the types.

let x: i32 = 1;
let xptr: &i32 = &x;
let xraw: *const i32 = xptr as *const i32;

let mut y: i32 = 2;
let yptr: &mut i32 = &mut y;
let yraw: *mut i32 = yptr as *mut i32;
let yraw2: *mut i32 = yptr as *mut i32;

Wait a second. This looks like we have multiple mutable pointers to the same piece of data! That should be disallowed by the compiler, but the above is a totally legitimate program. The reason becomes more apparent when we attempt to use these pointers:

println!("{}", unsafe { *yraw });
unsafe { *yraw = 3; }

In Rust, raw pointers are allowed to violate all norms of memory safety, but subsequently any usage of them must be wrapped in an unsafe block, as above. This doesn’t “turn off the borrow checker” so much as say “raw pointers are allowed to be used here.”

Boxes

To see how we can use raw pointers for memory management, let’s first consider the most basic distinction for memory: stack vs. heap. In Rust, by default, all objects live on the stack. No matter how complicated the type, it will live on the stack.

struct Foo { a: i32, b: bool, c: &'static str }
enum Bar { A, B(i32) }

fn main() {
  // All stack allocated!
  let foo = Foo { a: 0, b: true, c: "Hello world" };
  let bar = Bar::B(3);
  let bar2 = &bar;
}

This is just like C, which has the same semantics for stack allocation. The difference is that in C, in order to allocate on the heap, you write explicit calls to malloc (and subsequent calls to free). In Rust, heap allocation is handled by Box. Box is a struct that takes a value, allocates space for it on the heap using malloc, and then automatically frees it when it goes out of scope.

fn main() {
  let x: i32 = 2;
  let y: Box<i32> = Box::new(x);
  println!("{}", *y);
  // y is freed when it exits
}

Using raw pointers and a few handy traits, we can implement a simplified Box ourselves.

use std::alloc::{alloc, Layout};
use std::{ptr, mem};

struct Box<T> {
  ptr: *mut T
}

impl<T> Box<T> {
  fn new(t: T) -> Box<T> {
    let ptr = unsafe {
      let layout = Layout::from_value(&t);
      let mut ptr = alloc(layout) as *mut T;
      ptr::copy(&t as *const T, ptr, layout.size());
      ptr
    };
    mem::forget(t);
    Box { ptr }
  }
}

First, we declare a box to be a wrapper around a raw mutable pointer to an arbitrary type T. To create a box, we take a Rust-owned value and allocate a piece of memory the same size. We us mem::size_of along with alloc and Layout, which is Rust’s slightly safer wrapper around the underlying memory allocator (technically not malloc but jemalloc). Then we ptr::copy (equivalent to memcpy) the value onto the heap. The mem::forget says “don’t try to call destructors on t, just forget about it.”¹ Lastly, we return a box of the allocated pointer.

In sum, this has now moved the provided object into memory, but we still need to support two operations: dereferencing and deallocating. For this, we return to traits. In Rust, there are several built-in traits that serve to overload certain operations. For example, the Add trait overloads the + operator. For example (from the docs):

use std::ops::Add;

#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Point;

    fn add(self, other: Point) -> Point {
        Point {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

assert_eq!(Point { x: 1, y: 0 } + Point { x: 2, y: 3 },
           Point { x: 3, y: 3 });

For memory management, we have the Deref which allows us to overload the * operator. This allows us a dereference on a box to be “passed through” to the inner value, implemented like this:

impl<T> Deref for Box<T> {
  type Target = T;

  fn deref(&self) -> &T {
    unsafe { &*self.ptr }
  }
}

The expression &*self.ptr looks a little wonky. Shouldn’t &*x == x for all such pointers? Here, the expression means: dereference the raw pointer, then make a safe pointer (reference) to that raw value. For example:

let x: i32 = 1;
let xbox: Box<i32> = Box::new(x);
let xptr: &i32 = xbox.deref();
println!("{}, {}", *xbox, *(xbox.deref())); // These mean the same thing!
                                            // Compiler auto-inserts .deref()

In a complete implementation for box, we would also override DerefMut for mutable references. Confusingly, this implementation of the deref trait also makes references work:

let x: i32 = 1;
let xbox: Box<i32> = Box::new(x);
let xptr: &i32 = &xbox; // NOT &Box<i32>
println!("{}", *xptr);

This interaction, amongst the many strange behaviors involved in the question “when I do dereference or not?”, are due to deref coercion. This is a policy in the Rust compiler that attempts to minimize the number of explicit dereferences required when dealing with pointers, as detailed in the docs and in The Book. Generally it means: you rarely need to dereference pointers explicitly, and taking references does the “right thing”, e.g. &xbox becoming &i32 and not &Box<i32>. You don’t need to memorize these rules, but I want to raise your awareness of them so you can form a consistent mental model of pointer usage in Rust.

Lastly, we need to implement a destructor, or a piece of code that can deallocate our raw pointer once the box is no longer used. For this, we use the Drop trait. Similar to destructors in object-oriented languages, objects can run custom code right before they are deallocated by the compiler. For our box:

impl<T> Drop for Box<T> {
  fn drop(&mut self) {
    unsafe {
      let layout = Layout::from_value(&*self.ptr);
      mem::drop(ptr::read(self.ptr));
      dealloc(self.ptr as *mut u8, layout);
    }
  }
}

Here, we use dealloc (equivalent to free) to free the inner ptr, and the Rust compiler takes care of destroying the Box struct itself. And that’s it! Now we have a barebones implementation of a smart pointer, or a pointer wrapper with extra intelligence to handle allocation strategies, dereferences, and frees.

Observe that although Box presents a safe interface, it has an unsafe implementation. This concept arises from time to time when using Rust to work with low-level system primitives like raw memory, and is a critical part of good system design—clearly identify the “trapdoors” out of your type system with unsafe code, and then verify that the safe code is consistent with ownership rules assuming the unsafe code doesn’t violate ownership. In practice, a small number of highly scrutinized libraries distribute unsafe functions, and library users can avoid any usage of unsafe in their own code.

Reference counting

Although boxes required unsafe memory management in order to transition from the stack to the heap, their usage is still essentially the same as owned values. A value of type Box<T> has one variable owning the T, which is the the variable that owns the box. Move semantics still apply, so transferring ownership of the box works as expected:

let b: Box<i32> = Box::new(1);
let b2 = b;
// println!("{}", *b); <- invalid! b has moved
let b3 = &b2;
// let b4 = b2; <- invalid! can't move b2 while borrow is active

One limitation of ownership is that when we have multiple references to a particular value, the references cannot outlive the owner. This is the dangling pointer issue—I can’t return a pointer to a stack-allocated value, because the reference would live longer than the owner’s function scope. However, it can be convenient to describe pointer networks where it is unknown at compile time who will contain the last reference to a particular value.

This is fundamentally opposed to ownership though, which says that we must known at compile time exactly when an element will be destructed (i.e. when its owner goes out of scope without transferring ownership). Instead, we can create a new smart pointer that implements a runtime-checked destructor as a reference-counted pointer. A reference count is a track of how many variables have access to a particular value, and we destruct the value when the count reaches 0. Rust has this pointer type called Rc. The desired interface to an Rc pointer works like this:

fn make_rc() -> Rc<String> {
  let s1: Rc<String> = Rc::new(String::from("Hello"));
  let s2: Rc<String> = s1.clone(); // increment refcount, don't actually copy the string
  s2
  // decrement refcount when s1 dies
}

fn main() {
  let s2 = make_rc();
  // decrement refcount when s dies, and destroy the string
}

s1 and s2 should refer to the same string in memory, but even though s1 created it originally, the string only disappears once the last referent, s2, expires. Let’s see if we can implement this ourselves! First, we need a new method.

struct Rc<T> {
  count: *mut i32,
  value: *mut T
}

impl<T> Rc<T> {
  fn new(t: T) -> Rc<T> {
    Rc {
      count: Box::into_raw(Box::new(1)),
      value: Box::into_raw(Box::new(t))
    }
  }
}

This time, an Rc is both a raw pointer to the data itself and a raw pointer to the reference count. Rather than manually calling alloc and Layout like with Box, we can conveniently call Box::into_raw to heap-allocate and then convert a Rust-managed heap pointer into an unmanaged raw heap pointer.

This time, we need to introduce another key trait: Clone. We want to override the behavior of clone to not actually deep copy the object, but instead return the same pointers but increment a reference count.

impl<T> Clone for Rc<T> {
  fn clone(&self) -> Rc<T> {
    unsafe {
      *self.count += 1;
    }

    Rc { count: self.count, value: self.value }
  }
}

Conversely, we have to define Drop for decrementing the reference count when any given reference dies.

impl<T> Drop for Rc<T> {
  fn drop(&mut self) {
    unsafe {
      *self.count -= 1;
      if *self.count == 0 {
        drop(Box::from_raw(self.value));
        drop(Box::from_raw(self.count));
      }
    }
  }
}

Note that in order to destruct the value, we reconstruct the box with Box::from_raw, then immediately drop the newly minted box. This enables Rust to run its normal drop checks on the value in the box, i.e. ensuring the value in side the Rc pointer has its contents destructed. Lastly, we define the “pass-through” dereference the same as for Box:

impl<T> Deref for Rc<T> {
  type Target = T;

  fn deref(&self) -> &T {
    unsafe { &*self.value }
  }
}

Note that this time, we cannot implement DerefMut for Rc. That would permit code like this:

let x: Rc<i32> = Rc::new(1);
let y: Rc<i32> = x.clone();
let xptr: &mut i32 = &mut x;
let yptr: &mut i32 = &mut y;
// Bad! Two mutable references to the memory at the same time!
*x = 2;
*y = 3;

Hence, Rc only permits creating immutable references to its contents. And that’s it! A barebones implementation of a reference counted smart pointer.

Reference cell

It would be very inconvenient if Rc could only ever wrap immutable data. What if I want multiple places to have ownership over a shared value with both read and write access? That could work within our no-mutable-aliases rules if we could guarantee, somehow, that only one owner will be mutating the object at any given time. Rust’s ownership system won’t help us here, since each Rc pointer clone has full ownership of the underlying value.

Instead, we can define a final smart pointer that allows us to defer mutability checks from compile time to run time, incurring overhead but ensuring safety. That, in Rust, is a reference cell, or RefCell. RefCell has an interface like this:

let x: RefCell<i32> = RefCell::new(1);
{
  let xptr1: Ref<'_, i32> = x.try_borrow().unwrap();
  let xptr2: Ref<'_, i32> = x.try_borrow().unwrap();
  assert_eq!(*xptr1, 1);
  assert_eq!(*xptr2, 1);

  // If we have any immutable borrows active, we can't have a mutable borrow
  assert!(match x.try_borrow_mut() { Err(_) => true, Ok(_) => false });
}
{
  let mut xmut: RefMut<'_, i32> = x.try_borrow_mut().unwrap();
  *xmut = 3;

  // If we have a single mutable borrow active, no further borrows are permitted
  assert!(match x.try_borrow_mut() { Err(_) => true, Ok(_) => false });
  assert!(match x.try_borrow() { Err(_) => true, Ok(_) => false });
}

We’re not going to implement RefCell here because it’s a little too complex for this lecture. The core complication comes from the issue of: when someone borrows a reference cell (mutably or immutably), the cell needs to be notified when the borrow expires. If x.try_borrow() directly returned an &i32, there would be no way for the cell to know at runtime when &i32 goes out of scope. Instead, we use these Ref and RefMut structures that represent a reference to a value in a refcell, but whose drop methods will update the refcell when they expire.

Combining the Rc and the RefCell, this now allows us to obtain our original goal of deferring both destructor checks (through reference counting) and borrow checks (through reference cells) to runtime.

let x1 = Rc::new(RefCell::new(1));
let x2 = x1.clone();

// Temporary mutable borrow
{
  let x1mut = x1.borrow_mut();
  *x1mut = 2;
}

// Immutable borrow allows because `x1` borrow is ended.
println!("{}", x2.borrow());

In sum, we saw how smart pointers allow us to bypass the borrow checker to:

Move values onto the heap with Box.
Defer destruction to runtime with Rc.
Defer borrow checks to runtime with RefCell.

You can find these and other pointer types are summarized in the Rust container cheat sheet.

If our Box<T> contains a type that has pointers somewhere else, e.g. a Box<Box<i32>>, we don’t want to memcpy the box bits but then have Rust still destruct the original Box<i32>, meaning our memcpy’d version now points to invalid data. Here, mem::forget ensures that the destructors don’t run until we call them in Box::drop. ↩

Sections

Raw pointers

Boxes

Reference counting

Reference cell