This post is about my ongoing master’s thesis under Jeremy Avigad at Carnegie Mellon University, in which I’m trying to tackle formal verification of Rust programs in the interactive theorem proverLean, and a first result of the project: a complete verification of the Rust stdlib’s binary search function.

## Putting the ‘Formal’ into ‘Formaldehyde’

Formal Verification is the act of mathematically reasoning about a program’s behavior. While there have been gargantuan verification projects like a verified C compiler or a verified microkernel in the recent past, it is still mostly a research topic and is rarely being applied to programs that have not explicitly been written for verification.

Part of the reason for this is that it’s quite complicated to apply mathematical tools to something unmathematical like a functionally unpure language (which, unfortunately, most programs tend to be written in). In mathematics, you don’t expect a variable to suddenly change its value, and it only gets more complicated when you have pointers to those dang things:

“Dealing with aliasing is one of the key challenges for the verification of imperative programs. For instance, aliases make it difficult to determine which abstractions are potentially affected by a heap update and to determine which locks need to be acquired to avoid data races.”

While there are whole logics focused on trying to tackle these problems, a master’s thesis wouldn’t be nearly enough time to model a formal Rust semantics on top of these, so I opted for a more straightforward solution: Simply make Rust a purely functional language!

## Electrolysis: Simple Verification of Rust Programs via Functional Purification

If you know a bit about Rust, you may have noticed something about that quote in the previous section: There actually are no data races in (safe) Rust, precisely because there is no mutable aliasing. Either all references to some datum are immutable, or there is a single mutable reference. This means that mutability in Rust is much more localized than in most other imperative languages, and that it is sound to replace a destructive update like

p.x += 1

with a functional one – we know there’s no one else around observing p:

let p = Point { x = p.x + 1, ..p };

Likewise, we can turn a mutable borrow like

let x = f(&mut p) // f(&mut Point) -> A

into copying (or moving) the value, and copying it back when the borrow ends:

let (x, p) = f(p) // f(Point) -> (A, Point)

Just these rules are sufficient to make a big part of Rust programs functionally pure. There are obvious (borrowing a field) and non-obvious (returning a mutable reference) extensions to this basic idea, which I may discuss in a future post about what parts of Rust my project currently supports, what it may support in the future, and what it will never support.

I’m certainly not the first one to point out this semantics-preserving transformation – see for example this slide by Alan Jeffrey from his talk at LOLA2016. But what I have used this realization for is to implement a tool that transpiles (a subset of) safe Rust into Lean, a purely functional language, in order to verify Rust programs with the same tools as ordinary definitions in Lean itself. This is what I call ‘functional purification’, and may excuse me reusing the name of an unrelated Firefox project . After all, electrolysis is a chemical process that can be used for the purification of iron oxide (i.e., rust).

## Verifying the Correctness of `std::slice::binary_search`

So much for the theory for now, what does this look like in practice? For a first test, I decided to verify Rust’s binary search function. It may not be the most interesting algorithm, but Rust’s implementation is quite unique in that instead of using two simple indices like everyone else, it uses high-level slice operations that let it circumvent bounds checking without resorting to unsafe code. For my project, high-level means a dozen (indirect) dependencies and some advanced language constructs like traits, closures and pattern matching, which turned out to be a nice test case for the tool.

After about two weeks of working on the proof (on second thought, don’t look at that) and tweaking the transpiler in parallel, I managed to prove the following theorem:

theorem core.slice.binary_search.sem : ∀ {T : Type} [Ord' T] (self : list T) (needle : T), sorted le self → option.any binary_search_res (binary_search self needle)

Hey, don’t run away just yet! Let me step you through it: This theorem says that given

- any type
`T`

- that implements some mysterious type class
`Ord'`

,

- that implements some mysterious type class
- any slice
`self`

- that is sorted according to
`Ord'`

- that is sorted according to
- and object
`needle`

of type`T`

,

calling (the Lean translation of) `binary_search`

- terminates normally (in contrast to
*abnormally*, i.e. panics, or not at all) with a value`some res`

- such that
`binary_search_res res`

is true (that’s what the`option.any`

is for).

The `binary_search_res`

predicate is a 1:1 translation of `binary_search`

’s doc comment :

inductive binary_search_res : Result usize usize → Prop := -- if the value is found then Ok is returned, containing the index of the matching element | found : Πi, nth self i = some needle → binary_search_res (Result.Ok i) -- if the value is not found then Err is returned, containing the index where a matching -- element could be inserted while maintaining sorted order. | not_found : Πi, needle ∉ self → sorted le (insert_at self i needle) → binary_search_res (Result.Err i)

As an aside, if you look at (fully automated) verifications of binary search in other languages, you will find that these often concentrate on the first case, even though `not_found`

is actually the more complex one, and only on the special case of integer arrays at that. This is where, in my view, interactive theorem provers really shine, by being able to use high-level abstractions like `sorted`

or `insert_at`

in specifications for arbitrary user-defined orders.

Which brings me to the last part, the `Ord'`

type class, which combines the Rust `Ord`

trait (that gets translated to a Lean type class) and the Lean `decidable_linear_order`

type class, meaning implementations of the Rust trait must also fulfill the axioms of a linear order. Rust and Lean definitions, working together in harmony!

To conclude, given that the repo contains a machine-checked proof of this theorem, I hope you trust my assertion that `binary_search`

is, indeed, correct! Well, by trusting me, you’re more specifically trusting all of these parts:

- the Rust compiler, whose internal API my tool uses (mostly its excellent mid-level IR )
- Lean itself
- my transpiler
- my prelude of language primitives
- also contains manual translations of functions like
`mem::swap`

the transpiler doesn’t like because they use unsafe code- there’s also one slyly hiding in the config file

- also contains manual translations of functions like

So yes, the project does involve some non-negligible trust base. Still, I would argue that an error in any of these components is far more likely to prevent you from proving something that holds than to enable you to prove something that *doesn’t* hold.

The most likely kind of translation bugs may be not in the basic Rust semantics, but in representing side effects. Indeed, if you like at the prelude, you may notice that currently there’s no modelling of overflows – all integer types are unbounded on the Lean side. I’m still torn on this design issue, but tend to think that overflow checking is a poor fit for the kind of algorithmic verification proofs I’m striving for. It can generate very unintuitive preconditions (‘you can’t call `len()`

on this recursive data structure for more than 2^64 elements’) that may never be dispensed with because they depend on the program’s input. In any case, overflow checking is not very interesting for the implementation at hand because every integer in there is trivially bounded by `self.len()`

.

## Related Works

Reading ‘Rust’ and ‘verification’ in the title, you may first have thought of the RustBelt project at MPI-SWS and wondered how the two projects relate to each other. As far as I know, the cornerstone of RustBelt will be a full formalized semantics of Rust – exactly what I’m trying to avoid to do –, on top of which they will be able to reason about abritrary unsafe Rust code – something that is an explicit non-goal of my project. Thus I feel like the two projects should complement each other quite nicely. At some point, you may even be able to use the RustBelt formalization to prove electrolysis’s purification correct.

## (Speculating about) Future Work

I haven’t decided yet what to prove next, which is the real reason I started writing this post: to get suggestions from the Rust community about code in or outside of `std`

that may be interesting to verify, without using too many dependencies or crazy `unsafe`

tricks. For example, a first candidate may be `sort`

, but that one uses loads of unsafe code and the algorithm isn’t even that interesting compared to implementations in other standard libraries. `binary_search`

proved to be much more well-behaved in this regard.

Instead of avoiding code that does impure `unsafe`

things, another approach would be to extend the transpiler so that it supports at least some well-behaved examples of impurity. Pointers aren’t that bad, for example, as long as their use is localized to a single function. More generally, you may know that Haskell uses monads to tame both local ( `ST`

) and global ( `IO`

) impurity. Well, I’m already using the `option`

monad to model non-termination. Replacing this monad with different ones (or whole monad stacks) may enable the transpiler to express more effects or other kinds of behavior like nondeterminism.

As a first simple example of this, I experimented with hiding a step counter inside the monad to derive a bound on `binary_search`

’s time complexity – which turned out to be slightly more complex if you don’t assume that comparisons will be constant-time and you don’t have a model of Big O notation for multiparametric functions. But let me postpone discussing that to a potential future blog post instead of making this one even longer. Last time I checked, my university wasn’t quite ready to accept theses in blog form yet.