Paper review for –

seL4: Formal Verification of an OS Kernel

daveti@cs.uoregon.edu

Dec 10, 2013

How could we guarantee the code we are running is right? Even after different and long-time testing, we have yet not been so sure that there is no bug in the code. Moving forward, we may want to know if the OS kernel is behaving correctly as expected, as all other soft artifacts are based on the OS. Then we are inspired by the method widely used by the hardware design and verification – formal verification.

Formal verification is based on FSM. All the system states are modeled and the corresponding state transitions are analyzed. Temporal Logic (TL) or Computational Tree Logic (CTL) could be used to construct the model. As long as there is a prover for the target logic language, then there is a way to know if the whole system has some defects. Seems we have got the powerful tools we need, so what? Are we able to do formal verification for the Linux kernel then? The answer is no. The reason is straightforward, there is no way to describe a system based on FSM with more than 4 million lines of code (linux kernel 3.x).

Then what is the next? Apparently, we need to find a small OS kernel, which does not have so many crazy lines of code – seL4, a microkernel based on L4, with 8,700 lines of C code, becomes the natural selection. However, before we could start our formal verification for the seL4, at the very first, we have to give bug a formal definition. According to the paper, bug = deviation from the design. The definition is straightforward but implies an important prerequisite – the design should be correct! Then it starts clear now. To do a formal verification for the OS kernel, we have to prove that the design of the kernel is sound. How to do that? Still formal verification. A certain logic language would be used to depict the high level design of the kernel and the corresponding prover will be used to prove the design has not flaw.

What about the implementation? It seems we are missing something important. Even though we could assure that the design is right, how could we make sure –

1. the implementation is right and

2. the implementation fulfills/follows the design?

All the two points need some ‘golden value’ for reference and comparison. The high level design, however, is too abstract to connect with the real implementation. Actually, during the phase of developing a new OS kernel, there is another important document – specification (spec), which is used to instruct the real implementation. Theoretically, spec should reveal all the technical details about the implementation and the real implementation should strictly follow the spec on the other hand. Now, there are high level design (hld), spec and implementation (impl). Now what?

Here comes the idea/concept of refinement from the formal verification. In formal methods, program refinement is the verifiable transformation of an abstract (high-level) formal specification into a concrete (low-level) executable program. (from wikipedia). In a word, what is refinement and how could we define A refines B?

If for each transitions from s to s’ in M2, there exists the corresponding transition from sigma to sigma’ in M1 (The transitions correspond if there exists a relation R between the states s and σ such that for each concrete state in s′ there is an abstract one in σ′ that makes R hold between them again – from the paper). Then M2 refines M1.

The whole picture of doing formal verification is becoming clear. If imll refines spec && spec refines hld, then impl refines hld, because of transitive property of refinement. If we could do the formal verification for each layer (hld, spec, impl), which means we should have all the state transitions fro each layer, it should be a labor effort to verify the refinement. (NOTE, the paper did not talk about how to do the refinement but I doubt this may need human effort.) Anyway, let us start now.

hld

As Isabelle is used as the general prover, hld is written in high order logic (HOL) directly. Remember here hld just provides the high level behaviors of the kernel. There is no implementation details involved there. The hld for the kernel scheduler looks like below:

spec

The most important contribution of the paper would be way to do formal verification for this layer. A subset of Haskell is used to write the spec. Moreover, the spec has provided all the implementation details for the real implementation. You can imagine we have already implemented the kernel using Haskell (crazy, right?). Moving on, all the Haskell code for the spec could be then translated into HOL, which is provable by Isabelle. The spec for the kernel scheduler looks like below:

impl

The most interesting part of the paper would be the way to do formal verification for this layer. As GCC is not trusted and we have proved that spec is ‘perfect’ (being provable by Isabelle and providing all the implementation details in the Haskell code), is there a way to generate the C code from the Haskell code directly and automatically? Yes, of course. For details, please refer to the paper ‘D. Cock. Bitfields and tagged unions in C: Verification through automatic generation’. This amazing tool takes the spec as the input and generate the C code as the output. More important, it also generate the corresponding HOL code for proofs! The generated impl C code for the part of the kernel scheduler looks like below:

statistics

Below is the statistics for the whole formal verification for the OS kernel.

In summary, though we have spent only a short review to cover all the essence of this paper, one can imagine how difficult it would be to formal verification even for the small OS kernel with only 8,700 lines of code. The whole process has lasted for around 4 years with more than 10 developers devoted themselves into this project. Again, however, the take away here is not how hard it would be to do the formal verification for the OS kernel but shows that it is possible to do the formal verification. The whole process of the formal verification, the language and the tools developed along the way, would provide us some insightful ideas for what to be done next.