3,10 → 3,10 |
|
\title{A Road to a Formally Verified General-Purpose Operating System} |
\author{Martin D\v{e}ck\'{y}} |
\institute{Department of Software Engineering\\ |
\institute{Department of Distributed and Dependable Systems\\ |
Faculty of Mathematics and Physics, Charles University\\ |
Malostransk\'{e} n\'{a}m\v{e}st\'{i} 25, Prague 1, 118~00, Czech Republic\\ |
\email{martin.decky@dsrg.mff.cuni.cz}} |
\email{martin.decky@d3s.mff.cuni.cz}} |
|
\begin{document} |
\maketitle |
15,119 → 15,105 |
Methods of formal description and verification represent a viable way for achieving |
fundamentally bug-free software. However, in reality only a small subset of the existing operating |
systems were ever formally verified, despite the fact that an operating system is a critical part |
of almost any other software system. This paper summarizes the challenges involved in formal verification |
of operating systems, points out several key design choices which should make the formal verification |
easier and presents a work-in-progress and initial experiences with formal verification of HelenOS, |
a state-of-the-art microkernel-based operating system, which, however, was not designed specifically |
with formal verification in mind, as this is mostly prohibitive due to time and budget constrains. |
of almost any other software system. This paper points out several key design choices which |
should make the formal verification of an operating system easier and presents a work-in-progress |
and initial experiences with formal verification of HelenOS, a state-of-the-art microkernel-based |
operating system, which, however, was not designed specifically with formal verification in mind, |
as this is mostly prohibitive due to time and budget constrains. |
|
The contribution of this paper is the shift of focus from attempts to use a single ``silver-bullet'' |
formal verification method which would be able to verify everything to a combination of multiple |
formalisms and techniques to cover various aspects of the verification process. The formally verified |
operating system is the emerging property of the combination. |
formalisms and techniques to successfully cover various aspects of the operating system. |
A reliable and dependable operating system is the emerging property of the combination, |
thanks to the suitable architecture of the operating system. |
\end{abstract} |
|
\section{Introduction} |
\label{introduction} |
In the context of formal verification of software, it is always important to model the outer |
environment with a proper level of abstraction. Weak assumptions on the environment make the formal |
verification less feasible, because they yield many degrees of freedom which render |
the properties which we want to verify overly complex. But strong assumptions on the environment are likewise |
not without a price: Any practical usability of the verification results depends on the question |
whether we are really able to create a physical environment which satisfies the assumptions. |
Operating systems (OSes for short) have a somewhat special position among all software. |
OSes are usually designed to run on bare hardware. This means that they do not require |
any special assumptions on the environment except the assumptions on the properties and |
behavior of hardware. In many cases it is perfectly valid to consider the hardware |
as \emph{idealized hardware} (zero mathematical probability of failure, perfect compiance |
with the specifications, etc.). This means that it is solely the OS that defines the |
environment for other software. |
|
There are some assumptions which are universal for any formal verification method. For the sake of |
simplicity let us call them \emph{idealized hardware}. Let us also cover the computational model |
and basic mathematical principles by this term. |
OSes represent the lowest software layer and provide services to essentially all other |
software. Considering the principle of recursion, the properties of an OS form the |
assumptions for the upper layers of software. Thus the dependability of end-user and |
enterprise software systems is always limited by the dependability of the OS. |
|
What is \emph{idealized hardware}? We usually assume that the software is executed by a ``perfect'' hardware |
which works exactly according to its specification. Every divergence between the intended and actual |
behavior of software counts as a software bug and never as an issue in some hardware component. We |
implicitly assume that the hardware is perfect in the sense of physical realization and all probabilities |
of accidental mechanical and electrical failures are expected to be zero. |
Finally, OSes are non-trivial software on their own. The way they are generally designed |
and programmed (spanning both the kernel and user mode, manipulating execution contexts |
and concurrency, handling critical hardware-related operations) represent significant |
and interesting challenges for software analysis. |
|
It is trivial to see that the \emph{idealized hardware} assumptions can never hold in real world. |
Accidental mechanical and electrical failures are inevitable in principle and their probability can |
never be a mathematical zero. Does this mean that any formal verification method is ultimately futile |
since we are unable to ensure validity of the most elementary assumptions? Probably not. We just need |
to be aware about the inherent and fundamental limitations of the formal methods which result from |
the shortcomings of the physical world. |
|
The more precisely we model the software environment of the software under discussion, the more precisely |
we can calculate or estimate the resulting impact of the imperfections of the physical world on our software |
system. This is the most important motivation for formal reasoning about the correctness of the operating system. |
|
\medskip |
|
In more detail, operating systems have a somewhat special position among all software: |
These are probably the most important reasons that led to several research initiatives |
in the recent years which target the creation of a formally verified OSes from scratch |
(e.g. \cite{seL4}). Methods of formal description and verification provide fundamentally |
better guarantees of desirable properties than non-exhaustive engineering methods such |
as testing. |
|
\begin{itemize} |
\item \emph{Operating systems are usually designed to run on bare hardware.} This means that except |
the \emph{idealized hardware} assumptions we do not have to take any or almost any extra assumptions |
into account. |
\item \emph{Operating systems create the lowest software layer and provide services to essentially |
all other software.} Considering the principle of recursion, the properties of the |
operating systems which we prove or disprove form the assumptions for the upper layers |
of software. Thus the dependability of end-user and enterprise software systems is limited |
by the dependability of the operating system. |
\item \emph{Operating systems are non-trivial software on their own.} The way |
they are generally designed and programmed (spanning both the kernel and user mode, |
manipulating execution contexts and concurrency, handling critical hardware-related |
operations) represent significant and interesting challenges for the formal verification |
methods and tools. |
\end{itemize} |
However, 98~\%\footnote{98~\% of client computers connected to the Internet as of January |
2010~\cite{marketshare}.} of the market share of general-purpose OSes is taken |
by Windows, Mac~OS~X and Linux. These systems were clearly not designed with formal |
verification in mind from the very beginning. The situation on the embedded, real-time |
and special-purpose OSes market is probably different, but it is unlikely that the |
segmentation of the desktop and server OSes market is going to change very rapidly |
in the near future. |
|
Even in the informal understanding, the dependability of an operating system greatly determines the perceived |
dependability of the entire software stack. This led to several research initiatives in the recent years |
which target the creation of a formally verified operating systems from scratch~\cite{seL4}. |
The architecture of these major desktop and server OSes is monolithic, which makes |
any attempts to do formal verification on them extremely challenging due to the large |
state space. Fortunatelly we can observe that aspects of several novel approaches from |
the OS research from the late 1980s and early 1990s (microkernel design, user space |
file system and device drivers, etc.) are slowly emerging in these originally purely |
monolithic implementations. |
|
However, 98~\%\footnote{98~\% of client computers connected to the Internet as of January 2010~\cite{marketshare}.} of |
the market share of general-purpose operating systems is taken by Windows, Mac~OS~X and Linux. |
These systems were clearly not designed with formal verification in mind from the very beginning. |
The situation on the embedded, real-time and special-purpose operating systems market is probably different, |
but it is unlikely that the situation in the large domain of desktops and servers is going to change |
very rapidly in the near future. |
\medskip |
|
Therefore we need to face the challenges of applying formal methods on an existing code base in the domain |
of general-purpose operating systems. Fortunately, the software design qualities of the general-purpose |
operating systems gradually improve over time. We can see then novel approaches in the operating systems |
research from the late 1980s and 1990s (microkernel design, user space file system and device drivers, etc.) |
to slowly emerge in the originally monolithic operating systems. We can also see better code quality thanks |
to improved software engineering (code review, proper escalation management, etc.). |
In this paper we show how specific design choices can markedly improve the feasibility |
of verification of an OS, even if the particular OS was not designed |
specifically with formal verification in mind. These design choices can be gradually |
introduced (and in fact some of them have already been introduced) to mainstream |
general-purpose OSes. |
|
\medskip |
Our approach is not based on using a single ``silver-bullet'' formalism, methodology or |
tool, but on combining various enginering, semi-formal and formal approaches. |
While the lesser formal approaches give lesser guarantees, they can complement |
the formal approaches on their boundaries and increase the coverage of the set of |
all hypothetical interesting properties of the system. |
|
This paper proposes an approach and presents a work-in-progress case study of formal verification |
of an general-purpose research operating system, which was also not created specifically with formal |
verification in mind from the very beginning, but it was designed according to state-of-the-art operating systems |
principles. |
We also demonstrate work-in-progress case study of an general-purpose research OS |
that was not created specifically with formal verification in mind from the very |
beginning, but that was designed according to state-of-the-art OS principles. |
|
\medskip |
|
\noindent\textbf{Structure of the Paper.} In Section \ref{context} we introduce the case study in more detail and explain |
why we believe it is relevant. In Section \ref{analysis} we discuss our proposal of the combination of formal methods |
and tools. In Section \ref{evaluation} we present the preliminary evaluation of our efforts and estimate the complexity |
of the imminent next steps we want to take according to our proposal. Finally, in Section \ref{conclusion} |
we present the conclusion of the paper. |
\noindent\textbf{Structure of the Paper.} In Section \ref{design} we introduce |
the design choices and our case study in more detail. In Section \ref{analysis} we |
discuss our approach of the combination of methods and tools. In Section \ref{evaluation} |
we present a preliminary evaluation of our efforts and propose the imminent next steps |
to take. Finally, in Section \ref{conclusion} we present the conclusion of the paper. |
|
\section{Context} |
\label{context} |
Two very common schemes of operating system design are \emph{monolithic design} and \emph{microkernel design}. |
Without going into much detail of any specific operating system, we can define the monolithic design as |
a preference to put numerous aspects of the core operating system functionality into the kernel, |
while microkernel design is a preference to keep the kernel small, with just a minimal set of features. |
\section{Operating Systems Design} |
\label{design} |
Two very common schemes of OS design are \emph{monolithic design} and \emph{microkernel design}. |
Without going into much detail of any specific implementation, we can define the monolithic design as |
a preference to put numerous aspects of the core OS functionality into the kernel, while microkernel |
design is a preference to keep the kernel small, with just a minimal set of features. |
|
The features which are missing from the kernel in the microkernel design are implemented in user space, usually |
by means of libraries, servers and/or software components. |
The features which are missing from the kernel in the microkernel design are implemented |
in user space, usually by means of libraries, servers (e.g. processes/tasks) and/or software components. |
|
\subsection{HelenOS} |
\label{helenos} |
\emph{HelenOS} is a general-purpose research operating system which is being developed at Charles |
\emph{HelenOS} is a general-purpose research OS which is being developed at Charles |
University in Prague. The source code is available under the BSD open source license and can be |
freely downloaded from the project web site~\cite{helenos}. The authors of the code base are |
both from the academia and from the open source community (several contributors are employed |
as Solaris kernel developers and many are freelance IT professionals). We consistently strive to support |
the research and also the practical motivations for developing the system. |
as Solaris kernel developers and many are freelance IT professionals). |
|
HelenOS uses a preemptive priority-feedback scheduler, it supports SMP hardware and it is |
designed to be highly portable. Currently it runs on 7 distinct hardware architectures, including the |
139,80 → 125,73 |
no scientific value), the essential foundations such as file system support and TCP/IP networking |
are already in place. |
|
HelenOS does not currently target embedded devices (although the ARMv7 port can be easily modified to |
run on various embedded boards) and does not implement real-time scheduling and synchronization. |
Many development projects such as task snapshoting and migration, generic device driver |
framework, support for MMU-less platforms and performance monitoring are currently underway. |
HelenOS does not currently target embedded devices (although the ARMv7 port can be very easily |
modified to run on various embedded boards) and does not implement real-time features. |
Many development projects such as task snapshoting and migration, support for MMU-less |
platforms and performance monitoring are currently underway. |
|
\medskip |
|
HelenOS has a microkernel multiserver design, but the guiding principles of the HelenOS design are |
actually more elaborate: |
HelenOS can be briefly described as microkernel multiserver design. However, the actual design |
guiding principles of the HelenOS are more elaborate: |
|
\begin{itemize} |
\item \emph{(Microkernel principle)} Every functionality of the operating system that does not |
\begin{description} |
\item[Microkernel principle] Every functionality of the OS that does not |
have to be necessary implemented in the kernel should be implemented in user space. This |
implies that subsystems such as the file system, device drivers (except those which are |
essential for the basic kernel functionality), naming and trading services, networking, |
human interface and similar features should be implemented in user space. |
\item \emph{(Full-fledged principle)} Features which need to be implemented in kernel should |
be designed with full-fledged algorithms and data structures. In contrast |
to several other microkernel operating systems, where the authors have deliberately chosen |
\item[Full-fledged principle] Features which need to be placed in kernel should |
be implemented by full-fledged algorithms and data structures. In contrast |
to several other microkernel OSes, where the authors have deliberately chosen |
the most simplistic approach (static memory allocation, na\"{\i}ve algorithms, simple data |
structures), HelenOS microkernel tries to use the most advanced and suitable means available. |
It contains features such as AVL and B+ trees, hashing tables, SLAB memory allocator, multiple |
in-kernel synchronization primitives, fine-grained locking and so on. |
\item \emph{(Multiserver principle)} Subsystems in user space should be decomposed with the smallest |
\item[Multiserver principle] Subsystems in user space should be decomposed with the smallest |
reasonable granularity. Each unit of decomposition should be encapsulated in a separate task. |
The tasks represent software components with isolated address spaces. From the design point of |
view the kernel can be seen as a separate component, too. |
\item \emph{(Split of mechanism and policy)} The kernel should only provide low-level mechanisms, |
\item[Split of mechanism and policy] The kernel should only provide low-level mechanisms, |
while the high-level policies which are built upon these mechanisms should be defined in |
user space. This also implies that the policies should be easily replaceable while keeping |
the low-level mechanisms intact. |
\item \emph{(Encapsulation principle)} The communication between the tasks/components should be |
\item[Encapsulation principle] The communication between the tasks/components should be |
implemented only via a set of well-defined interfaces. In the user-to-user case the preferred |
communication mechanism is HelenOS IPC, which provides reasonable mix of abstraction and |
performance (RPC-like primitives combined with implicit memory sharing for large data |
transfers). In case of synchronous user-to-kernel communication the usual syscalls are used. |
HelenOS IPC is used again for asynchronous kernel-to-user communication. |
\item \emph{(Portability principle)} The design and implementation should always maintain a high |
\item[Portability principle] The design and implementation should always maintain a high |
level of platform neutrality and portability. Platform-specific code in the kernel, core |
libraries and tasks implementing low-level functionality (e.g. device drivers) should be |
clearly separated from the generic code (either by component decomposition, naming and trading, |
or at least by internal compile-time APIs). |
\end{itemize} |
libraries and tasks implementing device drivers should be clearly separated from the |
generic code (either by component decomposition or at least by internal compile-time APIs). |
\end{description} |
|
As these design guiding principles suggest, the size of the HelenOS microkernel is considerably larger |
compared to ``scrupulous'' microkernel implementations. The average footprint of the kernel ranges from |
569~KiB when all logging messages, asserts, symbol resolution and debugging kernel console are compiled |
in, down to 198~KiB for a non-debugging production build. We do not believe that the raw size |
of the microkernel is a relevant quality criterion per se, without taking the actual feature set |
into account. |
In Section \ref{analysis} we argue that several of these design principles significantly improve |
the feasibility of formal verification of the entire system. On the other hand, other design principles |
induce new interesting challenges for formal description and verification. |
|
\medskip |
|
The run-time architecture of HelenOS is inherently dynamic. The bindings between the components are |
not created at compile-time, but during the bootstrap process and can be modified to a large degree |
also during normal operation mode of the system (via human interaction and external events). This |
creates particularly interesting challenges for describing the design of the system by many formalisms. |
not created at compile-time, but during bootstrap and can be modified to a large degree also during |
normal operation mode of the system (via human interaction and external events). |
|
\medskip |
The design of the ubiquitous HelenOS IPC mechanism and the associated threading model present |
the possibility to significantly reduce the size of the state space which needs to be explored |
by formal verification tools, but at the same time it is quite hard to express these |
constrains in many formalisms. The IPC is inherently asynchronous with constant message buffers |
in the kernel and dynamic buffers in user space. It uses the notions of uni-directional bindings, |
mandatory pairing of requests and replies, binding establishment and abolishment handshakes, |
memory sharing and fast message forwarding. |
|
Yet another set of obstacles for reasoning about the properties of HelenOS lies in the design of |
the ubiquitous HelenOS IPC mechanism and the associated threading model. The IPC is inherently |
asynchronous with constant message buffers in the kernel and dynamic buffers in user space. |
It uses the notions of uni-directional bindings, mandatory pairing of requests and replies, |
binding establishment and abolishment handshakes, memory sharing and fast message forwarding. |
|
For easier management of the asynchronous messages and the possibility to process multiple |
messages from different peers without the usual kernel threading overhead, the core user space |
library manages the execution flow by so-called \emph{fibrils}. A fibril is a user-space-managed thread with |
cooperative scheduling. A different fibril is scheduled every time the current fibril is |
about to be blocked while sending out IPC requests (because the kernel buffers of the addressee |
library manages the execution flow by so-called \emph{fibrils}. A fibril is a user-space-managed |
thread with cooperative scheduling. A different fibril is scheduled every time the current fibril |
is about to be blocked while sending out IPC requests (because the kernel buffers of the addressee |
are full) or while waiting on an IPC reply. This allows different execution flows within the |
same thread to process multiple requests and replies. To safeguard proper sequencing |
of IPC messages and provide synchronization, special fibril-aware synchronization primitives |
same thread to process multiple requests and replies. To safeguard proper sequencing of IPC |
messages and provide synchronization, special fibril-aware synchronization primitives |
(mutexes, condition variables, etc.) are available. |
|
Because of the cooperative nature of fibrils, they might cause severe performance under-utilization |
224,15 → 203,24 |
|
\medskip |
|
Incidentally, the \emph{full-fledged principle} causes that the size of the HelenOS microkernel is |
considerably larger compared to other ``scrupulous'' microkernel implementations. The average |
footprint of the kernel on IA-32 ranges from 569~KiB when all logging messages, asserts, symbol |
resolution and the debugging kernel console are compiled in, down to 198~KiB for a non-debugging |
production build. But we do not believe that the raw size of the microkernel is a relevant quality |
criterion per se, without taking the actual feature set into account. |
|
\medskip |
|
To sum up, the choice of HelenOS as our case study is based on the fact that it was not designed |
in advance with formal verification in mind. This is similar to most general-purpose operating |
systems in common use. At the same time, it does not have an obsolete design and is non-trivial. |
in advance with formal verification in mind (some of the design principles might be beneficial, |
but others might be disadvantageous), but the design of HelenOS is also non-trivial and not obsolete. |
|
\subsection{The C Programming Language} |
A large majority of operating systems is coded in the C programming language. HelenOS is no exception |
to this. The choice of C in the case of kernel is usually well-motivated -- the language was designed |
specifically for implementing system software~\cite{c}. It is reasonably low-level in the sense that it allows |
to access the memory and other hardware resources with similar effectiveness as from assembler. |
A large majority of OSes is coded in the C programming language (HelenOS is no exception |
to this). The choice of C in the case of kernel is usually well-motivated, since the C language was designed |
specifically for implementing system software~\cite{c}: It is reasonably low-level in the sense that it allows |
to access the memory and other hardware resources with similar effectiveness as from assembler; |
It also requires almost no run-time support and it exports many features of the von Neumann hardware |
architecture to the programmer in a very straightforward, but still relatively portable way. |
|
239,7 → 227,7 |
However, what is the biggest advantage of C in terms of run-time performance is also the biggest weakness |
for formal reasoning. The permissive memory access model of C, the lack of any reference safety |
enforcement, the weak type system and generally little semantic information in the code -- all these |
properties do not allow to make many general assumptions about the code. |
properties that do not allow to make many general assumptions about the code. |
|
Programming languages which target controlled environments such as Java or C\(\sharp\) are |
generally easier for formal reasoning because they provide a well-known set of primitives |
250,12 → 238,12 |
|
\medskip |
|
The reasons for using C in the user space of HelenOS (and other established operating systems) is |
probably much more questionable. Except for the core libraries and services (such as device drivers), |
C might be easily replaced by any high-level and perhaps even non-imperative programming language. |
The reasons for using C in this context is mostly historical. |
The reasons for frequent use of C in the user space of many established OSes (and HelenOS) is |
probably much more questionable. In the case of HelenOS, except for the core libraries and tasks |
(such as device drivers), C might be easily replaced by any high-level and perhaps even |
non-imperative programming language. The reasons for using C in this context are mostly historical. |
|
However, as we have stated in Section \ref{introduction}, the way general-purpose operating systems |
However, as we have stated in Section \ref{introduction}, the way general-purpose OSes |
are implemented changes only slowly and therefore any propositions which require radical modification |
of the existing code base before committing to the formal verification are not realistic. |
|
274,13 → 262,27 |
operating system. Each set of properties usually requires a specific verification method, |
tool or toolchain. |
|
Our approach will be mostly bottom-to-top, or, in other words, from the lower levels of abstraction |
Our approach will be mostly bottom-up, or, in other words, from the lower levels of abstraction |
to the higher levels of abstraction. If the verification fails on a lower level, it usually |
does not make much sense to continue with the higher levels of abstraction until the issues |
are tackled. A structured overview of the formalisms and tools can be seen on Figure \ref{fig:diag}. |
are tackled. A structured overview of the formalisms, methods and tools can be seen on |
Figure \ref{fig:diag}. |
|
\medskip |
|
Some of the proposed methods cannot be called ``formal methods'' in the rigorous understanding |
of the term. However, even methods which are based on semi-formal reasoning and non-exhaustive |
testing provide some limited guarantees in their specific context. A valued property |
of the formal methods is to preserve these limited guarantees even on the higher levels |
of abstraction, thus complementing the formal guarantees where the formal methods do not provide |
any feasible verification so far. This increases the coverage of the set of all hypothetical |
interesting properties of the system (although it is probably impossible to formally define |
this set). |
|
\medskip |
|
Please note that the titles of the following sections do not follow any particular established |
taxonomy of verification methods. We have simply chosen the names to be descriptive. |
taxonomy. We have simply chosen the names to be intuitivelly descriptive. |
|
\subsection{C Language Compiler and Continuous Integration Tool} |
\label{clang} |