\title{A Road to a Formally Verified General-Purpose Operating System}
\author{Martin D\v{e}ck\'{y}}
\institute{Department of Distributed and Dependable Systems\\
Faculty of Mathematics and Physics, Charles University\\
Malostransk\'{e} n\'{a}m\v{e}st\'{i} 25, Prague 1, 118~00, Czech Republic\\
\email{martin.decky@d3s.mff.cuni.cz}}
 
\begin{document}
\maketitle
\begin{abstract}
Methods of formal description and verification represent a viable way for achieving
fundamentally bug-free software. However, in reality only a small subset of the existing operating
systems has ever been formally verified, despite the fact that an operating system is a critical part
of almost any other software system. This paper points out several key design choices which
should make the formal verification of an operating system easier and presents a work-in-progress
and initial experiences with formal verification of HelenOS, a state-of-the-art microkernel-based
operating system, which, however, was not designed specifically with formal verification in mind,
as this is mostly prohibitive due to time and budget constraints.
The contribution of this paper is the shift of focus from attempts to use a single ``silver-bullet''
formal verification method which would be able to verify everything to a combination of multiple
formalisms and techniques to successfully cover various aspects of the operating system.
A reliable and dependable operating system is the emergent property of the combination,
thanks to the suitable architecture of the operating system.
\end{abstract}
\section{Introduction}
\label{introduction}
Operating systems (OSes for short) have a somewhat special position among all software.
OSes are usually designed to run on bare hardware. This means that they do not require
any special assumptions on the environment except the assumptions on the properties and
behavior of hardware. In many cases it is perfectly valid to consider the hardware
as \emph{idealized hardware} (zero mathematical probability of failure, perfect compliance
with the specifications, etc.). This means that it is solely the OS that defines the
environment for other software.
OSes represent the lowest software layer and provide services to essentially all other
software. Considering the principle of recursion, the properties of an OS form the
assumptions for the upper layers of software. Thus the dependability of end-user and
enterprise software systems is always limited by the dependability of the OS.
Finally, OSes are non-trivial software on their own. The way they are generally designed
and programmed (spanning both the kernel and user mode, manipulating execution contexts
and concurrency, handling critical hardware-related operations) represents significant
and interesting challenges for software analysis.
These are probably the most important reasons that led to several research initiatives
in the recent years which target the creation of formally verified OSes from scratch
(e.g. \cite{seL4}). Methods of formal description and verification provide fundamentally
better guarantees of desirable properties than non-exhaustive engineering methods such
as testing.
However, 98~\%\footnote{98~\% of client computers connected to the Internet as of January
2010~\cite{marketshare}.} of the market share of general-purpose OSes is taken
by Windows, Mac~OS~X and Linux. These systems were clearly not designed with formal
verification in mind from the very beginning. The situation on the embedded, real-time
and special-purpose OSes market is probably different, but it is unlikely that the
segmentation of the desktop and server OSes market is going to change very rapidly
in the near future.
The architecture of these major desktop and server OSes is monolithic, which makes
any attempts to do formal verification on them extremely challenging due to the large
state space. Fortunately we can observe that aspects of several novel approaches from
the OS research from the late 1980s and early 1990s (microkernel design, user space
file system and device drivers, etc.) are slowly emerging in these originally purely
monolithic implementations.
In this paper we show how specific design choices can markedly improve the feasibility
of verification of an OS, even if the particular OS was not designed
specifically with formal verification in mind. These design choices can be gradually
introduced (and in fact some of them have already been introduced) to mainstream
general-purpose OSes.
\medskip
Our approach is not based on using a single ``silver-bullet'' formalism, methodology or
tool, but on combining various engineering, semi-formal and formal approaches.
While the less formal approaches give weaker guarantees, they can complement
the formal approaches on their boundaries and increase the coverage of the set of
all hypothetical interesting properties of the system.
We also demonstrate a work-in-progress case study of a general-purpose research OS
that was not created specifically with formal verification in mind from the very
beginning, but that was designed according to state-of-the-art OS principles.
\medskip
\noindent\textbf{Structure of the Paper.} In Section \ref{design} we introduce
the design choices and our case study in more detail. In Section \ref{analysis} we
discuss our approach of the combination of methods and tools. In Section \ref{evaluation}
we present a preliminary evaluation of our efforts and propose the imminent next steps
to take. Finally, in Section \ref{conclusion} we present the conclusion of the paper.
\section{Operating Systems Design}
\label{design}
Two very common schemes of OS design are \emph{monolithic design} and \emph{microkernel design}.
Without going into much detail of any specific implementation, we can define the monolithic design as
a preference to put numerous aspects of the core OS functionality into the kernel, while microkernel
design is a preference to keep the kernel small, with just a minimal set of features.
The features which are missing from the kernel in the microkernel design are implemented
in user space, usually by means of libraries, servers (e.g. processes/tasks) and/or software components.
\subsection{HelenOS}
\label{helenos}
\emph{HelenOS} is a general-purpose research OS which is being developed at Charles
University in Prague. The source code is available under the BSD open source license and can be
freely downloaded from the project web site~\cite{helenos}. The authors of the code base are
both from the academia and from the open source community (several contributors are employed
as Solaris kernel developers and many are freelance IT professionals).
HelenOS uses a preemptive priority-feedback scheduler, it supports SMP hardware and it is
designed to be highly portable. Currently it runs on 7 distinct hardware architectures, including the
no scientific value), the essential foundations such as file system support and TCP/IP networking
are already in place.
HelenOS does not currently target embedded devices (although the ARMv7 port can be very easily
modified to run on various embedded boards) and does not implement real-time features.
Many development projects such as task snapshotting and migration, support for MMU-less
platforms and performance monitoring are currently underway.
\medskip
HelenOS can be briefly described as a microkernel multiserver design. However, the actual design
guiding principles of HelenOS are more elaborate:
\begin{description}
\item[Microkernel principle] Every functionality of the OS that does not
have to be necessarily implemented in the kernel should be implemented in user space. This
implies that subsystems such as the file system, device drivers (except those which are
essential for the basic kernel functionality), naming and trading services, networking,
human interface and similar features should be implemented in user space.
\item[Full-fledged principle] Features which need to be placed in the kernel should
be implemented by full-fledged algorithms and data structures. In contrast
to several other microkernel OSes, where the authors have deliberately chosen
the most simplistic approach (static memory allocation, na\"{\i}ve algorithms, simple data
structures), the HelenOS microkernel tries to use the most advanced and suitable means available.
It contains features such as AVL and B+ trees, hash tables, a SLAB memory allocator, multiple
in-kernel synchronization primitives, fine-grained locking and so on.
\item[Multiserver principle] Subsystems in user space should be decomposed with the smallest
reasonable granularity. Each unit of decomposition should be encapsulated in a separate task.
The tasks represent software components with isolated address spaces. From the design point of
view the kernel can be seen as a separate component, too.
\item[Split of mechanism and policy] The kernel should only provide low-level mechanisms,
while the high-level policies which are built upon these mechanisms should be defined in
user space. This also implies that the policies should be easily replaceable while keeping
the low-level mechanisms intact.
\item[Encapsulation principle] The communication between the tasks/components should be
implemented only via a set of well-defined interfaces. In the user-to-user case the preferred
communication mechanism is HelenOS IPC, which provides a reasonable mix of abstraction and
performance (RPC-like primitives combined with implicit memory sharing for large data
transfers). In the case of synchronous user-to-kernel communication the usual syscalls are used.
HelenOS IPC is used again for asynchronous kernel-to-user communication.
\item[Portability principle] The design and implementation should always maintain a high
level of platform neutrality and portability. Platform-specific code in the kernel, core
libraries and tasks implementing device drivers should be clearly separated from the
generic code (either by component decomposition or at least by internal compile-time APIs).
\end{description}
In Section \ref{analysis} we argue that several of these design principles significantly improve
the feasibility of formal verification of the entire system. On the other hand, other design principles
induce new interesting challenges for formal description and verification.
\medskip
The run-time architecture of HelenOS is inherently dynamic. The bindings between the components are
not created at compile-time, but during bootstrap and can be modified to a large degree also during
normal operation mode of the system (via human interaction and external events).
\medskip
Yet another set of obstacles for reasoning about the properties of HelenOS lies in the design of
the ubiquitous HelenOS IPC mechanism and the associated threading model. The IPC is inherently
asynchronous with constant message buffers in the kernel and dynamic buffers in user space.
It uses the notions of uni-directional bindings, mandatory pairing of requests and replies,
binding establishment and abolishment handshakes, memory sharing and fast message forwarding.
For easier management of the asynchronous messages and the possibility to process multiple
messages from different peers without the usual kernel threading overhead, the core user space
library manages the execution flow by so-called \emph{fibrils}. A fibril is a user-space-managed
thread with cooperative scheduling. A different fibril is scheduled every time the current fibril
is about to be blocked while sending out IPC requests (because the kernel buffers of the addressee
are full) or while waiting on an IPC reply. This allows different execution flows within the
same thread to process multiple requests and replies. To safeguard proper sequencing of IPC
messages and provide synchronization, special fibril-aware synchronization primitives
(mutexes, condition variables, etc.) are available.
Because of the cooperative nature of fibrils, they might cause severe performance under-utilization
\medskip
Incidentally, the \emph{full-fledged principle} causes the size of the HelenOS microkernel to be
considerably larger compared to other ``scrupulous'' microkernel implementations. The average
footprint of the kernel on IA-32 ranges from 569~KiB when all logging messages, asserts, symbol
resolution and the debugging kernel console are compiled in, down to 198~KiB for a non-debugging
production build. But we do not believe that the raw size of the microkernel is a relevant quality
criterion per se, without taking the actual feature set into account.
\medskip
To sum up, the choice of HelenOS as our case study is based on the fact that it was not designed
in advance with formal verification in mind (some of the design principles might be beneficial,
but others might be disadvantageous), but the design of HelenOS is also non-trivial and not obsolete.
\subsection{The C Programming Language}
A large majority of OSes is coded in the C programming language (HelenOS is no exception
to this). The choice of C in the case of kernel is usually well-motivated, since the C language was designed
specifically for implementing system software~\cite{c}: It is reasonably low-level in the sense that it allows
one to access the memory and other hardware resources with similar effectiveness as from assembler;
it also requires almost no run-time support and it exports many features of the von Neumann hardware
architecture to the programmer in a very straightforward, but still relatively portable way.
However, what is the biggest advantage of C in terms of run-time performance is also the biggest weakness
for formal reasoning. The permissive memory access model of C, the lack of any reference safety
enforcement, the weak type system and generally little semantic information in the code -- all these
properties do not allow us to make many general assumptions about the code.
Programming languages which target controlled environments such as Java or C\(\sharp\) are
generally easier for formal reasoning because they provide a well-known set of primitives
\medskip
The reasons for the frequent use of C in the user space of many established OSes (and HelenOS) are
probably much more questionable. In the case of HelenOS, except for the core libraries and tasks
(such as device drivers), C might be easily replaced by any high-level and perhaps even
non-imperative programming language. The reasons for using C in this context are mostly historical.
However, as we have stated in Section \ref{introduction}, the way general-purpose OSes
are implemented changes only slowly and therefore any propositions which require radical modification
of the existing code base before committing to the formal verification are not realistic.
operating system. Each set of properties usually requires a specific verification method,
tool or toolchain.
Our approach will be mostly bottom-up, or, in other words, from the lower levels of abstraction
to the higher levels of abstraction. If the verification fails on a lower level, it usually
does not make much sense to continue with the higher levels of abstraction until the issues
are tackled. A structured overview of the formalisms, methods and tools can be seen in
Figure \ref{fig:diag}.
\medskip
Some of the proposed methods cannot be called ``formal methods'' in the rigorous understanding
of the term. However, even methods which are based on semi-formal reasoning and non-exhaustive
testing provide some limited guarantees in their specific context. A valuable property
of the formal methods is to preserve these limited guarantees even on the higher levels
of abstraction, thus complementing the formal guarantees where formal verification is not
yet feasible. This increases the coverage of the set of all hypothetical
interesting properties of the system (although it is probably impossible to formally define
this set).
\medskip
Please note that the titles of the following sections do not follow any particular established
taxonomy. We have simply chosen the names to be intuitively descriptive.
\subsection{C Language Compiler and Continuous Integration Tool}
\label{clang}