WebSVN – HelenOS-doc – Diff – /papers/isarcs10/isarcs10.tex

 \documentclass{llncs}
 \usepackage{graphicx}
 \title{A Road to a Formally Verified General-Purpose Operating System}
 \author{Martin D\v{e}ck\'{y}}
-\institute{Department of Software Engineering\\
+\institute{Department of Distributed and Dependable Systems\\
         Faculty of Mathematics and Physics, Charles University\\
         Malostransk\'{e} n\'{a}m\v{e}st\'{i} 25, Prague 1, 118~00, Czech Republic\\
-        \email{martin.decky@dsrg.mff.cuni.cz}}
+        \email{martin.decky@d3s.mff.cuni.cz}}
 \begin{document}
     \maketitle
     \begin{abstract}
         Methods of formal description and verification represent a viable way for achieving
         fundamentally bug-free software. However, in reality only a small subset of the existing operating
         systems were ever formally verified, despite the fact that an operating system is a critical part
-        of almost any other software system. This paper summarizes the challenges involved in formal verification
+        of almost any other software system. This paper points out several key design choices which
-        of operating systems, points out several key design choices which should make the formal verification
+        should make the formal verification of an operating system easier and presents a work-in-progress
-        easier and presents a work-in-progress and initial experiences with formal verification of HelenOS,
+        and initial experiences with formal verification of HelenOS, a state-of-the-art microkernel-based
-        a state-of-the-art microkernel-based operating system, which, however, was not designed specifically
+        operating system, which, however, was not designed specifically with formal verification in mind,
-        with formal verification in mind, as this is mostly prohibitive due to time and budget constrains.
+        as this is mostly prohibitive due to time and budget constrains.
         The contribution of this paper is the shift of focus from attempts to use a single ``silver-bullet''
         formal verification method which would be able to verify everything to a combination of multiple
-        formalisms and techniques to cover various aspects of the verification process. The formally verified
+        formalisms and techniques to successfully cover various aspects of the operating system.
-        operating system is the emerging property of the combination.
+        A reliable and dependable operating system is the emerging property of the combination,
+        thanks to the suitable architecture of the operating system.
     \end{abstract}
     \section{Introduction}
         \label{introduction}
-        In the context of formal verification of software, it is always important to model the outer
+        Operating systems (OSes for short) have a somewhat special position among all software.
-        environment with a proper level of abstraction. Weak assumptions on the environment make the formal
-        verification less feasible, because they yield many degrees of freedom which render
+        OSes are usually designed to run on bare hardware. This means that they do not require
-        the properties which we want to verify overly complex. But strong assumptions on the environment are likewise
+        any special assumptions on the environment except the assumptions on the properties and
-        not without a price: Any practical usability of the verification results depends on the question
+        behavior of hardware. In many cases it is perfectly valid to consider the hardware
-        whether we are really able to create a physical environment which satisfies the assumptions.
+        as \emph{idealized hardware} (zero mathematical probability of failure, perfect compiance
-        There are some assumptions which are universal for any formal verification method. For the sake of
+        with the specifications, etc.). This means that it is solely the OS that defines the
-        simplicity let us call them \emph{idealized hardware}. Let us also cover the computational model
-        and basic mathematical principles by this term.
+        environment for other software.
-        What is \emph{idealized hardware}? We usually assume that the software is executed by a ``perfect'' hardware
+        OSes represent the lowest software layer and provide services to essentially all other
-        which works exactly according to its specification. Every divergence between the intended and actual
+        software. Considering the principle of recursion, the properties of an OS form the
-        behavior of software counts as a software bug and never as an issue in some hardware component. We
+        assumptions for the upper layers of software. Thus the dependability of end-user and
-        implicitly assume that the hardware is perfect in the sense of physical realization and all probabilities
-        of accidental mechanical and electrical failures are expected to be zero.
+        enterprise software systems is always limited by the dependability of the OS.
-        It is trivial to see that the \emph{idealized hardware} assumptions can never hold in real world.
-        Accidental mechanical and electrical failures are inevitable in principle and their probability can
+        Finally, OSes are non-trivial software on their own. The way they are generally designed
-        never be a mathematical zero. Does this mean that any formal verification method is ultimately futile
+        and programmed (spanning both the kernel and user mode, manipulating execution contexts
-        since we are unable to ensure validity of the most elementary assumptions? Probably not. We just need
+        and concurrency, handling critical hardware-related operations) represent significant
-        to be aware about the inherent and fundamental limitations of the formal methods which result from
-        the shortcomings of the physical world.
+        and interesting challenges for software analysis.
-        The more precisely we model the software environment of the software under discussion, the more precisely
-        we can calculate or estimate the resulting impact of the imperfections of the physical world on our software
-        system. This is the most important motivation for formal reasoning about the correctness of the operating system.
         \medskip
-        In more detail, operating systems have a somewhat special position among all software:
-        \begin{itemize}
-            \item \emph{Operating systems are usually designed to run on bare hardware.} This means that except
-                  the \emph{idealized hardware} assumptions we do not have to take any or almost any extra assumptions
-                  into account.
-            \item \emph{Operating systems create the lowest software layer and provide services to essentially
+        These are probably the most important reasons that led to several research initiatives
-                  all other software.} Considering the principle of recursion, the properties of the
-                  operating systems which we prove or disprove form the assumptions for the upper layers
+        in the recent years which target the creation of a formally verified OSes from scratch
-                  of software. Thus the dependability of end-user and enterprise software systems is limited
-                  by the dependability of the operating system.
-            \item \emph{Operating systems are non-trivial software on their own.} The way
+        (e.g. \cite{seL4}). Methods of formal description and verification provide fundamentally
-                  they are generally designed and programmed (spanning both the kernel and user mode,
+        better guarantees of desirable properties than non-exhaustive engineering methods such
-                  manipulating execution contexts and concurrency, handling critical hardware-related
-                  operations) represent significant and interesting challenges for the formal verification
-                  methods and tools.
-        \end{itemize}
+        as testing.
-        Even in the informal understanding, the dependability of an operating system greatly determines the perceived
+        However, 98~\%\footnote{98~\% of client computers connected to the Internet as of January
-        dependability of the entire software stack. This led to several research initiatives in the recent years
+~\cite{marketshare}.} of the market share of general-purpose OSes is taken
-        which target the creation of a formally verified operating systems from scratch~\cite{seL4}.
-        However, 98~\%\footnote{98~\% of client computers connected to the Internet as of January 2010~\cite{marketshare}.} of
-        the market share of general-purpose operating systems is taken by Windows, Mac~OS~X and Linux.
+        by Windows, Mac~OS~X and Linux. These systems were clearly not designed with formal
-        These systems were clearly not designed with formal verification in mind from the very beginning.
+        verification in mind from the very beginning. The situation on the embedded, real-time
-        The situation on the embedded, real-time and special-purpose operating systems market is probably different,
+        and special-purpose OSes market is probably different, but it is unlikely that the
-        but it is unlikely that the situation in the large domain of desktops and servers is going to change
+        segmentation of the desktop and server OSes market is going to change very rapidly
-        very rapidly in the near future.
+        in the near future.
-        Therefore we need to face the challenges of applying formal methods on an existing code base in the domain
+        The architecture of these major desktop and server OSes is monolithic, which makes
-        of general-purpose operating systems. Fortunately, the software design qualities of the general-purpose
+        any attempts to do formal verification on them extremely challenging due to the large
-        operating systems gradually improve over time. We can see then novel approaches in the operating systems
+        state space. Fortunatelly we can observe that aspects of several novel approaches from
-        research from the late 1980s and 1990s (microkernel design, user space file system and device drivers, etc.)
+        the OS research from the late 1980s and early 1990s (microkernel design, user space
-        to slowly emerge in the originally monolithic operating systems. We can also see better code quality thanks
+        file system and device drivers, etc.) are slowly emerging in these originally purely
-        to improved software engineering (code review, proper escalation management, etc.).
+        monolithic implementations.
         \medskip
-        This paper proposes an approach and presents a work-in-progress case study of formal verification
+        In this paper we show how specific design choices can markedly improve the feasibility
-        of an general-purpose research operating system, which was also not created specifically with formal
+        of verification of an OS, even if the particular OS was not designed
-        verification in mind from the very beginning, but it was designed according to state-of-the-art operating systems
+        specifically with formal verification in mind. These design choices can be gradually
+        introduced (and in fact some of them have already been introduced) to mainstream
-        principles.
+        general-purpose OSes.
+        Our approach is not based on using a single ``silver-bullet'' formalism, methodology or
+        tool, but on combining various enginering, semi-formal and formal approaches.
+        While the lesser formal approaches give lesser guarantees, they can complement
+        the formal approaches on their boundaries and increase the coverage of the set of
+        all hypothetical interesting properties of the system.
+        We also demonstrate work-in-progress case study of an general-purpose research OS
+        that was not created specifically with formal verification in mind from the very
+        beginning, but that was designed according to state-of-the-art OS principles.
         \medskip
-        \noindent\textbf{Structure of the Paper.} In Section \ref{context} we introduce the case study in more detail and explain
+        \noindent\textbf{Structure of the Paper.} In Section \ref{design} we introduce
-        why we believe it is relevant. In Section \ref{analysis} we discuss our proposal of the combination of formal methods
+        the design choices and our case study in more detail. In Section \ref{analysis} we
-        and tools. In Section \ref{evaluation} we present the preliminary evaluation of our efforts and estimate the complexity
+        discuss our approach of the combination of methods and tools. In Section \ref{evaluation}
-        of the imminent next steps we want to take according to our proposal. Finally, in Section \ref{conclusion}
+        we present a preliminary evaluation of our efforts and propose the imminent next steps
-        we present the conclusion of the paper.
+        to take. Finally, in Section \ref{conclusion} we present the conclusion of the paper.
-    \section{Context}
+    \section{Operating Systems Design}
-        \label{context}
+        \label{design}
-        Two very common schemes of operating system design are \emph{monolithic design} and \emph{microkernel design}.
+        Two very common schemes of OS design are \emph{monolithic design} and \emph{microkernel design}.
-        Without going into much detail of any specific operating system, we can define the monolithic design as
+        Without going into much detail of any specific implementation, we can define the monolithic design as
-        a preference to put numerous aspects of the core operating system functionality into the kernel,
+        a preference to put numerous aspects of the core OS functionality into the kernel, while microkernel
-        while microkernel design is a preference to keep the kernel small, with just a minimal set of features.
+        design is a preference to keep the kernel small, with just a minimal set of features.
-        The features which are missing from the kernel in the microkernel design are implemented in user space, usually
+        The features which are missing from the kernel in the microkernel design are implemented
-        by means of libraries, servers and/or software components.
+        in user space, usually by means of libraries, servers (e.g. processes/tasks) and/or software components.
         \subsection{HelenOS}
             \label{helenos}
-            \emph{HelenOS} is a general-purpose research operating system which is being developed at Charles
+            \emph{HelenOS} is a general-purpose research OS which is being developed at Charles
             University in Prague. The source code is available under the BSD open source license and can be
             freely downloaded from the project web site~\cite{helenos}. The authors of the code base are
             both from the academia and from the open source community (several contributors are employed
-            as Solaris kernel developers and many are freelance IT professionals). We consistently strive to support
+            as Solaris kernel developers and many are freelance IT professionals).
-            the research and also the practical motivations for developing the system.
             HelenOS uses a preemptive priority-feedback scheduler, it supports SMP hardware and it is
             designed to be highly portable. Currently it runs on 7 distinct hardware architectures, including the
             most common IA-32, x86-64 (AMD64), IA-64, SPARC~v9 and PowerPC. It also runs on ARMv7 and MIPS,
             but currently only in simulators and not on physical hardware.
             Although HelenOS is still far from being an everyday replacement for Linux or Windows due to the lack
             of end-user applications (whose development is extremely time-consuming, but unfortunately of
             no scientific value), the essential foundations such as file system support and TCP/IP networking
             are already in place.
-            HelenOS does not currently target embedded devices (although the ARMv7 port can be easily modified to
+            HelenOS does not currently target embedded devices (although the ARMv7 port can be very easily
-            run on various embedded boards) and does not implement real-time scheduling and synchronization.
+            modified to run on various embedded boards) and does not implement real-time features.
-            Many development projects such as task snapshoting and migration, generic device driver
+            Many development projects such as task snapshoting and migration, support for MMU-less
-            framework, support for MMU-less platforms and performance monitoring are currently underway.
+            platforms and performance monitoring are currently underway.
             \medskip
-            HelenOS has a microkernel multiserver design, but the guiding principles of the HelenOS design are
+            HelenOS can be briefly described as microkernel multiserver design. However, the actual design
-            actually more elaborate:
+            guiding principles of the HelenOS are more elaborate:
-            \begin{itemize}
+            \begin{description}
-                \item \emph{(Microkernel principle)} Every functionality of the operating system that does not
+                \item[Microkernel principle] Every functionality of the OS that does not
                       have to be necessary implemented in the kernel should be implemented in user space. This
                       implies that subsystems such as the file system, device drivers (except those which are
                       essential for the basic kernel functionality), naming and trading services, networking,
                       human interface and similar features should be implemented in user space.
-                \item \emph{(Full-fledged principle)} Features which need to be implemented in kernel should
+                \item[Full-fledged principle] Features which need to be placed in kernel should
-                      be designed with full-fledged algorithms and data structures. In contrast
+                      be implemented by full-fledged algorithms and data structures. In contrast
-                      to several other microkernel operating systems, where the authors have deliberately chosen
+                      to several other microkernel OSes, where the authors have deliberately chosen
                       the most simplistic approach (static memory allocation, na\"{\i}ve algorithms, simple data
                       structures), HelenOS microkernel tries to use the most advanced and suitable means available.
                       It contains features such as AVL and B+ trees, hashing tables, SLAB memory allocator, multiple
                       in-kernel synchronization primitives, fine-grained locking and so on.
-                \item \emph{(Multiserver principle)} Subsystems in user space should be decomposed with the smallest
+                \item[Multiserver principle] Subsystems in user space should be decomposed with the smallest
                       reasonable granularity. Each unit of decomposition should be encapsulated in a separate task.
                       The tasks represent software components with isolated address spaces. From the design point of
                       view the kernel can be seen as a separate component, too.
-                \item \emph{(Split of mechanism and policy)} The kernel should only provide low-level mechanisms,
+                \item[Split of mechanism and policy] The kernel should only provide low-level mechanisms,
                       while the high-level policies which are built upon these mechanisms should be defined in
                       user space. This also implies that the policies should be easily replaceable while keeping
                       the low-level mechanisms intact.
-                \item \emph{(Encapsulation principle)} The communication between the tasks/components should be
+                \item[Encapsulation principle] The communication between the tasks/components should be
                       implemented only via a set of well-defined interfaces. In the user-to-user case the preferred
                       communication mechanism is HelenOS IPC, which provides reasonable mix of abstraction and
                       performance (RPC-like primitives combined with implicit memory sharing for large data
                       transfers). In case of synchronous user-to-kernel communication the usual syscalls are used.
                       HelenOS IPC is used again for asynchronous kernel-to-user communication.
-                \item \emph{(Portability principle)} The design and implementation should always maintain a high
+                \item[Portability principle] The design and implementation should always maintain a high
                       level of platform neutrality and portability. Platform-specific code in the kernel, core
-                      libraries and tasks implementing low-level functionality (e.g. device drivers) should be
+                      libraries and tasks implementing device drivers should be clearly separated from the
-                      clearly separated from the generic code (either by component decomposition, naming and trading,
-                      or at least by internal compile-time APIs).
+                      generic code (either by component decomposition or at least by internal compile-time APIs).
-            \end{itemize}
+            \end{description}
-            As these design guiding principles suggest, the size of the HelenOS microkernel is considerably larger
+            In Section \ref{analysis} we argue that several of these design principles significantly improve
-            compared to ``scrupulous'' microkernel implementations. The average footprint of the kernel ranges from
-~KiB when all logging messages, asserts, symbol resolution and debugging kernel console are compiled
-            in, down to 198~KiB for a non-debugging production build. We do not believe that the raw size
+            the feasibility of formal verification of the entire system. On the other hand, other design principles
-            of the microkernel is a relevant quality criterion per se, without taking the actual feature set
+            induce new interesting challenges for formal description and verification.
-            into account.
-            \medskip
             The run-time architecture of HelenOS is inherently dynamic. The bindings between the components are
-            not created at compile-time, but during the bootstrap process and can be modified to a large degree
+            not created at compile-time, but during bootstrap and can be modified to a large degree also during
-            also during normal operation mode of the system (via human interaction and external events). This
+            normal operation mode of the system (via human interaction and external events).
-            creates particularly interesting challenges for describing the design of the system by many formalisms.
-            \medskip
-            Yet another set of obstacles for reasoning about the properties of HelenOS lies in the design of
+            The design of the ubiquitous HelenOS IPC mechanism and the associated threading model present
-            the ubiquitous HelenOS IPC mechanism and the associated threading model. The IPC is inherently
+            the possibility to significantly reduce the size of the state space which needs to be explored
+            by formal verification tools, but at the same time it is quite hard to express these
-            asynchronous with constant message buffers in the kernel and dynamic buffers in user space.
+            constrains in many formalisms. The IPC is inherently asynchronous with constant message buffers
+            in the kernel and dynamic buffers in user space. It uses the notions of uni-directional bindings,
-            It uses the notions of uni-directional bindings, mandatory pairing of requests and replies,
+            mandatory pairing of requests and replies, binding establishment and abolishment handshakes,
-            binding establishment and abolishment handshakes, memory sharing and fast message forwarding.
+            memory sharing and fast message forwarding.
             For easier management of the asynchronous messages and the possibility to process multiple
             messages from different peers without the usual kernel threading overhead, the core user space
-            library manages the execution flow by so-called \emph{fibrils}. A fibril is a user-space-managed thread with
+            library manages the execution flow by so-called \emph{fibrils}. A fibril is a user-space-managed
-            cooperative scheduling. A different fibril is scheduled every time the current fibril is
+            thread with cooperative scheduling. A different fibril is scheduled every time the current fibril
-            about to be blocked while sending out IPC requests (because the kernel buffers of the addressee
+            is about to be blocked while sending out IPC requests (because the kernel buffers of the addressee
             are full) or while waiting on an IPC reply. This allows different execution flows within the
-            same thread to process multiple requests and replies. To safeguard proper sequencing
+            same thread to process multiple requests and replies. To safeguard proper sequencing of IPC
-            of IPC messages and provide synchronization, special fibril-aware synchronization primitives
+            messages and provide synchronization, special fibril-aware synchronization primitives
             (mutexes, condition variables, etc.) are available.
             Because of the cooperative nature of fibrils, they might cause severe performance under-utilization
             in SMP configurations and system-wide bottlenecks. As multicore processors are more and more
             common nowadays, that would be a substantial design flaw. Therefore the fibrils can be also freely
             preemptive scheduling and true parallelism on SMP machines. Needless to say, this combination is
             also a grand challenge for the formal reasoning.
             \medskip
+            Incidentally, the \emph{full-fledged principle} causes that the size of the HelenOS microkernel is
+            considerably larger compared to other ``scrupulous'' microkernel implementations. The average
+            footprint of the kernel on IA-32 ranges from 569~KiB when all logging messages, asserts, symbol
+            resolution and the debugging kernel console are compiled in, down to 198~KiB for a non-debugging
+            production build. But we do not believe that the raw size of the microkernel is a relevant quality
+            criterion per se, without taking the actual feature set into account.
+            \medskip
             To sum up, the choice of HelenOS as our case study is based on the fact that it was not designed
-            in advance with formal verification in mind. This is similar to most general-purpose operating
+            in advance with formal verification in mind (some of the design principles might be beneficial,
-            systems in common use. At the same time, it does not have an obsolete design and is non-trivial.
+            but others might be disadvantageous), but the design of HelenOS is also non-trivial and not obsolete.
         \subsection{The C Programming Language}
-            A large majority of operating systems is coded in the C programming language. HelenOS is no exception
+            A large majority of OSes is coded in the C programming language (HelenOS is no exception
-            to this. The choice of C in the case of kernel is usually well-motivated -- the language was designed
+            to this). The choice of C in the case of kernel is usually well-motivated, since the C language was designed
-            specifically for implementing system software~\cite{c}. It is reasonably low-level in the sense that it allows
+            specifically for implementing system software~\cite{c}: It is reasonably low-level in the sense that it allows
-            to access the memory and other hardware resources with similar effectiveness as from assembler.
+            to access the memory and other hardware resources with similar effectiveness as from assembler;
             It also requires almost no run-time support and it exports many features of the von Neumann hardware
             architecture to the programmer in a very straightforward, but still relatively portable way.
             However, what is the biggest advantage of C in terms of run-time performance is also the biggest weakness
             for formal reasoning. The permissive memory access model of C, the lack of any reference safety
             enforcement, the weak type system and generally little semantic information in the code -- all these
-            properties do not allow to make many general assumptions about the code.
+            properties that do not allow to make many general assumptions about the code.
             Programming languages which target controlled environments such as Java or C\(\sharp\) are
             generally easier for formal reasoning because they provide a well-known set of primitives
             and language constructs for object ownership, threading and synchronization. Many non-imperative
             programming languages can be even considered to be a form of ``executable specification'' and thus
             very suitable for formal reasoning. In C, almost everything is left to the programmer who
             is free to set the rules.
             \medskip
-            The reasons for using C in the user space of HelenOS (and other established operating systems) is
+            The reasons for frequent use of C in the user space of many established OSes (and HelenOS) is
-            probably much more questionable. Except for the core libraries and services (such as device drivers),
+            probably much more questionable. In the case of HelenOS, except for the core libraries and tasks
-            C might be easily replaced by any high-level and perhaps even non-imperative programming language.
+            (such as device drivers), C might be easily replaced by any high-level and perhaps even
-            The reasons for using C in this context is mostly historical.
+            non-imperative programming language. The reasons for using C in this context are mostly historical.
-            However, as we have stated in Section \ref{introduction}, the way general-purpose operating systems
+            However, as we have stated in Section \ref{introduction}, the way general-purpose OSes
             are implemented changes only slowly and therefore any propositions which require radical modification
             of the existing code base before committing to the formal verification are not realistic.
     \section{Analysis}
         \label{analysis}
         In this section, we analyze the properties we would like to check in a general-purpose
         operating system. Each set of properties usually requires a specific verification method,
         tool or toolchain.
-        Our approach will be mostly bottom-to-top, or, in other words, from the lower levels of abstraction
+        Our approach will be mostly bottom-up, or, in other words, from the lower levels of abstraction
         to the higher levels of abstraction. If the verification fails on a lower level, it usually
         does not make much sense to continue with the higher levels of abstraction until the issues
-        are tackled. A structured overview of the formalisms and tools can be seen on Figure \ref{fig:diag}.
+        are tackled. A structured overview of the formalisms, methods and tools can be seen on
+        Figure \ref{fig:diag}.
+        \medskip
+        Some of the proposed methods cannot be called ``formal methods'' in the rigorous understanding
+        of the term. However, even methods which are based on semi-formal reasoning and non-exhaustive
+        testing provide some limited guarantees in their specific context. A valued property
+        of the formal methods is to preserve these limited guarantees even on the higher levels
+        of abstraction, thus complementing the formal guarantees where the formal methods do not provide
+        any feasible verification so far. This increases the coverage of the set of all hypothetical
+        interesting properties of the system (although it is probably impossible to formally define
+        this set).
+        \medskip
         Please note that the titles of the following sections do not follow any particular established
-        taxonomy of verification methods. We have simply chosen the names to be descriptive.
+        taxonomy. We have simply chosen the names to be intuitivelly descriptive.
         \subsection{C Language Compiler and Continuous Integration Tool}
             \label{clang}
             The initial levels of abstraction do not go far from the C source code. First, we would certainly like to
             know whether our code base is compliant with the programming language specification and passes

Subversion Repositories HelenOS-doc

(root)/papers/isarcs10/isarcs10.tex – Rev 179 → 180