Subversion Repositories HelenOS-doc

Rev

Rev 182 | Go to most recent revision | Only display areas with differences | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed

Rev 182 Rev 183
1
\documentclass{llncs}
1
\documentclass{llncs}
2
\usepackage{graphicx}
2
\usepackage{graphicx}
3
 
3
 
4
\title{A Road to a Formally Verified General-Purpose Operating System}
4
\title{A Road to a Formally Verified General-Purpose Operating System}
5
\author{Martin D\v{e}ck\'{y}}
5
\author{Martin D\v{e}ck\'{y}}
6
\institute{Department of Distributed and Dependable Systems\\
6
\institute{Department of Distributed and Dependable Systems\\
7
        Faculty of Mathematics and Physics, Charles University\\
7
        Faculty of Mathematics and Physics, Charles University\\
8
        Malostransk\'{e} n\'{a}m\v{e}st\'{i} 25, Prague 1, 118~00, Czech Republic\\
8
        Malostransk\'{e} n\'{a}m\v{e}st\'{i} 25, Prague 1, 118~00, Czech Republic\\
9
        \email{martin.decky@d3s.mff.cuni.cz}}
9
        \email{martin.decky@d3s.mff.cuni.cz}}
10
 
10
 
11
\begin{document}
11
\begin{document}
12
    \maketitle
12
    \maketitle
13
   
13
   
14
    \begin{abstract}
14
    \begin{abstract}
15
        Methods of formal description and verification represent a viable way for achieving
15
        Methods of formal description and verification represent a viable way for achieving
16
        fundamentally bug-free software. However, in reality only a small subset of the existing operating
16
        fundamentally bug-free software. However, in reality only a small subset of the existing operating
17
        systems were ever formally verified, despite the fact that an operating system is a critical part
17
        systems were ever formally verified, despite the fact that an operating system is a critical part
18
        of almost any other software system. This paper points out several key design choices which
18
        of almost any other software system. This paper points out several key design choices which
19
        should make the formal verification of an operating system easier and presents a work-in-progress
19
        should make the formal verification of an operating system easier and presents a work-in-progress
20
        and initial experiences with formal verification of HelenOS, a state-of-the-art microkernel-based
20
        and initial experiences with formal verification of HelenOS, a state-of-the-art microkernel-based
21
        operating system, which, however, was not designed specifically with formal verification in mind,
21
        operating system, which, however, was not designed specifically with formal verification in mind,
22
        as this is mostly prohibitive due to time and budget constrains.
22
        as this is mostly prohibitive due to time and budget constrains.
23
       
23
       
24
        The contribution of this paper is the shift of focus from attempts to use a single ``silver-bullet''
24
        The contribution of this paper is the shift of focus from attempts to use a single ``silver-bullet''
25
        formal verification method which would be able to verify everything to a combination of multiple
25
        formal verification method which would be able to verify everything to a combination of multiple
26
        formalisms and techniques to successfully cover various aspects of the operating system.
26
        formalisms and techniques to successfully cover various aspects of the operating system.
27
        A reliable and dependable operating system is the emerging property of the combination,
27
        A reliable and dependable operating system is the emerging property of the combination,
28
        thanks to the suitable architecture of the operating system.
28
        thanks to the suitable architecture of the operating system.
29
    \end{abstract}
29
    \end{abstract}
30
   
30
   
31
    \section{Introduction}
31
    \section{Introduction}
32
        \label{introduction}
32
        \label{introduction}
33
        Operating systems (OSes for short) have a somewhat special position among all software.
33
        Operating systems (OSes for short) have a somewhat special position among all software.
34
        OSes are usually designed to run on bare hardware. This means that they do not require
34
        OSes are usually designed to run on bare hardware. This means that they do not require
35
        any special assumptions on the environment except the assumptions on the properties and
35
        any special assumptions on the environment except the assumptions on the properties and
36
        behavior of hardware. In many cases it is perfectly valid to consider the hardware
36
        behavior of hardware. In many cases it is perfectly valid to consider the hardware
37
        as \emph{idealized hardware} (zero mathematical probability of failure, perfect compiance
37
        as \emph{idealized hardware} (zero mathematical probability of failure, perfect compliance
38
        with the specifications, etc.). This means that it is solely the OS that defines the
38
        with the specifications, etc.). This means that it is solely the OS that defines the
39
        environment for other software.
39
        environment for other software.
40
       
40
       
41
        OSes represent the lowest software layer and provide services to essentially all other
41
        OSes represent the lowest software layer and provide services to essentially all other
42
        software. Considering the principle of recursion, the properties of an OS form the
42
        software. Considering the principle of recursion, the properties of an OS form the
43
        assumptions for the upper layers of software. Thus the dependability of end-user and
43
        assumptions for the upper layers of software. Thus the dependability of end-user and
44
        enterprise software systems is always limited by the dependability of the OS.
44
        enterprise software systems is always limited by the dependability of the OS.
45
       
45
       
46
        Finally, OSes are non-trivial software on their own. The way they are generally designed
46
        Finally, OSes are non-trivial software on their own. The way they are generally designed
47
        and programmed (spanning both the kernel and user mode, manipulating execution contexts
47
        and programmed (spanning both the kernel and user mode, manipulating execution contexts
48
        and concurrency, handling critical hardware-related operations) represent significant
48
        and concurrency, handling critical hardware-related operations) represent significant
49
        and interesting challenges for software analysis.
49
        and interesting challenges for software analysis.
50
       
50
       
51
        \medskip
51
        \medskip
52
       
52
       
53
        These are probably the most important reasons that led to several research initiatives
53
        These are probably the most important reasons that led to several research initiatives
54
        in the recent years which target the creation of a formally verified OSes from scratch
54
        in the recent years which target the creation of a formally verified OSes from scratch
55
        (e.g. \cite{seL4}). Methods of formal description and verification provide fundamentally
55
        (e.g. \cite{seL4}). Methods of formal description and verification provide fundamentally
56
        better guarantees of desirable properties than non-exhaustive engineering methods such
56
        better guarantees of desirable properties than non-exhaustive engineering methods such
57
        as testing.
57
        as testing.
58
       
58
       
59
        However, 98~\%\footnote{98~\% of client computers connected to the Internet as of January
59
        However, 98~\%\footnote{98~\% of client computers connected to the Internet as of January
60
        2010~\cite{marketshare}.} of the market share of general-purpose OSes is taken
60
        2010~\cite{marketshare}.} of the market share of general-purpose OSes is taken
61
        by Windows, Mac~OS~X and Linux. These systems were clearly not designed with formal
61
        by Windows, Mac~OS~X and Linux. These systems were clearly not designed with formal
62
        verification in mind from the very beginning. The situation on the embedded, real-time
62
        verification in mind from the very beginning. The situation on the embedded, real-time
63
        and special-purpose OSes market is probably different, but it is unlikely that the
63
        and special-purpose OSes market is probably different, but it is unlikely that the
64
        segmentation of the desktop and server OSes market is going to change very rapidly
64
        segmentation of the desktop and server OSes market is going to change very rapidly
65
        in the near future.
65
        in the near future.
66
       
66
       
67
        The architecture of these major desktop and server OSes is monolithic, which makes
67
        The architecture of these major desktop and server OSes is monolithic, which makes
68
        any attempts to do formal verification on them extremely challenging due to the large
68
        any attempts to do formal verification on them extremely challenging due to the large
69
        state space. Fortunatelly we can observe that aspects of several novel approaches from
69
        state space. Fortunately we can observe that aspects of several novel approaches from
70
        the OS research from the late 1980s and early 1990s (microkernel design, user space
70
        the OS research from the late 1980s and early 1990s (microkernel design, user space
71
        file system and device drivers, etc.) are slowly emerging in these originally purely
71
        file system and device drivers, etc.) are slowly emerging in these originally purely
72
        monolithic implementations.
72
        monolithic implementations.
73
       
73
       
74
        \medskip
74
        \medskip
75
       
75
       
76
        In this paper we show how specific design choices can markedly improve the feasibility
76
        In this paper we show how specific design choices can markedly improve the feasibility
77
        of verification of an OS, even if the particular OS was not designed
77
        of verification of an OS, even if the particular OS was not designed
78
        specifically with formal verification in mind. These design choices can be gradually
78
        specifically with formal verification in mind. These design choices can be gradually
79
        introduced (and in fact some of them have already been introduced) to mainstream
79
        introduced (and in fact some of them have already been introduced) to mainstream
80
        general-purpose OSes.
80
        general-purpose OSes.
81
       
81
       
82
        Our approach is not based on using a single ``silver-bullet'' formalism, methodology or
82
        Our approach is not based on using a single ``silver-bullet'' formalism, methodology or
83
        tool, but on combining various enginering, semi-formal and formal approaches.
83
        tool, but on combining various engineering, semi-formal and formal approaches.
84
        While the lesser formal approaches give lesser guarantees, they can complement
84
        While the lesser formal approaches give lesser guarantees, they can complement
85
        the formal approaches on their boundaries and increase the coverage of the set of
85
        the formal approaches on their boundaries and increase the coverage of the set of
86
        all hypothetical interesting properties of the system.
86
        all hypothetical interesting properties of the system.
87
       
87
       
88
        We also demonstrate work-in-progress case study of an general-purpose research OS
88
        We also demonstrate work-in-progress case study of an general-purpose research OS
89
        that was not created specifically with formal verification in mind from the very
89
        that was not created specifically with formal verification in mind from the very
90
        beginning, but that was designed according to state-of-the-art OS principles.
90
        beginning, but that was designed according to state-of-the-art OS principles.
91
       
91
       
92
        \medskip
92
        \medskip
93
       
93
       
94
        \noindent\textbf{Structure of the Paper.} In Section \ref{design} we introduce
94
        \noindent\textbf{Structure of the Paper.} In Section \ref{design} we introduce
95
        the design choices and our case study in more detail. In Section \ref{analysis} we
95
        the design choices and our case study in more detail. In Section \ref{analysis} we
96
        discuss our approach of the combination of methods and tools. In Section \ref{evaluation}
96
        discuss our approach of the combination of methods and tools. In Section \ref{evaluation}
97
        we present a preliminary evaluation of our efforts and propose the imminent next steps
97
        we present a preliminary evaluation of our efforts and propose the imminent next steps
98
        to take. Finally, in Section \ref{conclusion} we present the conclusion of the paper.
98
        to take. Finally, in Section \ref{conclusion} we present the conclusion of the paper.
99
   
99
   
100
    \section{Operating Systems Design}
100
    \section{Operating Systems Design}
101
        \label{design}
101
        \label{design}
102
        Two very common schemes of OS design are \emph{monolithic design} and \emph{microkernel design}.
102
        Two very common schemes of OS design are \emph{monolithic design} and \emph{microkernel design}.
103
        Without going into much detail of any specific implementation, we can define the monolithic design as
103
        Without going into much detail of any specific implementation, we can define the monolithic design as
104
        a preference to put numerous aspects of the core OS functionality into the kernel, while microkernel
104
        a preference to put numerous aspects of the core OS functionality into the kernel, while microkernel
105
        design is a preference to keep the kernel small, with just a minimal set of features.
105
        design is a preference to keep the kernel small, with just a minimal set of features.
106
       
106
       
107
        The features which are missing from the kernel in the microkernel design are implemented
107
        The features which are missing from the kernel in the microkernel design are implemented
108
        in user space, usually by means of libraries, servers (e.g. processes/tasks) and/or software components.
108
        in user space, usually by means of libraries, servers (e.g. processes/tasks) and/or software components.
109
       
109
       
110
        \subsection{HelenOS}
110
        \subsection{HelenOS}
111
            \label{helenos}
111
            \label{helenos}
112
            \emph{HelenOS} is a general-purpose research OS which is being developed at Charles
112
            \emph{HelenOS} is a general-purpose research OS which is being developed at Charles
113
            University in Prague. The source code is available under the BSD open source license and can be
113
            University in Prague. The source code is available under the BSD open source license and can be
114
            freely downloaded from the project web site~\cite{helenos}. The authors of the code base are
114
            freely downloaded from the project web site~\cite{helenos}. The authors of the code base are
115
            both from the academia and from the open source community (several contributors are employed
115
            both from the academia and from the open source community (several contributors are employed
116
            as Solaris kernel developers and many are freelance IT professionals).
116
            as Solaris kernel developers and many are freelance IT professionals).
117
           
117
           
118
            HelenOS uses a preemptive priority-feedback scheduler, it supports SMP hardware and it is
118
            HelenOS uses a preemptive priority-feedback scheduler, it supports SMP hardware and it is
119
            designed to be highly portable. Currently it runs on 7 distinct hardware architectures, including the
119
            designed to be highly portable. Currently it runs on 7 distinct hardware architectures, including the
120
            most common IA-32, x86-64 (AMD64), IA-64, SPARC~v9 and PowerPC. It also runs on ARMv7 and MIPS,
120
            most common IA-32, x86-64 (AMD64), IA-64, SPARC~v9 and PowerPC. It also runs on ARMv7 and MIPS,
121
            but currently only in simulators and not on physical hardware.
121
            but currently only in simulators and not on physical hardware.
122
           
122
           
123
            Although HelenOS is still far from being an everyday replacement for Linux or Windows due to the lack
123
            Although HelenOS is still far from being an everyday replacement for Linux or Windows due to the lack
124
            of end-user applications (whose development is extremely time-consuming, but unfortunately of
124
            of end-user applications (whose development is extremely time-consuming, but unfortunately of
125
            no scientific value), the essential foundations such as file system support and TCP/IP networking
125
            no scientific value), the essential foundations such as file system support and TCP/IP networking
126
            are already in place.
126
            are already in place.
127
           
127
           
128
            HelenOS does not currently target embedded devices (although the ARMv7 port can be very easily
128
            HelenOS does not currently target embedded devices (although the ARMv7 port can be very easily
129
            modified to run on various embedded boards) and does not implement real-time features.
129
            modified to run on various embedded boards) and does not implement real-time features.
130
            Many development projects such as task snapshoting and migration, support for MMU-less
130
            Many development projects such as task snapshoting and migration, support for MMU-less
131
            platforms and performance monitoring are currently underway.
131
            platforms and performance monitoring are currently underway.
132
           
132
           
133
            \medskip
133
            \medskip
134
           
134
           
135
            HelenOS can be briefly described as microkernel multiserver design. However, the actual design
135
            HelenOS can be briefly described as microkernel multiserver design. However, the actual design
136
            guiding principles of the HelenOS are more elaborate:
136
            guiding principles of the HelenOS are more elaborate:
137
           
137
           
138
            \begin{description}
138
            \begin{description}
139
                \item[Microkernel principle] Every functionality of the OS that does not
139
                \item[Microkernel principle] Every functionality of the OS that does not
140
                      have to be necessary implemented in the kernel should be implemented in user space. This
140
                      have to be necessary implemented in the kernel should be implemented in user space. This
141
                      implies that subsystems such as the file system, device drivers (except those which are
141
                      implies that subsystems such as the file system, device drivers (except those which are
142
                      essential for the basic kernel functionality), naming and trading services, networking,
142
                      essential for the basic kernel functionality), naming and trading services, networking,
143
                      human interface and similar features should be implemented in user space.
143
                      human interface and similar features should be implemented in user space.
144
                \item[Full-fledged principle] Features which need to be placed in kernel should
144
                \item[Full-fledged principle] Features which need to be placed in kernel should
145
                      be implemented by full-fledged algorithms and data structures. In contrast
145
                      be implemented by full-fledged algorithms and data structures. In contrast
146
                      to several other microkernel OSes, where the authors have deliberately chosen
146
                      to several other microkernel OSes, where the authors have deliberately chosen
147
                      the most simplistic approach (static memory allocation, na\"{\i}ve algorithms, simple data
147
                      the most simplistic approach (static memory allocation, na\"{\i}ve algorithms, simple data
148
                      structures), HelenOS microkernel tries to use the most advanced and suitable means available.
148
                      structures), HelenOS microkernel tries to use the most advanced and suitable means available.
149
                      It contains features such as AVL and B+ trees, hashing tables, SLAB memory allocator, multiple
149
                      It contains features such as AVL and B+ trees, hashing tables, SLAB memory allocator, multiple
150
                      in-kernel synchronization primitives, fine-grained locking and so on.
150
                      in-kernel synchronization primitives, fine-grained locking and so on.
151
                \item[Multiserver principle] Subsystems in user space should be decomposed with the smallest
151
                \item[Multiserver principle] Subsystems in user space should be decomposed with the smallest
152
                      reasonable granularity. Each unit of decomposition should be encapsulated in a separate task.
152
                      reasonable granularity. Each unit of decomposition should be encapsulated in a separate task.
153
                      The tasks represent software components with isolated address spaces. From the design point of
153
                      The tasks represent software components with isolated address spaces. From the design point of
154
                      view the kernel can be seen as a separate component, too.
154
                      view the kernel can be seen as a separate component, too.
155
                \item[Split of mechanism and policy] The kernel should only provide low-level mechanisms,
155
                \item[Split of mechanism and policy] The kernel should only provide low-level mechanisms,
156
                      while the high-level policies which are built upon these mechanisms should be defined in
156
                      while the high-level policies which are built upon these mechanisms should be defined in
157
                      user space. This also implies that the policies should be easily replaceable while keeping
157
                      user space. This also implies that the policies should be easily replaceable while keeping
158
                      the low-level mechanisms intact.
158
                      the low-level mechanisms intact.
159
                \item[Encapsulation principle] The communication between the tasks/components should be
159
                \item[Encapsulation principle] The communication between the tasks/components should be
160
                      implemented only via a set of well-defined interfaces. In the user-to-user case the preferred
160
                      implemented only via a set of well-defined interfaces. In the user-to-user case the preferred
161
                      communication mechanism is HelenOS IPC, which provides reasonable mix of abstraction and
161
                      communication mechanism is HelenOS IPC, which provides reasonable mix of abstraction and
162
                      performance (RPC-like primitives combined with implicit memory sharing for large data
162
                      performance (RPC-like primitives combined with implicit memory sharing for large data
163
                      transfers). In case of synchronous user-to-kernel communication the usual syscalls are used.
163
                      transfers). In case of synchronous user-to-kernel communication the usual syscalls are used.
164
                      HelenOS IPC is used again for asynchronous kernel-to-user communication.
164
                      HelenOS IPC is used again for asynchronous kernel-to-user communication.
165
                \item[Portability principle] The design and implementation should always maintain a high
165
                \item[Portability principle] The design and implementation should always maintain a high
166
                      level of platform neutrality and portability. Platform-specific code in the kernel, core
166
                      level of platform neutrality and portability. Platform-specific code in the kernel, core
167
                      libraries and tasks implementing device drivers should be clearly separated from the
167
                      libraries and tasks implementing device drivers should be clearly separated from the
168
                      generic code (either by component decomposition or at least by internal compile-time APIs).
168
                      generic code (either by component decomposition or at least by internal compile-time APIs).
169
            \end{description}
169
            \end{description}
170
           
170
           
171
            In Section \ref{analysis} we argue that several of these design principles significantly improve
171
            In Section \ref{analysis} we argue that several of these design principles significantly improve
172
            the feasibility of formal verification of the entire system. On the other hand, other design principles
172
            the feasibility of formal verification of the entire system. On the other hand, other design principles
173
            induce new interesting challenges for formal description and verification.
173
            induce new interesting challenges for formal description and verification.
174
           
174
           
175
            The run-time architecture of HelenOS is inherently dynamic. The bindings between the components are
175
            The run-time architecture of HelenOS is inherently dynamic. The bindings between the components are
176
            not created at compile-time, but during bootstrap and can be modified to a large degree also during
176
            not created at compile-time, but during bootstrap and can be modified to a large degree also during
177
            normal operation mode of the system (via human interaction and external events).
177
            normal operation mode of the system (via human interaction and external events).
178
           
178
           
179
            The design of the ubiquitous HelenOS IPC mechanism and the associated threading model present
179
            The design of the ubiquitous HelenOS IPC mechanism and the associated threading model present
180
            the possibility to significantly reduce the size of the state space which needs to be explored
180
            the possibility to significantly reduce the size of the state space which needs to be explored
181
            by formal verification tools, but at the same time it is quite hard to express these
181
            by formal verification tools, but at the same time it is quite hard to express these
182
            constrains in many formalisms. The IPC is inherently asynchronous with constant message buffers
182
            constrains in many formalisms. The IPC is inherently asynchronous with constant message buffers
183
            in the kernel and dynamic buffers in user space. It uses the notions of uni-directional bindings,
183
            in the kernel and dynamic buffers in user space. It uses the notions of uni-directional bindings,
184
            mandatory pairing of requests and replies, binding establishment and abolishment handshakes,
184
            mandatory pairing of requests and replies, binding establishment and abolishment handshakes,
185
            memory sharing and fast message forwarding.
185
            memory sharing and fast message forwarding.
186
           
186
           
187
            For easier management of the asynchronous messages and the possibility to process multiple
187
            For easier management of the asynchronous messages and the possibility to process multiple
188
            messages from different peers without the usual kernel threading overhead, the core user space
188
            messages from different peers without the usual kernel threading overhead, the core user space
189
            library manages the execution flow by so-called \emph{fibrils}. A fibril is a user-space-managed
189
            library manages the execution flow by so-called \emph{fibrils}. A fibril is a user-space-managed
190
            thread with cooperative scheduling. A different fibril is scheduled every time the current fibril
190
            thread with cooperative scheduling. A different fibril is scheduled every time the current fibril
191
            is about to be blocked while sending out IPC requests (because the kernel buffers of the addressee
191
            is about to be blocked while sending out IPC requests (because the kernel buffers of the addressee
192
            are full) or while waiting on an IPC reply. This allows different execution flows within the
192
            are full) or while waiting on an IPC reply. This allows different execution flows within the
193
            same thread to process multiple requests and replies. To safeguard proper sequencing of IPC
193
            same thread to process multiple requests and replies. To safeguard proper sequencing of IPC
194
            messages and provide synchronization, special fibril-aware synchronization primitives
194
            messages and provide synchronization, special fibril-aware synchronization primitives
195
            (mutexes, condition variables, etc.) are available.
195
            (mutexes, condition variables, etc.) are available.
196
           
196
           
197
            Because of the cooperative nature of fibrils, they might cause severe performance under-utilization
197
            Because of the cooperative nature of fibrils, they might cause severe performance under-utilization
198
            in SMP configurations and system-wide bottlenecks. As multicore processors are more and more
198
            in SMP configurations and system-wide bottlenecks. As multicore processors are more and more
199
            common nowadays, that would be a substantial design flaw. Therefore the fibrils can be also freely
199
            common nowadays, that would be a substantial design flaw. Therefore the fibrils can be also freely
200
            (and to some degree even automatically) combined with the usual kernel threads, which provide
200
            (and to some degree even automatically) combined with the usual kernel threads, which provide
201
            preemptive scheduling and true parallelism on SMP machines. Needless to say, this combination is
201
            preemptive scheduling and true parallelism on SMP machines. Needless to say, this combination is
202
            also a grand challenge for the formal reasoning.
202
            also a grand challenge for the formal reasoning.
203
           
203
           
204
            \medskip
204
            \medskip
205
           
205
           
206
            Incidentally, the \emph{full-fledged principle} causes that the size of the HelenOS microkernel is
206
            Incidentally, the \emph{full-fledged principle} causes that the size of the HelenOS microkernel is
207
            considerably larger compared to other ``scrupulous'' microkernel implementations. The average
207
            considerably larger compared to other ``scrupulous'' microkernel implementations. The average
208
            footprint of the kernel on IA-32 ranges from 569~KiB when all logging messages, asserts, symbol
208
            footprint of the kernel on IA-32 ranges from 569~KiB when all logging messages, asserts, symbol
209
            resolution and the debugging kernel console are compiled in, down to 198~KiB for a non-debugging
209
            resolution and the debugging kernel console are compiled in, down to 198~KiB for a non-debugging
210
            production build. But we do not believe that the raw size of the microkernel is a relevant quality
210
            production build. But we do not believe that the raw size of the microkernel is a relevant quality
211
            criterion per se, without taking the actual feature set into account.
211
            criterion per se, without taking the actual feature set into account.
212
           
212
           
213
            \medskip
213
            \medskip
214
           
214
           
215
            To sum up, the choice of HelenOS as our case study is based on the fact that it was not designed
215
            To sum up, the choice of HelenOS as our case study is based on the fact that it was not designed
216
            in advance with formal verification in mind (some of the design principles might be beneficial,
216
            in advance with formal verification in mind (some of the design principles might be beneficial,
217
            but others might be disadvantageous), but the design of HelenOS is also non-trivial and not obsolete.
217
            but others might be disadvantageous), but the design of HelenOS is also non-trivial and not obsolete.
218
           
218
           
219
        \subsection{The C Programming Language}
219
        \subsection{The C Programming Language}
220
            A large majority of OSes is coded in the C programming language (HelenOS is no exception
220
            A large majority of OSes is coded in the C programming language (HelenOS is no exception
221
            to this). The choice of C in the case of kernel is usually well-motivated, since the C language was designed
221
            to this). The choice of C in the case of kernel is usually well-motivated, since the C language was designed
222
            specifically for implementing system software~\cite{c}: It is reasonably low-level in the sense that it allows
222
            specifically for implementing system software~\cite{c}: It is reasonably low-level in the sense that it allows
223
            to access the memory and other hardware resources with similar effectiveness as from assembler;
223
            to access the memory and other hardware resources with similar effectiveness as from assembler;
224
            It also requires almost no run-time support and it exports many features of the von Neumann hardware
224
            It also requires almost no run-time support and it exports many features of the von Neumann hardware
225
            architecture to the programmer in a very straightforward, but still relatively portable way.
225
            architecture to the programmer in a very straightforward, but still relatively portable way.
226
           
226
           
227
            However, what is the biggest advantage of C in terms of run-time performance is also the biggest weakness
227
            However, what is the biggest advantage of C in terms of run-time performance is also the biggest weakness
228
            for formal reasoning. The permissive memory access model of C, the lack of any reference safety
228
            for formal reasoning. The permissive memory access model of C, the lack of any reference safety
229
            enforcement, the weak type system and generally little semantic information in the code -- all these
229
            enforcement, the weak type system and generally little semantic information in the code -- all these
230
            properties that do not allow to make many general assumptions about the code.
230
            properties that do not allow to make many general assumptions about the code.
231
           
231
           
232
            Programming languages which target controlled environments such as Java or C\(\sharp\) are
232
            Programming languages which target controlled environments such as Java or C\(\sharp\) are
233
            generally easier for formal reasoning because they provide a well-known set of primitives
233
            generally easier for formal reasoning because they provide a well-known set of primitives
234
            and language constructs for object ownership, threading and synchronization. Many non-imperative
234
            and language constructs for object ownership, threading and synchronization. Many non-imperative
235
            programming languages can be even considered to be a form of ``executable specification'' and thus
235
            programming languages can be even considered to be a form of ``executable specification'' and thus
236
            very suitable for formal reasoning. In C, almost everything is left to the programmer who
236
            very suitable for formal reasoning. In C, almost everything is left to the programmer who
237
            is free to set the rules.
237
            is free to set the rules.
238
           
238
           
239
            \medskip
239
            \medskip
240
           
240
           
241
            The reasons for frequent use of C in the user space of many established OSes (and HelenOS) is
241
            The reasons for frequent use of C in the user space of many established OSes (and HelenOS) is
242
            probably much more questionable. In the case of HelenOS, except for the core libraries and tasks
242
            probably much more questionable. In the case of HelenOS, except for the core libraries and tasks
243
            (such as device drivers), C might be easily replaced by any high-level and perhaps even
243
            (such as device drivers), C might be easily replaced by any high-level and perhaps even
244
            non-imperative programming language. The reasons for using C in this context are mostly historical.
244
            non-imperative programming language. The reasons for using C in this context are mostly historical.
245
           
245
           
246
            However, as we have stated in Section \ref{introduction}, the way general-purpose OSes
246
            However, as we have stated in Section \ref{introduction}, the way general-purpose OSes
247
            are implemented changes only slowly and therefore any propositions which require radical modification
247
            are implemented changes only slowly and therefore any propositions which require radical modification
248
            of the existing code base before committing to the formal verification are not realistic.
248
            of the existing code base before committing to the formal verification are not realistic.
249
       
249
       
250
    \section{Analysis}
250
    \section{Analysis}
251
        \label{analysis}
251
        \label{analysis}
252
       
252
       
253
        \begin{figure}[t]
253
        \begin{figure}[t]
254
            \begin{center}
254
            \begin{center}
255
                \resizebox*{125mm}{!}{\includegraphics{diag}}
255
                \resizebox*{125mm}{!}{\includegraphics{diag}}
256
                \caption{Overview of the formalisms and tools proposed.}
256
                \caption{Overview of the formalisms and tools proposed.}
257
                \label{fig:diag}
257
                \label{fig:diag}
258
            \end{center}
258
            \end{center}
259
        \end{figure}
259
        \end{figure}
260
       
260
       
261
        In this section, we analyze the properties we would like to check in a general-purpose
261
        In this section, we analyze the properties we would like to check in a general-purpose
262
        operating system. Each set of properties usually requires a specific verification method,
262
        OS. Each set of properties usually requires a specific verification method, tool or toolchain.
263
        tool or toolchain.
-
 
264
       
263
       
265
        Our approach will be mostly bottom-up, or, in other words, from the lower levels of abstraction
264
        Our approach will be mostly bottom-up, or, in other words, from the lower levels of abstraction
266
        to the higher levels of abstraction. If the verification fails on a lower level, it usually
265
        to the higher levels of abstraction. If the verification fails on a lower level, it usually
267
        does not make much sense to continue with the higher levels of abstraction until the issues
266
        does not make much sense to continue with the higher levels of abstraction until the issues
268
        are tackled. A structured overview of the formalisms, methods and tools can be seen on
267
        are tackled. A structured overview of the formalisms, methods and tools can be seen on
269
        Figure \ref{fig:diag}.
268
        Figure \ref{fig:diag}.
270
       
269
       
271
        \medskip
270
        \medskip
272
       
271
       
273
        Some of the proposed methods cannot be called ``formal methods'' in the rigorous understanding
272
        Some of the proposed methods cannot be called ``formal methods'' in the rigorous understanding
274
        of the term. However, even methods which are based on semi-formal reasoning and non-exhaustive
273
        of the term. However, even methods which are based on semi-formal reasoning and non-exhaustive
275
        testing provide some limited guarantees in their specific context. A valued property
274
        testing provide some limited guarantees in their specific context. A valued property
276
        of the formal methods is to preserve these limited guarantees even on the higher levels
275
        of the formal methods is to preserve these limited guarantees even on the higher levels
277
        of abstraction, thus complementing the formal guarantees where the formal methods do not provide
276
        of abstraction, thus complementing the formal guarantees where the formal methods do not provide
278
        any feasible verification so far. This increases the coverage of the set of all hypothetical
277
        any feasible verification so far. This increases the coverage of the set of all hypothetical
279
        interesting properties of the system (although it is probably impossible to formally define
278
        interesting properties of the system (although it is probably impossible to formally define
280
        this set).
279
        this set).
281
       
280
       
282
        \medskip
281
        \medskip
283
       
282
       
284
        Please note that the titles of the following sections do not follow any particular established
283
        Please note that the titles of the following sections do not follow any particular established
285
        taxonomy. We have simply chosen the names to be intuitivelly descriptive.
284
        taxonomy. We have simply chosen the names to be intuitively descriptive.
286
       
285
       
287
        \subsection{C Language Compiler and Continuous Integration Tool}
286
        \subsection{C Language Compiler and Continuous Integration Tool}
288
            \label{clang}
287
            \label{clang}
289
            The initial levels of abstraction do not go far from the C source code and common engineering
288
            The initial levels of abstraction do not go far from the C source code and common engineering
290
            approaches. First, we would certainly like to know whether our code base is compliant with the
289
            approaches. First, we would certainly like to know whether our code base is compliant with the
291
            programming language specification and passes only the basic semantic checks (proper number
290
            programming language specification and passes only the basic semantic checks (proper number
292
            and types of arguments passed to functions, etc.). It is perhaps not very surprising that
291
            and types of arguments passed to functions, etc.). It is perhaps not very surprising that
293
            these decisions can be made by any plain C compiler. However, with the current implementation
292
            these decisions can be made by any plain C compiler. However, with the current implementation
294
            of HelenOS even this is not quite trivial.
293
            of HelenOS even this is not quite trivial.
295
           
294
           
296
            Besides the requirement to support 7 hardware platforms, the system's compile-time configuration
295
            Besides the requirement to support 7 hardware platforms, the system's compile-time configuration
297
            can be also affected by approximately 65 configuration options, most of which are booleans,
296
            can be also affected by approximately 65 configuration options, most of which are booleans,
298
            the rest are enumerated types.
297
            the rest are enumerated types.
299
           
298
           
300
            These configuration options are bound by logical propositions in conjunctive or disjunctive
299
            These configuration options are bound by logical propositions in conjunctive or disjunctive
301
            normal forms and while some options are freely configurable, the value of others gets inferred
300
            normal forms and while some options are freely configurable, the value of others gets inferred
302
            by the build system of HelenOS. The overall number of distinct configurations in which
301
            by the build system of HelenOS. The overall number of distinct configurations in which
303
            HelenOS can be compiled is at least one order of magnitude larger than the plain number
302
            HelenOS can be compiled is at least one order of magnitude larger than the plain number
304
            of supported hardware platforms.
303
            of supported hardware platforms.
305
           
304
           
306
            Various configuration options affect conditional compilation and linking. The programmers
305
            Various configuration options affect conditional compilation and linking. The programmers
307
            are used to make sure that the source code compiles and links fine with respect to the
306
            are used to make sure that the source code compiles and links fine with respect to the
308
            most common and obvious configurations, but the unforeseen interaction of the less common
307
            most common and obvious configurations, but the unforeseen interaction of the less common
309
            configuration options might cause linking or even compilation errors.
308
            configuration options might cause linking or even compilation errors.
310
           
309
           
311
            \medskip
310
            \medskip
312
           
311
           
313
            A straightforward solution is to generate all distinct configurations, starting from the
312
            A straightforward solution is to generate all distinct configurations, starting from the
314
            open variables and inferring the others. This can be part of the continuous integration
313
            open variables and inferring the others. This can be part of the continuous integration
315
            process which would try to compile and link the sources in all distinct configurations.
314
            process which would try to compile and link the sources in all distinct configurations.
316
           
315
           
317
            If we want to be really pedantic, we should also make sure that we run all higher
316
            If we want to be really pedantic, we should also make sure that we run all higher
318
            level verification methods on all configurations generated by this step. That would certainly
317
            level verification methods on all configurations generated by this step. That would certainly
319
            require to multiply the time required by the verification methods at least by the number
318
            require to multiply the time required by the verification methods at least by the number
320
            of the distinct configurations. Constraining the set of configurations to just the most
319
            of the distinct configurations. Constraining the set of configurations to just the most
321
            representative ones is perhaps a reasonable compromise to make the verification realistic.
320
            representative ones is perhaps a reasonable compromise to make the verification realistic.
322
       
321
       
323
        \subsection{Regression and Unit Tests}
322
        \subsection{Regression and Unit Tests}
324
            Running regression and unit tests which are part of HelenOS code base in the continuous
323
            Running regression and unit tests which are part of HelenOS code base in the continuous
325
            integration process is fairly easy. The only complication lies in the technicalities:
324
            integration process is fairly easy. The only complication lies in the technicalities:
326
            We need to setup an automated network of physical machines and simulators which can run the
325
            We need to setup an automated network of physical machines and simulators which can run the
327
            appropriate compilation outputs for the specific platforms. We need to be able to reboot
326
            appropriate compilation outputs for the specific platforms. We need to be able to reboot
328
            them remotely and distribute the boot images to them. And last but not least, we need to be
327
            them remotely and distribute the boot images to them. And last but not least, we need to be
329
            able to gather the results from them.
328
            able to gather the results from them.
330
           
329
           
331
            Testing is always non-exhaustive, thus the guarantees provided by tests are strictly limited
330
            Testing is always non-exhaustive, thus the guarantees provided by tests are strictly limited
332
            to the use cases and contexts which are being explicitly tested. However, it is arguably
331
            to the use cases and contexts which are being explicitly tested. However, it is arguably
333
            easier to express many common use cases in the primary programming language than in some
332
            easier to express many common use cases in the primary programming language than in some
334
            different formalism. As we follow the bottom-up approach, filtering out the most obvious
333
            different formalism. As we follow the bottom-up approach, filtering out the most obvious
335
            bugs by testing can save us a lot of valuable time which would be otherwise waisted by
334
            bugs by testing can save us a lot of valuable time which would be otherwise waisted by
336
            a futile verification by more formal (and more time-consuming) methods.
335
            a futile verification by more formal (and more time-consuming) methods.
337
       
336
       
338
        \subsection{Instrumentation}
337
        \subsection{Instrumentation}
339
            Instrumentation tools for detecting memory leaks, performance bottlenecks and soft-deadlocks
338
            Instrumentation tools for detecting memory leaks, performance bottlenecks and soft-deadlocks
340
            are also not usually considered to be formal verification tools (since it is hard to define
339
            are also not usually considered to be formal verification tools (since it is hard to define
341
            exact formal properties which are being verified by the non-exhaustive nature of these tools).
340
            exact formal properties which are being verified by the non-exhaustive nature of these tools).
342
            They are also rarely utilized on regular basis as part of the continuous integration process.
341
            They are also rarely utilized on regular basis as part of the continuous integration process.
343
            But again, it might be helpful to just mention them in the big picture.
342
            But again, it might be helpful to just mention them in the big picture.
344
           
343
           
345
            If some regression or unit tests fail, they sometimes do not give sufficient information to
344
            If some regression or unit tests fail, they sometimes do not give sufficient information to
346
            tell immediately what is the root cause of the issue. In that case running the faulting tests
345
            tell immediately what is the root cause of the issue. In that case running the faulting tests
347
            on manually or automatically instrumented executable code might provide more data and point
346
            on manually or automatically instrumented executable code might provide more data and point
348
            more directly to the actual bug.
347
            more directly to the actual bug.
349
       
348
       
350
        \subsection{Verifying C Language Compiler}
349
        \subsection{Verifying C Language Compiler}
351
            C language compilers are traditionally also not considered to be formal verification tools.
350
            C language compilers are traditionally also not considered to be formal verification tools.
352
            Many people just say that C compilers are good at generating executable code, but do not
351
            Many people just say that C compilers are good at generating executable code, but do not
353
            care much about the semantics of the source code (on the other hand, formal verification
352
            care much about the semantics of the source code (on the other hand, formal verification
354
            tools usually do not generate any executable code at all). However, with recent development
353
            tools usually do not generate any executable code at all). However, with recent development
355
            in the compiler domain, the old paradigms are shifting.
354
            in the compiler domain, the old paradigms are shifting.
356
           
355
           
357
            As the optimization passes and general maturity of the compilers improve over time,
356
            As the optimization passes and general maturity of the compilers improve over time,
358
            the compilers try to extract and use more and more semantic information from the source code.
357
            the compilers try to extract and use more and more semantic information from the source code.
359
            The C language is quite poor on explicit semantic information, but the verifying compilers
358
            The C language is quite poor on explicit semantic information, but the verifying compilers
360
            try to rely on vendor-specific language extensions and on the fact that some semantic information
359
            try to rely on vendor-specific language extensions and on the fact that some semantic information
361
            can be added to the source code without changing the resulting executable code.
360
            can be added to the source code without changing the resulting executable code.
362
           
361
           
363
            The checks done by the verifying compilers cannot result in fatal errors in the usual cases (they
362
            The checks done by the verifying compilers cannot result in fatal errors in the usual cases (they
364
            are just warnings). Firstly, the compilers still need to successfully compile a well-formed C source
363
            are just warnings). Firstly, the compilers still need to successfully compile a well-formed C source
365
            code compliant to some older standard (e.g. C89) even when it is not up with the current quality
364
            code compliant to some older standard (e.g. C89) even when it is not up with the current quality
366
            expectations. Old legacy source code should still pass the compilation as it did decades ago.
365
            expectations. Old legacy source code should still pass the compilation as it did decades ago.
367
           
366
           
368
            Secondly, the checks run by the verifying compilers are usually not based on abstract interpretation.
367
            Secondly, the checks run by the verifying compilers are usually not based on abstract interpretation.
369
            They are mostly realized as abstract syntax tree transformations much in the line with the supporting
368
            They are mostly realized as abstract syntax tree transformations much in the line with the supporting
370
            routines of the compilation process (data and control flow graph analysis, dead code elimination,
369
            routines of the compilation process (data and control flow graph analysis, dead code elimination,
371
            register allocation, etc.) and the evaluation function is basically the matching of antipatterns
370
            register allocation, etc.) and the evaluation function is basically the matching of antipatterns
372
            of common programming bugs.
371
            of common programming bugs.
373
           
372
           
374
            The checks are usually conservative. The verifying compilers identify code constructs which are suspicious,
373
            The checks are usually conservative. The verifying compilers identify code constructs which are suspicious,
375
            which might arise out of programmer's bad intuition and so on, but even these code snippets cannot be
374
            which might arise out of programmer's bad intuition and so on, but even these code snippets cannot be
376
            tagged as definitive bugs (since the programmer can be simply in a position where he/she really wants to
375
            tagged as definitive bugs (since the programmer can be simply in a position where he/she really wants to
377
            do something very strange, but nevertheless legitimate). It is upon the programmer
376
            do something very strange, but nevertheless legitimate). It is upon the programmer
378
            to examine the root cause of the compiler warning, tell whether it is really a bug or just a false
377
            to examine the root cause of the compiler warning, tell whether it is really a bug or just a false
379
            positive and fix the issue by either amending some additional semantic information (e.g. adding an
378
            positive and fix the issue by either amending some additional semantic information (e.g. adding an
380
            explicit typecast or a vendor-specific language extension) or rewriting the code.
379
            explicit typecast or a vendor-specific language extension) or rewriting the code.
381
           
380
           
382
            Although this level of abstraction is coarse-grained and conservative, it can be called semi-formal,
381
            Although this level of abstraction is coarse-grained and conservative, it can be called semi-formal,
383
            since the properties which are being verified can be actually defined quite exactly and they
382
            since the properties which are being verified can be actually defined quite exactly and they
384
            are reasonably general. They do not deal with single traces of methods, runs and use
383
            are reasonably general. They do not deal with single traces of methods, runs and use
385
            cases like tests, but they deal with all possible contexts in which the code can run.
384
            cases like tests, but they deal with all possible contexts in which the code can run.
386
       
385
       
387
        \subsection{Static Analyzer}
386
        \subsection{Static Analyzer}
388
            Static analyzers try to go deeper than verifying compilers. Besides detecting common antipatterns of
387
            Static analyzers try to go deeper than verifying compilers. Besides detecting common antipatterns of
389
            bugs, they also use techniques such as abstract interpretation to check for more complex properties.
388
            bugs, they also use techniques such as abstract interpretation to check for more complex properties.
390
           
389
           
391
            Most commercial static analyzers come with a predefined set of properties which cannot be easily changed.
390
            Most commercial static analyzers come with a predefined set of properties which cannot be easily changed.
392
            They are coupled with the commonly used semantics of the environment and generate domain-specific models
391
            They are coupled with the commonly used semantics of the environment and generate domain-specific models
393
            of the software based not only on the syntax of the source code, but also based on the assumptions derived
392
            of the software based not only on the syntax of the source code, but also based on the assumptions derived
394
            from the memory access model, allocation and deallocation rules, tracking of references and tracking of
393
            from the memory access model, allocation and deallocation rules, tracking of references and tracking of
395
            concurrency locks.
394
            concurrency locks.
396
           
395
           
397
            The biggest advantage of static analyzers is that they can be easily included in the development or
396
            The biggest advantage of static analyzers is that they can be easily included in the development or
398
            continuous integration process as an additional automated step, very similar to the verifying compilers.
397
            continuous integration process as an additional automated step, very similar to the verifying compilers.
399
            No manual definition of code-specific properties is needed and false positives can be relatively easily
398
            No manual definition of code-specific properties is needed and false positives can be relatively easily
400
            eliminated by amending some explicit additional information to the source code within the boundaries
399
            eliminated by amending some explicit additional information to the source code within the boundaries
401
            of the programming language.
400
            of the programming language.
402
           
401
           
403
            The authors of static analyzers claim large quantities of bugs detected or prevented~\cite{billion},
402
            The authors of static analyzers claim large quantities of bugs detected or prevented~\cite{billion},
404
            but static analyzers are still relatively limited by the kind of bugs they are designed to detect.
403
            but static analyzers are still relatively limited by the kind of bugs they are designed to detect.
405
            They are usually good at pointing out common issues with security implications (specific types of
404
            They are usually good at pointing out common issues with security implications (specific types of
406
            buffer and stack overruns, usage of well-known functions in an unsafe way, clear cases of forgotten
405
            buffer and stack overruns, usage of well-known functions in an unsafe way, clear cases of forgotten
407
            deallocation of resources and release of locks, etc.). Unfortunately, many static analyzers
406
            deallocation of resources and release of locks, etc.). Unfortunately, many static analyzers
408
            only analyze a single-threaded control flow and are thus unable to detect concurrency issues
407
            only analyze a single-threaded control flow and are thus unable to detect concurrency issues
409
            such as deadlocks.
408
            such as deadlocks.
410
       
409
       
411
        \subsection{Static Verifier}
410
        \subsection{Static Verifier}
412
            There is one key difference between a static analyzer and a static verifier: Static verifiers
411
            There is one key difference between a static analyzer and a static verifier: Static verifiers
413
            allow the user to specify one's own properties, in terms of preconditions, postconditions and
412
            allow the user to specify one's own properties, in terms of preconditions, postconditions and
414
            invariants in the code. Many static verifiers also target true multithreaded usage patterns
413
            invariants in the code. Many static verifiers also target true multithreaded usage patterns
415
            and have the capability to check proper locking order, hand-over-hand locking and even liveliness.
414
            and have the capability to check proper locking order, hand-over-hand locking and even liveliness.
416
           
415
           
417
            In the context of an operating system kernel and core libraries two kinds of properties are
416
            In the context of an OS kernel and core libraries two kinds of properties are common:
418
            common:
-
 
419
           
417
           
420
            \begin{description}
418
            \begin{description}
421
                \item[Consistency constrains] These properties define the correct way how data is supposed
419
                \item[Consistency constrains] These properties define the correct way how data is supposed
422
                      to be manipulated by some related set of subroutines. Checking for these
420
                      to be manipulated by some related set of subroutines. Checking for these
423
                      properties ensures that data structures and internal states will not get corrupt due
421
                      properties ensures that data structures and internal states will not get corrupt due
424
                      to bugs in the functions and methods which are designed to manipulate them.
422
                      to bugs in the functions and methods which are designed to manipulate them.
425
                \item[Interface enforcements] These properties define the correct semantics by which
423
                \item[Interface enforcements] These properties define the correct semantics by which
426
                      a set of subroutines should be used by the rest of the code. Checking for these properties
424
                      a set of subroutines should be used by the rest of the code. Checking for these properties
427
                      ensures that some API is always used by the rest of the code in a specified way
425
                      ensures that some API is always used by the rest of the code in a specified way
428
                      and all reported exceptions are handled by the client code.
426
                      and all reported exceptions are handled by the client code.
429
            \end{description}
427
            \end{description}
430
       
428
       
431
        \subsection{Model Checker}
429
        \subsection{Model Checker}
432
            \label{modelcheck}
430
            \label{modelcheck}
433
            On the first sight it does not seem to be reasonable to consider general model checkers as
431
            On the first sight it does not seem to be reasonable to consider general model checkers as
434
            relevant independent tools for formal verification of an existing operating system. While many
432
            relevant independent tools for formal verification of an existing OS. While many different
435
            different tools use model checkers as their backends, verifying a complete model of the entire
433
            tools use model checkers as their backends, verifying a complete model of the entire
436
            system created by hand seems to be infeasible both in the sense of time required for the model
434
            system created by hand seems to be infeasible both in the sense of time required for the model
437
            creation and resources required by the checker to finish the exhaustive traversal of the model's
435
            creation and resources required by the checker to finish the exhaustive traversal of the model's
438
            address space.
436
            address space.
439
           
437
           
440
            Nevertheless, model checkers on their own can still serve a good job verifying abstract
438
            Nevertheless, model checkers on their own can still serve a good job verifying abstract
441
            properties of key algorithms without dealing with the technical details of the implementation.
439
            properties of key algorithms without dealing with the technical details of the implementation.
442
            Various properties of synchronization algorithms, data structures and communication protocols
440
            Various properties of synchronization algorithms, data structures and communication protocols
443
            can be verified in the most generic conditions by model checkers, answering the
441
            can be verified in the most generic conditions by model checkers, answering the
444
            question whether they are designed properly in theory.
442
            question whether they are designed properly in theory.
445
           
443
           
446
            If the implementation of these algorithms and protocols do not behave correctly, we can be sure
444
            If the implementation of these algorithms and protocols do not behave correctly, we can be sure
447
            that the root cause is in the non-compliance between the design and implementation and not a
445
            that the root cause is in the non-compliance between the design and implementation and not a
448
            fundamental flaw of the design itself.
446
            fundamental flaw of the design itself.
449
       
447
       
450
        \subsection{Architecture and Behavior Checker}
448
        \subsection{Architecture and Behavior Checker}
451
            All previously mentioned verification methods were targeting internal properties of the OS
449
            All previously mentioned verification methods were targeting internal properties of the OS
452
            components. If we are moving to a higher-level abstraction in order to specify correct
450
            components. If we are moving to a higher-level abstraction in order to specify correct
453
            interaction of the encapsulated components in terms of interface compatibility and communication,
451
            interaction of the encapsulated components in terms of interface compatibility and communication,
454
            we can utilize \emph{Behavior Protocols}~\cite{bp} or some other formalism describing correct
452
            we can utilize \emph{Behavior Protocols}~\cite{bp} or some other formalism describing correct
455
            interaction between software components.
453
            interaction between software components.
456
           
454
           
457
            To gain the knowledge about the architecture of the whole OS in terms of software
455
            To gain the knowledge about the architecture of the whole OS in terms of software
458
            component composition and bindings, we can use \emph{Architecture Description Language}~\cite{adl}
456
            component composition and bindings, we can use \emph{Architecture Description Language}~\cite{adl}
459
            as the specification of the architecture of the system. This language has the possibility to capture
457
            as the specification of the architecture of the system. This language has the possibility to capture
460
            interface types (with method signatures), primitive components (in terms of provided and required
458
            interface types (with method signatures), primitive components (in terms of provided and required
461
            interfaces), composite components (an architectural compositions of primitive components) and the
459
            interfaces), composite components (an architectural compositions of primitive components) and the
462
            bindings between the respective interfaces of the components.
460
            bindings between the respective interfaces of the components.
463
           
461
           
464
            It is extremely important to define the right role of the behavior and architecture description.
462
            It is extremely important to define the right role of the behavior and architecture description.
465
            A flawed approach would be to reverse-engineer this description from the source code (either manually
463
            A flawed approach would be to reverse-engineer this description from the source code (either manually
466
            or via some sophisticated tool) and then verify the compliance between the desciption and
464
            or via some sophisticated tool) and then verify the compliance between the description and
467
            the implementation. However, different directions can give more interesting results:
465
            the implementation. However, different directions can give more interesting results:
468
           
466
           
469
            \begin{description}
467
            \begin{description}
470
                \item[Description as specification] Behavior and architecture description created independently
468
                \item[Description as specification] Behavior and architecture description created independently
471
                      on the source code serves the role of specification. This has the following primary
469
                      on the source code serves the role of specification. This has the following primary
472
                      goals of formal verification:
470
                      goals of formal verification:
473
                      \begin{description}
471
                      \begin{description}
474
                        \item[Horizontal compliance] Also called \emph{compatibility}. The goal is to check
472
                        \item[Horizontal compliance] Also called \emph{compatibility}. The goal is to check
475
                             whether the specifications of components that are bound together are semantically
473
                             whether the specifications of components that are bound together are semantically
476
                             compatible. All required interfaces need to be bound to provided interfaces and
474
                             compatible. All required interfaces need to be bound to provided interfaces and
477
                             the communication between the components cannot lead to \emph{no activity} (a deadlock),
475
                             the communication between the components cannot lead to \emph{no activity} (a deadlock),
478
                             \emph{bad activity} (a livelock) or other communication and synchronization errors.
476
                             \emph{bad activity} (a livelock) or other communication and synchronization errors.
479
                        \item[Vertical compliance] Also called \emph{substituability}. The goal is to check whether
477
                        \item[Vertical compliance] Also called \emph{substituability}. The goal is to check whether
480
                             it is possible to replace a set of primitive components that are nested inside a composite
478
                             it is possible to replace a set of primitive components that are nested inside a composite
481
                             component by the composite component itself. In other words, this compliance can answer the
479
                             component by the composite component itself. In other words, this compliance can answer the
482
                             questions whether the architecture description of the system is sound with respect to the hierarchical
480
                             questions whether the architecture description of the system is sound with respect to the hierarchical
483
                             composition of the components.
481
                             composition of the components.
484
                        \item[Compliance between the specification and the implementation] Using various means
482
                        \item[Compliance between the specification and the implementation] Using various means
485
                             for generating artificial environments for an isolated component the checker is able to
483
                             for generating artificial environments for an isolated component the checker is able to
486
                             partially answer the question whether the implementation of the component is an instantiation
484
                             partially answer the question whether the implementation of the component is an instantiation
487
                             of the component specification.
485
                             of the component specification.
488
                      \end{description}
486
                      \end{description}
489
                \item[Description as abstraction] Generating the behavior and architecture description from the
487
                \item[Description as abstraction] Generating the behavior and architecture description from the
490
                      source code by means of abstract interpretation can serve the purpose of verifying various
488
                      source code by means of abstract interpretation can serve the purpose of verifying various
491
                      properties of the implementation such as invariants, preconditions and postconditions.
489
                      properties of the implementation such as invariants, preconditions and postconditions.
492
                      This is similar to static verification, but on the level of component interfaces.
490
                      This is similar to static verification, but on the level of component interfaces.
493
            \end{description}
491
            \end{description}
494
           
492
           
495
            Unfortunately, most of the behavior and architecture formalisms are static, which is not quite suitable
493
            Unfortunately, most of the behavior and architecture formalisms are static, which is not quite suitable
496
            for the dynamic of most OSes. This limitation can be circumvented by considering a relevant
494
            for the dynamic of most OSes. This limitation can be circumvented by considering a relevant
497
            snapshot of the dynamic run-time architecture. This snapshot fixed in time is equivalent to
495
            snapshot of the dynamic run-time architecture. This snapshot fixed in time is equivalent to
498
            a statically defined architecture.
496
            a statically defined architecture.
499
           
497
           
500
            \medskip
498
            \medskip
501
           
499
           
502
            The key features of software systems with respect to behavior and architecture checkers are the granularity
500
            The key features of software systems with respect to behavior and architecture checkers are the granularity
503
            of the individual primitive components, the level of isolation and complexity of the communication mechanism
501
            of the individual primitive components, the level of isolation and complexity of the communication mechanism
504
            between them. Large monolithic OSes created in sematic-poor C present a severe challenge because the
502
            between them. Large monolithic OSes created in semantic-poor C present a severe challenge because the
505
            isolation of the individual components is vague and the communication between them is basically unlimited
503
            isolation of the individual components is vague and the communication between them is basically unlimited
506
            (function calls, shared resources, etc.).
504
            (function calls, shared resources, etc.).
507
           
505
           
508
            OSes with explicit component architecture and fine-grained components (such as microkernel multiserver
506
            OSes with explicit component architecture and fine-grained components (such as microkernel multiserver
509
            systems) make the feasibility of the verification much easier, since the degrees of freedom (and thus
507
            systems) make the feasibility of the verification much easier, since the degrees of freedom (and thus
510
            the state space) is limited.
508
            the state space) is limited.
511
           
509
           
512
            Horizontal and vertical compliance checking can be done exhaustively. This is a fundamental property
510
            Horizontal and vertical compliance checking can be done exhaustively. This is a fundamental property
513
            which allows the reasoning about the dependability of the entire component-based OS.
511
            which allows the reasoning about the dependability of the entire component-based OS.
514
            Assuming that the lower-level verification methods (described in Sections \ref{clang} to \ref{modelcheck})
512
            Assuming that the lower-level verification methods (described in Sections \ref{clang} to \ref{modelcheck})
515
            prove some specific properties of the primitive components, we can be sure that the composition of
513
            prove some specific properties of the primitive components, we can be sure that the composition of
516
            the primitive components into composite components and ultimately into the whole OS
514
            the primitive components into composite components and ultimately into the whole OS
517
            does not break these properties.
515
            does not break these properties.
518
           
516
           
519
            The feasibility of many lower-level verification methods from Sections \ref{clang} to \ref{modelcheck}
517
            The feasibility of many lower-level verification methods from Sections \ref{clang} to \ref{modelcheck}
520
            depends largely on the size and complexity of the code under verification. If the entire OS
518
            depends largely on the size and complexity of the code under verification. If the entire OS
521
            is decomposed into primitive components with a fine granularity, it is more likely that the
519
            is decomposed into primitive components with a fine granularity, it is more likely that the
522
            individual primitive components can be verified against a large number of properties. Thanks to the
520
            individual primitive components can be verified against a large number of properties. Thanks to the
523
            recursive component composition we can then be sure that these properties also hold for the entire system.
521
            recursive component composition we can then be sure that these properties also hold for the entire system.
524
           
522
           
525
            \medskip
523
            \medskip
526
           
524
           
527
            The compliance between the behavior specification and the actual behavior of the implementation is, unfortunately,
525
            The compliance between the behavior specification and the actual behavior of the implementation is, unfortunately,
528
            the missing link in the chain. This compliance cannot be easily verified in an exhaustive manner. If there is
526
            the missing link in the chain. This compliance cannot be easily verified in an exhaustive manner. If there is
529
            a discrepancy between the specified and the actual behavior of the components, we cannot conclude anything about
527
            a discrepancy between the specified and the actual behavior of the components, we cannot conclude anything about
530
            the properties holding in the entire system.
528
            the properties holding in the entire system.
531
           
529
           
532
            However, there is one way how to improve the situation: \emph{code generation}. If we generate implementation
530
            However, there is one way how to improve the situation: \emph{code generation}. If we generate implementation
533
            from the specification, the compliance between them is axiomatic. If we are able to generate enough
531
            from the specification, the compliance between them is axiomatic. If we are able to generate enough
534
            code from the specification to run into the hand-written ``business code'' where we check for
532
            code from the specification to run into the hand-written ``business code'' where we check for
535
            the lower-level properties, the conclusions about the component composition are going to hold.
533
            the lower-level properties, the conclusions about the component composition are going to hold.
536
       
534
       
537
        \subsection{Behavior Description Generator}
535
        \subsection{Behavior Description Generator}
538
            To conclude our path towards higher abstractions we can utilize tools that can
536
            To conclude our path towards higher abstractions we can utilize tools that can
539
            generate the behavior descriptions from \emph{textual use cases} written in a domain-constrained English.
537
            generate the behavior descriptions from \emph{textual use cases} written in a domain-constrained English.
540
            These generated artifacts can be then compared (e.g. via vertical compliance checking) with the formal
538
            These generated artifacts can be then compared (e.g. via vertical compliance checking) with the formal
541
            specification. Although the comparison might not provide clean-cut results, it can still be
539
            specification. Although the comparison might not provide clean-cut results, it can still be
542
            helpful to confront the more-or-less informal user expectations on the system with the exact formal description.
540
            helpful to confront the more-or-less informal user expectations on the system with the exact formal description.
543
       
541
       
544
        \subsection{Summary}
542
        \subsection{Summary}
545
            \label{missing}
543
            \label{missing}
546
            So far, we have proposed a compact combination of engineering, semi-formal and formal methods which
544
            So far, we have proposed a compact combination of engineering, semi-formal and formal methods which
547
            start at the level of C programming language, offer the possibility to check for the presence of various
545
            start at the level of C programming language, offer the possibility to check for the presence of various
548
            common antipatterns, check for generic algorithm-related properties, consistency constrains, interface
546
            common antipatterns, check for generic algorithm-related properties, consistency constrains, interface
549
            enforcements and conclude with a framework to make these properties hold even in the case of a large
547
            enforcements and conclude with a framework to make these properties hold even in the case of a large
550
            OS composed from many components of compliant behavior.
548
            OS composed from many components of compliant behavior.
551
           
549
           
552
            We do not claim that there are no missing pieces in the big picture or that the semi-formal verifications
550
            We do not claim that there are no missing pieces in the big picture or that the semi-formal verifications
553
            might provide more guarantees in this setup. However, state-of-the-art OS design guidelines can push
551
            might provide more guarantees in this setup. However, state-of-the-art OS design guidelines can push
554
            further the boundaries of practical feasibility of the presented methods. The limited guarantees
552
            further the boundaries of practical feasibility of the presented methods. The limited guarantees
555
            of the low-level methods hold even in the composition and the high-level formal methods do not have
553
            of the low-level methods hold even in the composition and the high-level formal methods do not have
556
            to deal with unlimited degrees of freedom of the primitive component implementation.
554
            to deal with unlimited degrees of freedom of the primitive component implementation.
557
           
555
           
558
            \medskip
556
            \medskip
559
           
557
           
560
            We have spoken only about the functional properties. In general, we cannot apply the same formalisms
558
            We have spoken only about the functional properties. In general, we cannot apply the same formalisms
561
            and methods on extra-functional properties (e.g. timing properties, performance properties, etc.).
559
            and methods on extra-functional properties (e.g. timing properties, performance properties, etc.).
562
            And although it probably does make a good sense to reason about component composition for the extra-functional
560
            And although it probably does make a good sense to reason about component composition for the extra-functional
563
            properties, the exact relation might be different compared to the functional properties.
561
            properties, the exact relation might be different compared to the functional properties.
564
           
562
           
565
            The extra-functional properties need to be tackled by our future work.
563
            The extra-functional properties need to be tackled by our future work.
566
       
564
       
567
    \section{Evaluation}
565
    \section{Evaluation}
568
        \label{evaluation}
566
        \label{evaluation}
569
        This section copies the structure of the previous Section \ref{analysis} and adds HelenOS-specific
567
        This section copies the structure of the previous Section \ref{analysis} and adds HelenOS-specific
570
        evaluation of the the proposed formalisms and tools. As this is still largely a work-in-progress,
568
        evaluation of the the proposed formalisms and tools. As this is still largely a work-in-progress,
571
        in many cases just the initial observations can be made.
569
        in many cases just the initial observations can be made.
572
       
570
       
573
        The choice of the specific methods, tools and formalisms in this initial phase is mostly motivated
571
        The choice of the specific methods, tools and formalisms in this initial phase is mostly motivated
574
        by their perceived commonality and author's claims about fitness for the given purpose. An important
572
        by their perceived commonality and author's claims about fitness for the given purpose. An important
575
        part of further evaluation would certainly be to compare multiple particular approaches, tools
573
        part of further evaluation would certainly be to compare multiple particular approaches, tools
576
        and formalisms to find the optimal combination.
574
        and formalisms to find the optimal combination.
577
       
575
       
578
        \subsection{Verifying C Language Compiler and Continuous Integration Tool}
576
        \subsection{Verifying C Language Compiler and Continuous Integration Tool}
579
            The primary C compiler used by HelenOS is \emph{GNU GCC 4.4.3} (all platforms)~\cite{gcc} and \emph{Clang 2.6.0}
577
            The primary C compiler used by HelenOS is \emph{GNU GCC 4.4.3} (all platforms)~\cite{gcc} and \emph{Clang 2.6.0}
580
            (IA-32)~\cite{clang}. We have taken some effort to support also \emph{ICC} and \emph{Sun Studio} C compilers,
578
            (IA-32)~\cite{clang}. We have taken some effort to support also \emph{ICC} and \emph{Sun Studio} C compilers,
581
            but the compatibility with these compilers in not guaranteed.
579
            but the compatibility with these compilers in not guaranteed.
582
           
580
           
583
            The whole code base is compiled with the \texttt{-Wall} and \texttt{-Wextra} compilation options. These options turn on
581
            The whole code base is compiled with the \texttt{-Wall} and \texttt{-Wextra} compilation options. These options turn on
584
            most of the verification checks of the compilers. The compilers trip on common bug antipatterns such
582
            most of the verification checks of the compilers. The compilers trip on common bug antipatterns such
585
            as implicit typecasting of pointer types, comparison of signed and unsigned integer values (danger
583
            as implicit typecasting of pointer types, comparison of signed and unsigned integer values (danger
586
            of unchecked overflows), the usage of uninitialized variables, the presence of unused local variables,
584
            of unchecked overflows), the usage of uninitialized variables, the presence of unused local variables,
587
            NULL-pointer dereferencing (determined by conservative local control flow analysis), functions
585
            NULL-pointer dereferencing (determined by conservative local control flow analysis), functions
588
            with non-void return typed that do not return any value and so on. We treat all compilation warnings
586
            with non-void return typed that do not return any value and so on. We treat all compilation warnings
589
            as fatal errors, thus the code base must pass without any warnings.
587
            as fatal errors, thus the code base must pass without any warnings.
590
           
588
           
591
            We also turn on several more specific and strict checks. These checks helped to discover several
589
            We also turn on several more specific and strict checks. These checks helped to discover several
592
            latent bugs in the source code:
590
            latent bugs in the source code:
593
           
591
           
594
            \begin{description}
592
            \begin{description}
595
                \item[\texttt{-Wfloat-equal}] Check for exact equality comparison between floating point values. The
593
                \item[\texttt{-Wfloat-equal}] Check for exact equality comparison between floating point values. The
596
                      usage of equal comparator on floats is usually misguided due to the inherent computational errors
594
                      usage of equal comparator on floats is usually misguided due to the inherent computational errors
597
                      of floats.
595
                      of floats.
598
                \item[\texttt{-Wcast-align}] Check for code which casts pointers to a type with a stricter alignment
596
                \item[\texttt{-Wcast-align}] Check for code which casts pointers to a type with a stricter alignment
599
                      requirement. On many RISC-based platforms this can cause run-time unaligned access exceptions.
597
                      requirement. On many RISC-based platforms this can cause run-time unaligned access exceptions.
600
                \item[\texttt{-Wconversion}] Check for code where the implicit type conversion (e.g. from float to integer,
598
                \item[\texttt{-Wconversion}] Check for code where the implicit type conversion (e.g. from float to integer,
601
                      between signed and unsigned integers or between integers with different number of bits) can
599
                      between signed and unsigned integers or between integers with different number of bits) can
602
                      cause the actual value to change.
600
                      cause the actual value to change.
603
            \end{description}
601
            \end{description}
604
           
602
           
605
            To enhance the semantic information in the source code, we use GCC-specific language extensions to annotate
603
            To enhance the semantic information in the source code, we use GCC-specific language extensions to annotate
606
            some particular kernel and core library routines:
604
            some particular kernel and core library routines:
607
           
605
           
608
            \begin{description}
606
            \begin{description}
609
                \item[\texttt{\_\_attribute\_\_((noreturn))}] Functions marked in this way never finish from the point of view
607
                \item[\texttt{\_\_attribute\_\_((noreturn))}] Functions marked in this way never finish from the point of view
610
                      of the current sequential execution flow. The most common case are the routines which restore previously saved
608
                      of the current sequential execution flow. The most common case are the routines which restore previously saved
611
                      execution context.
609
                      execution context.
612
                \item[\texttt{\_\_attribute\_\_((returns\_twice))}] Functions marked in this way may return multiple times from
610
                \item[\texttt{\_\_attribute\_\_((returns\_twice))}] Functions marked in this way may return multiple times from
613
                      the point of view of the current sequential execution flow. This is the case of routines which save the current
611
                      the point of view of the current sequential execution flow. This is the case of routines which save the current
614
                      execution context (first the function returns as usual, but the function can eventually ``return again''
612
                      execution context (first the function returns as usual, but the function can eventually ``return again''
615
                      when the context is being restored).
613
                      when the context is being restored).
616
            \end{description}
614
            \end{description}
617
           
615
           
618
            The use of these extensions has pointed out to several hard-to-debug bugs on the IA-64 platform.
616
            The use of these extensions has pointed out to several hard-to-debug bugs on the IA-64 platform.
619
           
617
           
620
            \medskip
618
            \medskip
621
           
619
           
622
            The automated continuous integration building system is currently work-in-progress. Thus, we do not
620
            The automated continuous integration building system is currently work-in-progress. Thus, we do not
623
            test all possible configurations of HelenOS with each changeset yet. Currently only
621
            test all possible configurations of HelenOS with each changeset yet. Currently only
624
            a representative set of 14 configurations (at least one for each supported platform) is tested by hand
622
            a representative set of 14 configurations (at least one for each supported platform) is tested by hand
625
            by the developers before committing any non-trivial changeset.
623
            by the developers before committing any non-trivial changeset.
626
           
624
           
627
            From occasional tests of other configurations by hand and the frequency of compilation, linkage and
625
            From occasional tests of other configurations by hand and the frequency of compilation, linkage and
628
            even run-time problems we conclude that the automated testing of all feasible configurations will
626
            even run-time problems we conclude that the automated testing of all feasible configurations will
629
            be very beneficial.
627
            be very beneficial.
630
           
628
           
631
        \subsection{Regression and Unit Tests}
629
        \subsection{Regression and Unit Tests}
632
            As already stated in the previous section, the continuous integration building system has not been finished
630
            As already stated in the previous section, the continuous integration building system has not been finished
633
            yet. Therefore regression and unit tests are executed occasionally by hand, which is time consuming
631
            yet. Therefore regression and unit tests are executed occasionally by hand, which is time consuming
634
            and prone to human omissions. An automated approach is definitively going to be very helpful.
632
            and prone to human omissions. An automated approach is definitively going to be very helpful.
635
       
633
       
636
        \subsection{Instrumentation}
634
        \subsection{Instrumentation}
637
            We are in the process of implementing our own code instrumentation framework which is motivated mainly
635
            We are in the process of implementing our own code instrumentation framework which is motivated mainly
638
            by the need to support MMU-less architectures in the future. But this framework might be also very helpful
636
            by the need to support MMU-less architectures in the future. But this framework might be also very helpful
639
            in detecting memory and generic resource leaks. We have not tried \emph{Valgrind}~\cite{valgrind} or any similar
637
            in detecting memory and generic resource leaks. We have not tried \emph{Valgrind}~\cite{valgrind} or any similar
640
            existing tool because of the estimated complexity to adopt it for the usage in HelenOS.
638
            existing tool because of the estimated complexity to adopt it for the usage in HelenOS.
641
           
639
           
642
            HelenOS was also scanned by \emph{Coverity}~\cite{coverity} in 2006 when no errors were detected. However, since
640
            HelenOS was also scanned by \emph{Coverity}~\cite{coverity} in 2006 when no errors were detected. However, since
643
            that time the code base has not been analyzed by Coverity.
641
            that time the code base has not been analyzed by Coverity.
644
       
642
       
645
        \subsection{Static Analyzer}
643
        \subsection{Static Analyzer}
646
            The integration of various static analyzers into the HelenOS continuous integration process is underway.
644
            The integration of various static analyzers into the HelenOS continuous integration process is underway.
647
            For the initial evaluation we have used \emph{Stanse}~\cite{stanse} and \emph{Clang Analyzer}~\cite{clanganalyzer}.
645
            For the initial evaluation we have used \emph{Stanse}~\cite{stanse} and \emph{Clang Analyzer}~\cite{clanganalyzer}.
648
            Both of them showed to be moderately helpful to point out instances of unreachable dead code, use of language
646
            Both of them showed to be moderately helpful to point out instances of unreachable dead code, use of language
649
            constructs which have ambiguous semantics in C and one case of possible NULL-pointer dereference.
647
            constructs which have ambiguous semantics in C and one case of possible NULL-pointer dereference.
650
           
648
           
651
            The open framework of Clang seems to be very promising for implementing domain-specific checks (and at
649
            The open framework of Clang seems to be very promising for implementing domain-specific checks (and at
652
            the same time it is also a very promising compiler framework). Our mid-term goal is to incorporate some of the features
650
            the same time it is also a very promising compiler framework). Our mid-term goal is to incorporate some of the features
653
            of Stanse and VCC (see Section \ref{staticverifier2}) into Clang Analyzer.
651
            of Stanse and VCC (see Section \ref{staticverifier2}) into Clang Analyzer.
654
       
652
       
655
        \subsection{Static Verifier}
653
        \subsection{Static Verifier}
656
            \label{staticverifier2}
654
            \label{staticverifier2}
657
            We have started to extend the source code of HelenOS kernel with annotations understood
655
            We have started to extend the source code of HelenOS kernel with annotations understood
658
            by \emph{Frama-C}~\cite{framac} and \emph{VCC}~\cite{vcc}. Initially we have targeted simple kernel data structures
656
            by \emph{Frama-C}~\cite{framac} and \emph{VCC}~\cite{vcc}. Initially we have targeted simple kernel data structures
659
            (doubly-linked circular lists) and basic locking operations. Currently we are evaluating the initial experiences
657
            (doubly-linked circular lists) and basic locking operations. Currently we are evaluating the initial experiences
660
            and we are trying to identify the most suitable methodology, but we expect quite promising results.
658
            and we are trying to identify the most suitable methodology, but we expect quite promising results.
661
           
659
           
662
            As the VCC is based on the Microsoft C++ Compiler, which does not support many GCC extensions, we have been
660
            As the VCC is based on the Microsoft C++ Compiler, which does not support many GCC extensions, we have been
663
            faced with the requirement to preprocess the source code to be syntactically accepted by VCC. This turned out
661
            faced with the requirement to preprocess the source code to be syntactically accepted by VCC. This turned out
664
            to be feasible.
662
            to be feasible.
665
       
663
       
666
        \subsection{Model Checker}
664
        \subsection{Model Checker}
667
            We are in the process of creating models of kernel wait queues (basic HelenOS kernel synchronization
665
            We are in the process of creating models of kernel wait queues (basic HelenOS kernel synchronization
668
            primitive) and futexes (basic user space thread synchronization primitive) using \emph{Promela} and
666
            primitive) and futexes (basic user space thread synchronization primitive) using \emph{Promela} and
669
            verify several formal properties (deadlock freedom, fairness) in \emph{Spin}~\cite{spin}. As both the Promela language
667
            verify several formal properties (deadlock freedom, fairness) in \emph{Spin}~\cite{spin}. As both the Promela language
670
            and the Spin model checker are mature and commonly used tools for such purposes, we expect no major problems
668
            and the Spin model checker are mature and commonly used tools for such purposes, we expect no major problems
671
            with this approach. Because both synchronization primitives are relatively complex, utilizing a model checker
669
            with this approach. Because both synchronization primitives are relatively complex, utilizing a model checker
672
            should provide a much more trustworthy proof of the required properties than ``paper and pencil''.
670
            should provide a much more trustworthy proof of the required properties than ``paper and pencil''.
673
           
671
           
674
            The initial choice of Spin is motivated by its suitability to model threads, their interaction and verify
672
            The initial choice of Spin is motivated by its suitability to model threads, their interaction and verify
675
            properties related to race conditions and deadlocks (which is the common sources of OS-related bugs). Other
673
            properties related to race conditions and deadlocks (which is the common sources of OS-related bugs). Other
676
            modeling formalisms might be more suitable for different goals.
674
            modeling formalisms might be more suitable for different goals.
677
       
675
       
678
        \subsection{Architecture and Behavior Checker}
676
        \subsection{Architecture and Behavior Checker}
679
            We have created an architecture description in ADL language derived from \emph{SOFA ADL}~\cite{adl} for the
677
            We have created an architecture description in ADL language derived from \emph{SOFA ADL}~\cite{adl} for the
680
            majority of the HelenOS components and created the Behavior Protocol specification of these components.
678
            majority of the HelenOS components and created the Behavior Protocol specification of these components.
681
            Both descriptions were created independently, not by reverse-engineering the existing source code.
679
            Both descriptions were created independently, not by reverse-engineering the existing source code.
682
            The architecture is a snapshot of the dynamic architecture just after a successful bootstrap of HelenOS.
680
            The architecture is a snapshot of the dynamic architecture just after a successful bootstrap of HelenOS.
683
           
681
           
684
            Both the architecture and behavior description is readily available as part of the source code repository
682
            Both the architecture and behavior description is readily available as part of the source code repository
685
            of HelenOS, including tools which can preprocess the Behavior Protocols according to the architecture description
683
            of HelenOS, including tools which can preprocess the Behavior Protocols according to the architecture description
686
            and create an output suitable for \emph{bp2promela} checker~\cite{bp}.
684
            and create an output suitable for \emph{bp2promela} checker~\cite{bp}.
687
           
685
           
688
            As the resulting complexity of the description is larger than any of the previously published case studies
686
            As the resulting complexity of the description is larger than any of the previously published case studies
689
            on Behavior Protocols (compare to~\cite{cocome}), our current work-in-progress is to optimize and fine-tune
687
            on Behavior Protocols (compare to~\cite{cocome}), our current work-in-progress is to optimize and fine-tune
690
            the bp2promela checker to process the input.
688
            the bp2promela checker to process the input.
691
           
689
           
692
            \medskip
690
            \medskip
693
           
691
           
694
            We have not started to generate code from the architecture description so far because of time constrains.
692
            We have not started to generate code from the architecture description so far because of time constrains.
695
            However, we believe that this is a very promising way to go and provide reasonable guarantees about
693
            However, we believe that this is a very promising way to go and provide reasonable guarantees about
696
            the compliance between the specification and the implementation.
694
            the compliance between the specification and the implementation.
697
       
695
       
698
        \subsection{Behavior Description Generator}
696
        \subsection{Behavior Description Generator}
699
            We have not tackled the issue of behavior description generation yet, although tools such as
697
            We have not tackled the issue of behavior description generation yet, although tools such as
700
            \emph{Procasor}~\cite{procasor} are readily available. We do not consider it our priority at this time.
698
            \emph{Procasor}~\cite{procasor} are readily available. We do not consider it our priority at this time.
701
   
699
   
702
    \section{Conclusion}
700
    \section{Conclusion}
703
        \label{conclusion}
701
        \label{conclusion}
704
        In this paper we propose a complex combination of various verification methods and tools
702
        In this paper we propose a complex combination of various verification methods and tools
705
        to achieve the verification of an entire general-purpose operating system. The proposed
703
        to achieve the verification of an entire general-purpose operating system. The proposed
706
        approach generally follows a bottom-up route, starting with low-level checks using state-of-the-art
704
        approach generally follows a bottom-up route, starting with low-level checks using state-of-the-art
707
        verifying C language compilers, following by static analyzers and static verifiers.
705
        verifying C language compilers, following by static analyzers and static verifiers.
708
        In specific contexts regression and unit tests, code instrumentation and model checkers
706
        In specific contexts regression and unit tests, code instrumentation and model checkers
709
        for the sake of verification of key algorithms are utilized.
707
        for the sake of verification of key algorithms are utilized.
710
       
708
       
711
        Thanks to the properties of state-of-the-art microkernel multiserver operating
709
        Thanks to the properties of state-of-the-art microkernel multiserver operating
712
        system design (e.g. software component encapsulation and composition, fine-grained isolated
710
        system design (e.g. software component encapsulation and composition, fine-grained isolated
713
        components), we demonstrate that it should be feasible to successfully verify larger and more
711
        components), we demonstrate that it should be feasible to successfully verify larger and more
714
        complex operating systems than in the case of monolithic designs. We use formal component
712
        complex operating systems than in the case of monolithic designs. We use formal component
715
        architecture and behavior description for the closure. The final goal -- a formally verified
713
        architecture and behavior description for the closure. The final goal -- a formally verified
716
        operating system -- is the emerging property of the combination of the various methods.
714
        operating system -- is the emerging property of the combination of the various methods.
717
       
715
       
718
        \medskip
716
        \medskip
719
       
717
       
720
        The contribution of this paper is the shift of focus from attempts to use a single
718
        The contribution of this paper is the shift of focus from attempts to use a single
721
        ``silver-bullet'' method for formal verification of an operating system to a combination
719
        ``silver-bullet'' method for formal verification of an operating system to a combination
722
        of multiple methods supported by a suitable architecture of the operating system.
720
        of multiple methods supported by a suitable architecture of the operating system.
723
        The main benefit is a much larger coverage of the set of all hypothetical properties.
721
        The main benefit is a much larger coverage of the set of all hypothetical properties.
724
       
722
       
725
        We also argue that the approach should be suitable for the mainstream
723
        We also argue that the approach should be suitable for the mainstream
726
        general-purpose operating systems in the near future, because they are gradually
724
        general-purpose operating systems in the near future, because they are gradually
727
        incorporating more microkernel-based features and fine-grained software components.
725
        incorporating more microkernel-based features and fine-grained software components.
728
       
726
       
729
        Although the evaluation of the proposed approach on HelenOS is still work-in-progress, the
727
        Although the evaluation of the proposed approach on HelenOS is still work-in-progress, the
730
        preliminary results and estimates are promising.
728
        preliminary results and estimates are promising.
731
       
729
       
732
        \medskip
730
        \medskip
733
       
731
       
734
        \noindent\textbf{Acknowledgments.} The author would like to express his gratitude to all contributors of
732
        \noindent\textbf{Acknowledgments.} The author would like to express his gratitude to all contributors of
735
        the HelenOS project. Without their vision and dedication the work on this paper would be almost impossible
733
        the HelenOS project. Without their vision and dedication the work on this paper would be almost impossible
736
       
734
       
737
        This work was partially supported by the Ministry of Education of the Czech Republic
735
        This work was partially supported by the Ministry of Education of the Czech Republic
738
        (grant MSM\-0021620838).
736
        (grant MSM\-0021620838).
739
   
737
   
740
    \begin{thebibliography}{99}
738
    \begin{thebibliography}{99}
741
        \bibitem{billion}Bessey A., Block K., Chelf B., Chou A., Fulton B., Hallem S., Henri-Gros C., Kamsky A., McPeak S., Engler D.: \emph{A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World}, Communications of the ACM, Vol. 53 No. 2, pp 66-75, 2010
739
        \bibitem{billion}Bessey A., Block K., Chelf B., Chou A., Fulton B., Hallem S., Henri-Gros C., Kamsky A., McPeak S., Engler D.: \emph{A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World}, Communications of the ACM, Vol. 53 No. 2, pp 66-75, 2010
742
        \bibitem{bp}Kofron J.: \emph{Checking Software Component Behavior Using Behavior Protocols and Spin}, in proceedings of Applied Computing 2007, Seoul, Korea, pp. 1513-1517, 2007
740
        \bibitem{bp}Kofron J.: \emph{Checking Software Component Behavior Using Behavior Protocols and Spin}, in proceedings of Applied Computing 2007, Seoul, Korea, pp. 1513-1517, 2007
743
        \bibitem{gcc}GCC, the GNU Compiler Collection, \texttt{http://gcc.gnu.org/}
741
        \bibitem{gcc}GCC, the GNU Compiler Collection, \texttt{http://gcc.gnu.org/}
744
        \bibitem{clang}Clang: a C language family frontend for LLVM, \texttt{http://clang.llvm.org/}
742
        \bibitem{clang}Clang: a C language family frontend for LLVM, \texttt{http://clang.llvm.org/}
745
        \bibitem{clanganalyzer}Clang Static Analyzer, \texttt{http://clang-analyzer.llvm.org/}
743
        \bibitem{clanganalyzer}Clang Static Analyzer, \texttt{http://clang-analyzer.llvm.org/}
746
        \bibitem{cocome}Bulej L., Bures T., Coupaye T., Decky M., Jezek P., Parizek P., Plasil F., Poch T., Rivierre N., Sery O., Tuma P.: \emph{CoCoME in Fractal}, chapter in The Common Component Modeling Example: Comparing Software Component Models, Springer-Verlag, LNCS 5153, 2008
744
        \bibitem{cocome}Bulej L., Bures T., Coupaye T., Decky M., Jezek P., Parizek P., Plasil F., Poch T., Rivierre N., Sery O., Tuma P.: \emph{CoCoME in Fractal}, chapter in The Common Component Modeling Example: Comparing Software Component Models, Springer-Verlag, LNCS 5153, 2008
747
        \bibitem{coverity}Coverity, \texttt{http://scan.coverity.com/}
745
        \bibitem{coverity}Coverity, \texttt{http://scan.coverity.com/}
748
        \bibitem{procasor}Mencl V.: \emph{Deriving Behavior Specifications from Textual Use Cases}, in Proceedings of Workshop on Intelligent Technologies for Software Engineering (WITSE04, Sep 21, 2004, part of ASE 2004), Linz, Austria, pp. 331 - 341, Oesterreichische Computer Gesellschaft, 2004
746
        \bibitem{procasor}Mencl V.: \emph{Deriving Behavior Specifications from Textual Use Cases}, in Proceedings of Workshop on Intelligent Technologies for Software Engineering (WITSE04, Sep 21, 2004, part of ASE 2004), Linz, Austria, pp. 331 - 341, Oesterreichische Computer Gesellschaft, 2004
749
        \bibitem{framac}Frama-C, \texttt{http://frama-c.cea.fr/}
747
        \bibitem{framac}Frama-C, \texttt{http://frama-c.cea.fr/}
750
        \bibitem{c}Lawlis P. K: \emph{Guidelines for Choosing a Computer Language: Support for the Visionary Organization}, Ada Information Clearinghouse, \texttt{http://archive.adaic.com/docs/reports/lawlis/k.htm}, 1998
748
        \bibitem{c}Lawlis P. K: \emph{Guidelines for Choosing a Computer Language: Support for the Visionary Organization}, Ada Information Clearinghouse, \texttt{http://archive.adaic.com/docs/reports/lawlis/k.htm}, 1998
751
        \bibitem{helenos}HelenOS Project, \texttt{http://www.helenos.org/}
749
        \bibitem{helenos}HelenOS Project, \texttt{http://www.helenos.org/}
752
        \bibitem{adl}Oplustil T.: \emph{Inheritance in Architecture Description Languages}, reviewed section of Proceedings of the Week of Doctoral Students 2003 conference (WDS 2003), Charles University, Prague, Czech Republic, 2003, pp. 124 - 131, 2003
750
        \bibitem{adl}Oplustil T.: \emph{Inheritance in Architecture Description Languages}, reviewed section of Proceedings of the Week of Doctoral Students 2003 conference (WDS 2003), Charles University, Prague, Czech Republic, 2003, pp. 124 - 131, 2003
753
        \bibitem{marketshare}Operating System Market Share, \texttt{http://marketshare.hitslink.com/report\-.aspx?qprid=8\&qptimeframe=M\&qpsp=132}, retrieved on 2010-02-28
751
        \bibitem{marketshare}Operating System Market Share, \texttt{http://marketshare.hitslink.com/report\-.aspx?qprid=8\&qptimeframe=M\&qpsp=132}, retrieved on 2010-02-28
754
        \bibitem{seL4}Klein G., Elphinstone K., Heiser G., Andronick J., Cock D., Derrin P., Elkaduwe D., Engelhardt K., Kolanski R., Norrish M., Sewell T., Tuch H., Winwood S.: \emph{seL4: Formal verification of an OS kernel}, Proceedings of the 22nd ACM Symposium on Operating Systems Principles, Big Sky, MT, USA, 2009
752
        \bibitem{seL4}Klein G., Elphinstone K., Heiser G., Andronick J., Cock D., Derrin P., Elkaduwe D., Engelhardt K., Kolanski R., Norrish M., Sewell T., Tuch H., Winwood S.: \emph{seL4: Formal verification of an OS kernel}, Proceedings of the 22nd ACM Symposium on Operating Systems Principles, Big Sky, MT, USA, 2009
755
        \bibitem{spin}Spin, \texttt{http://spinroot.com/}
753
        \bibitem{spin}Spin, \texttt{http://spinroot.com/}
756
        \bibitem{stanse}Stanse: Static Analysis Framework for C code, \texttt{http://stanse.fi.muni.cz/}
754
        \bibitem{stanse}Stanse: Static Analysis Framework for C code, \texttt{http://stanse.fi.muni.cz/}
757
        \bibitem{valgrind}Valgrind, \texttt{http://valgrind.org/}
755
        \bibitem{valgrind}Valgrind, \texttt{http://valgrind.org/}
758
        \bibitem{vcc}VCC, \texttt{http://vcc.codeplex.com/}
756
        \bibitem{vcc}VCC, \texttt{http://vcc.codeplex.com/}
759
    \end{thebibliography}
757
    \end{thebibliography}
760
\end{document}
758
\end{document}
761
 
759