Rev 180 | Rev 182 | Go to most recent revision | Show entire file | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed
Rev 180 | Rev 181 | ||
---|---|---|---|
Line 284... | Line 284... | ||
284 | Please note that the titles of the following sections do not follow any particular established |
284 | Please note that the titles of the following sections do not follow any particular established |
285 | taxonomy. We have simply chosen the names to be intuitivelly descriptive. |
285 | taxonomy. We have simply chosen the names to be intuitivelly descriptive. |
286 | |
286 | |
287 | \subsection{C Language Compiler and Continuous Integration Tool} |
287 | \subsection{C Language Compiler and Continuous Integration Tool} |
288 | \label{clang} |
288 | \label{clang} |
289 | The initial levels of abstraction do not go far from the C source code. First, we would certainly like to |
289 | The initial levels of abstraction do not go far from the C source code and common engineering |
290 | know whether our code base is compliant with the programming language specification and passes |
290 | approaches. First, we would certainly like to know whether our code base is compliant with the |
- | 291 | programming language specification and passes only the basic semantic checks (proper number |
|
291 | only the basic semantic checks (proper number and types of arguments passed to functions, etc.). |
292 | and types of arguments passed to functions, etc.). It is perhaps not very surprising that |
292 | It is perhaps not very surprising that these decisions can be make by any plain C compiler. |
293 | these decisions can be make by any plain C compiler. However, with the current implementation |
293 | However, with the current implementation of HelenOS even this is not quite trivial. |
294 | of HelenOS even this is not quite trivial. |
- | 295 | |
|
294 | Besides the requirement to support 7 hardware platforms, the system's compile-time configuration |
296 | Besides the requirement to support 7 hardware platforms, the system's compile-time configuration |
295 | can be also affected by approximately 65 configuration options, most of which are booleans, the rest |
297 | can be also affected by approximately 65 configuration options, most of which are booleans, |
296 | are enumerated types. |
298 | the rest are enumerated types. |
297 | |
299 | |
298 | These configuration options are bound by logical propositions in conjunctive or disjunctive |
300 | These configuration options are bound by logical propositions in conjunctive or disjunctive |
299 | normal forms and while some options are freely configurable, the value of others gets inferred |
301 | normal forms and while some options are freely configurable, the value of others gets inferred |
300 | by the build system of HelenOS. Thus, the overall number of distinct configurations in which |
302 | by the build system of HelenOS. The overall number of distinct configurations in which |
301 | HelenOS can be compiled is at least one order of magnitude larger than the plain number |
303 | HelenOS can be compiled is at least one order of magnitude larger than the plain number |
302 | of supported hardware platforms. |
304 | of supported hardware platforms. |
303 | |
305 | |
304 | Various configuration options affect conditional compilation and linking. The programmers |
306 | Various configuration options affect conditional compilation and linking. The programmers |
305 | are used to make sure that the source code compiles and links fine with respect to the |
307 | are used to make sure that the source code compiles and links fine with respect to the |
Line 310... | Line 312... | ||
310 | |
312 | |
311 | A straightforward solution is to generate all distinct configurations, starting from the |
313 | A straightforward solution is to generate all distinct configurations, starting from the |
312 | open variables and inferring the others. This can be part of the continuous integration |
314 | open variables and inferring the others. This can be part of the continuous integration |
313 | process which would try to compile and link the sources in all distinct configurations. |
315 | process which would try to compile and link the sources in all distinct configurations. |
314 | |
316 | |
315 | If we want to be really pedantic, we should also make sure that we run all the relevant higher |
317 | If we want to be really pedantic, we should also make sure that we run all higher |
316 | level verification methods on all configurations generated by this step. That would certainly |
318 | level verification methods on all configurations generated by this step. That would certainly |
317 | require to multiply the time required by the verification methods at least by the number |
319 | require to multiply the time required by the verification methods at least by the number |
318 | of the distinct configurations. Constraining the set of configurations to just the most |
320 | of the distinct configurations. Constraining the set of configurations to just the most |
319 | representative ones is perhaps a reasonable compromise. |
321 | representative ones is perhaps a reasonable compromise to make the verification realistic. |
320 | |
322 | |
321 | \subsection{Regression and Unit Tests} |
323 | \subsection{Regression and Unit Tests} |
322 | Although some people would argue whether testing is a formal verification method, we still include |
- | |
323 | it into the big picture. Running regression and unit tests which are part of HelenOS code base |
324 | Running regression and unit tests which are part of HelenOS code base in the continuous |
324 | in the continuous integration process is fairly easy. The only complication lies in the technicalities: |
325 | integration process is fairly easy. The only complication lies in the technicalities: |
325 | We need to setup an automated network of physical machines and simulators which can run the |
326 | We need to setup an automated network of physical machines and simulators which can run the |
326 | appropriate compilation outputs for the specific platforms. We need to be able to reboot |
327 | appropriate compilation outputs for the specific platforms. We need to be able to reboot |
327 | them remotely and distribute the boot images to them. And last but not least, we need to be |
328 | them remotely and distribute the boot images to them. And last but not least, we need to be |
328 | able to gather the results from them. |
329 | able to gather the results from them. |
- | 330 | |
|
- | 331 | Testing is always non-exhaustive, thus the guarantees provided by tests are strictly limited |
|
- | 332 | to the use cases and contexts which are being explicitly tested. However, it is arguably |
|
- | 333 | easier to express many common use cases in the primary programming language than in some |
|
- | 334 | different formalism. As we follow the bottom-up approach, filtering out the most obvious |
|
- | 335 | bugs by testing can save us a lot of valuable time which would be otherwise waisted by |
|
- | 336 | a futile verification by more formal (and more time-consuming) methods. |
|
329 | |
337 | |
330 | \subsection{Instrumentation} |
338 | \subsection{Instrumentation} |
331 | Instrumentation tools for detecting memory leaks, performance bottlenecks and soft-deadlocks |
339 | Instrumentation tools for detecting memory leaks, performance bottlenecks and soft-deadlocks |
332 | are also not usually considered to be formal verification tools. They are also rarely utilized |
340 | are also not usually considered to be formal verification tools (since it is hard to define |
- | 341 | exact formal properties which are being verified by the non-exhaustive nature of these tools). |
|
333 | on regular basis as part of the continuous integration process. But again, it might be helpful |
342 | They are also rarely utilized on regular basis as part of the continuous integration process. |
334 | to just mention them in the context of regression and unit tests. |
343 | But again, it might be helpful to just mention them in the big picture. |
335 | |
344 | |
336 | If some regression or unit tests fail, they sometimes do not give sufficient information to |
345 | If some regression or unit tests fail, they sometimes do not give sufficient information to |
337 | tell immediately what is the root cause of the issue. In that case running the faulting tests |
346 | tell immediately what is the root cause of the issue. In that case running the faulting tests |
338 | on manually or automatically instrumented executable code might provide more data and point |
347 | on manually or automatically instrumented executable code might provide more data and point |
339 | more directly to the actual bug. |
348 | more directly to the actual bug. |
340 | |
349 | |
341 | \subsection{Verifying C Language Compiler} |
350 | \subsection{Verifying C Language Compiler} |
342 | C language compilers are traditionally not considered to be formal verification tools. Many people |
351 | C language compilers are traditionally also not considered to be formal verification tools. |
343 | just say that C compilers are good at generating executable code, but do not care much about the semantics |
352 | Many people just say that C compilers are good at generating executable code, but do not |
- | 353 | care much about the semantics of the source code (on the other hand, formal verification |
|
344 | of the source code (on the other hand, formal verification tools usually do not generate any executable code |
354 | tools usually do not generate any executable code at all). However, with recent development |
345 | at all). However, with recent development in the compiler domain, the old paradigms are shifting. |
355 | in the compiler domain, the old paradigms are shifting. |
346 | |
356 | |
347 | As the optimization passes and general maturity of the compilers improve over time, |
357 | As the optimization passes and general maturity of the compilers improve over time, |
348 | the compilers try to extract and use more and more semantic information from the source code. |
358 | the compilers try to extract and use more and more semantic information from the source code. |
349 | The C language is quite poor on explicit semantic information, but the verifying compilers |
359 | The C language is quite poor on explicit semantic information, but the verifying compilers |
350 | try to rely on vendor-specific language extensions and on the fact that some semantic information |
360 | try to rely on vendor-specific language extensions and on the fact that some semantic information |
Line 353... | Line 363... | ||
353 | The checks done by the verifying compilers cannot result in fatal errors in the usual cases (they |
363 | The checks done by the verifying compilers cannot result in fatal errors in the usual cases (they |
354 | are just warnings). Firstly, the compilers still need to successfully compile a well-formed C source |
364 | are just warnings). Firstly, the compilers still need to successfully compile a well-formed C source |
355 | code compliant to some older standard (e.g. C89) even when it is not up with the current quality |
365 | code compliant to some older standard (e.g. C89) even when it is not up with the current quality |
356 | expectations. Old legacy source code should still pass the compilation as it did decades ago. |
366 | expectations. Old legacy source code should still pass the compilation as it did decades ago. |
357 | |
367 | |
358 | Secondly, the checks run by the verifying compilers are usually not based on abstract interpretation |
368 | Secondly, the checks run by the verifying compilers are usually not based on abstract interpretation. |
359 | or exhaustive traversal of a model state space. They are mostly realized as abstract syntax tree |
369 | They are mostly realized as abstract syntax tree transformations much in the line with the supporting |
360 | transformations much in the line with the supporting routines of the compilation process (data |
370 | routines of the compilation process (data and control flow graph analysis, dead code elimination, |
361 | and control flow graph analysis, dead code elimination, register allocation, etc.) and the evaluation |
371 | register allocation, etc.) and the evaluation function is basically the matching of antipatterns |
362 | function is basically the matching of antipatterns of common programming bugs. |
372 | of common programming bugs. |
363 | |
373 | |
364 | The checks are usually conservative. The verifying compilers identify code constructs which are suspicious, |
374 | The checks are usually conservative. The verifying compilers identify code constructs which are suspicious, |
365 | which might arise out of programmer's bad intuition and so on, but even these code snippets cannot be |
375 | which might arise out of programmer's bad intuition and so on, but even these code snippets cannot be |
366 | tagged as definitive bugs (since the programmer can be simply in a position where he/she really wants to |
376 | tagged as definitive bugs (since the programmer can be simply in a position where he/she really wants to |
367 | do something very strange, but nevertheless legitimate). It is upon the programmer |
377 | do something very strange, but nevertheless legitimate). It is upon the programmer |
368 | to examine the root cause of the compiler warning, tell whether it is really a bug or just a false |
378 | to examine the root cause of the compiler warning, tell whether it is really a bug or just a false |
369 | positive and fix the issue by either amending some additional semantic information (e.g. adding an |
379 | positive and fix the issue by either amending some additional semantic information (e.g. adding an |
370 | explicit typecast or a vendor-specific language extension) or rewriting the code. |
380 | explicit typecast or a vendor-specific language extension) or rewriting the code. |
- | 381 | |
|
- | 382 | Although this level of abstraction is coarse-grained and conservative, it can be called semi-formal, |
|
- | 383 | since the properties which are being verified can be actually defined quite exactly and they |
|
- | 384 | are reasonably general. They do not deal with single traces of methods, runs and use |
|
- | 385 | cases like tests, but they deal with all possible contexts in which the code can run. |
|
371 | |
386 | |
372 | \subsection{Static Analyzer} |
387 | \subsection{Static Analyzer} |
373 | Static analyzers try to go deeper than verifying compilers. Besides detecting common antipatterns of |
388 | Static analyzers try to go deeper than verifying compilers. Besides detecting common antipatterns of |
374 | bugs, they also use techniques such as abstract interpretation to check for more complex properties. |
389 | bugs, they also use techniques such as abstract interpretation to check for more complex properties. |
375 | |
390 | |
Line 379... | Line 394... | ||
379 | from the memory access model, allocation and deallocation rules, tracking of references and tracking of |
394 | from the memory access model, allocation and deallocation rules, tracking of references and tracking of |
380 | concurrency locks. |
395 | concurrency locks. |
381 | |
396 | |
382 | The biggest advantage of static analyzers is that they can be easily included in the development or |
397 | The biggest advantage of static analyzers is that they can be easily included in the development or |
383 | continuous integration process as an additional automated step, very similar to the verifying compilers. |
398 | continuous integration process as an additional automated step, very similar to the verifying compilers. |
384 | No definition of properties is needed and false positives can be relatively easily eliminated by amending |
399 | No manual definition of code-specific properties is needed and false positives can be relatively easily |
385 | some explicit additional information to the source code within the boundaries of the programming language. |
400 | eliminated by amending some explicit additional information to the source code within the boundaries |
- | 401 | of the programming language. |
|
386 | |
402 | |
387 | The authors of static analyzers claim large quantities of bugs detected or prevented~\cite{billion}, but static |
403 | The authors of static analyzers claim large quantities of bugs detected or prevented~\cite{billion}, |
388 | analyzers are still relatively limited by the kind of bugs they are designed to detect. They |
404 | but static analyzers are still relatively limited by the kind of bugs they are designed to detect. |
389 | are usually good at pointing out common issues with security implications (specific types of buffer |
405 | They are usually good at pointing out common issues with security implications (specific types of |
390 | and stack overruns, usage of well-known functions in an unsafe way, clear cases of forgotten |
406 | buffer and stack overruns, usage of well-known functions in an unsafe way, clear cases of forgotten |
391 | deallocation of resources and release of locks, etc.). Unfortunately, many static analyzers |
407 | deallocation of resources and release of locks, etc.). Unfortunately, many static analyzers |
392 | only analyze a single-threaded control flow and are thus unable to detect concurrency issues |
408 | only analyze a single-threaded control flow and are thus unable to detect concurrency issues |
393 | such as deadlocks. |
409 | such as deadlocks. |
394 | |
410 | |
395 | \subsection{Static Verifier} |
411 | \subsection{Static Verifier} |
Line 399... | Line 415... | ||
399 | and have the capability to check proper locking order, hand-over-hand locking and even liveliness. |
415 | and have the capability to check proper locking order, hand-over-hand locking and even liveliness. |
400 | |
416 | |
401 | In the context of an operating system kernel and core libraries two kinds of properties are |
417 | In the context of an operating system kernel and core libraries two kinds of properties are |
402 | common: |
418 | common: |
403 | |
419 | |
404 | \begin{itemize} |
420 | \begin{description} |
405 | \item \emph{Consistency constrains:} These properties define the correct way how data is supposed |
421 | \item[Consistency constrains] These properties define the correct way how data is supposed |
406 | to be manipulated by some related set of subroutines. Checking for these |
422 | to be manipulated by some related set of subroutines. Checking for these |
407 | properties ensures that data structures and internal states will not get corrupt due |
423 | properties ensures that data structures and internal states will not get corrupt due |
408 | to bugs in the functions and methods which are designed to manipulate them. |
424 | to bugs in the functions and methods which are designed to manipulate them. |
409 | \item \emph{Interface enforcement:} These properties define the correct semantics by which |
425 | \item[Interface enforcements] These properties define the correct semantics by which |
410 | a set of subroutines should be used by the rest of the code. Checking for these properties |
426 | a set of subroutines should be used by the rest of the code. Checking for these properties |
411 | ensures that some API is always used by the rest of the code in a specified way |
427 | ensures that some API is always used by the rest of the code in a specified way |
412 | and all possible error states are detected and reported. |
428 | and all reported exceptions are handled by the client code. |
413 | \end{itemize} |
429 | \end{description} |
414 | |
430 | |
415 | \subsection{Model Checker} |
431 | \subsection{Model Checker} |
416 | \label{modelcheck} |
432 | \label{modelcheck} |
417 | On the first sight it does not seem to be reasonable to consider general model checkers as |
433 | On the first sight it does not seem to be reasonable to consider general model checkers as |
418 | relevant independent tools for formal verification of an existing operating system. While many |
434 | relevant independent tools for formal verification of an existing operating system. While many |
Line 429... | Line 445... | ||
429 | |
445 | |
430 | If the implementation of these algorithms and protocols do not behave correctly, we can be sure |
446 | If the implementation of these algorithms and protocols do not behave correctly, we can be sure |
431 | that the root cause is in the non-compliance between the design and implementation and not a |
447 | that the root cause is in the non-compliance between the design and implementation and not a |
432 | fundamental flaw of the design itself. |
448 | fundamental flaw of the design itself. |
433 | |
449 | |
434 | \subsection{Behavior Checker} |
450 | \subsection{Architecture and Behavior Checker} |
435 | All previously mentioned verification methods were targeting internal properties of the operating system |
451 | All previously mentioned verification methods were targeting internal properties of the OS |
436 | components. If we are moving to a higher-level abstraction in order to specify correct interaction of the encapsulated |
452 | components. If we are moving to a higher-level abstraction in order to specify correct |
437 | components in terms of interface compatibility and communication, we can utilize \emph{Behavior Protocols}~\cite{bp}. |
453 | interaction of the encapsulated components in terms of interface compatibility and communication, |
- | 454 | we can utilize \emph{Behavior Protocols}~\cite{bp} or some other formalism describing correct |
|
- | 455 | interaction between software components. |
|
438 | |
456 | |
439 | To gain the knowledge about the architecture of the whole operating system in terms of software |
457 | To gain the knowledge about the architecture of the whole OS in terms of software |
440 | component composition and bindings, we can use \emph{Architecture Description Language}~\cite{adl}. |
458 | component composition and bindings, we can use \emph{Architecture Description Language}~\cite{adl} |
- | 459 | as the specification of the architecture of the system. This language has the possibility to capture |
|
- | 460 | interface types (with method signatures), primitive components (in terms of provided and required |
|
- | 461 | interfaces), composite components (an architectural compositions of primitive components) and the |
|
- | 462 | bindings between the respective interfaces of the components. |
|
- | 463 | |
|
441 | This language has the possibility to capture interface types (with method signatures), primitive |
464 | It is extremely important to define the right role of the behavior and architecture description. |
- | 465 | A flawed approach would be to reverse-engineer this description from the source code (either manually |
|
- | 466 | or via some sophisticated tool) and then verify the compliance between the desciption and |
|
- | 467 | the implementation. However, different directions can give more interesting results: |
|
- | 468 | |
|
- | 469 | \begin{description} |
|
- | 470 | \item[Description as specification] Behavior and architecture description created independently |
|
- | 471 | on the source code serves the role of specification. This has the following primary |
|
- | 472 | goals of formal verification: |
|
- | 473 | \begin{description} |
|
- | 474 | \item[Horizontal compliance] Also called \emph{compatibility}. The goal is to check |
|
- | 475 | whether the specifications of components that are bound together are semantically |
|
442 | components (in terms of provided and required interfaces), composite components (an architectural |
476 | compatible. All required interfaces need to be bound to provided interfaces and |
- | 477 | the communication between the components cannot lead to \emph{no activity} (a deadlock), |
|
- | 478 | \emph{bad activity} (a livelock) or other communication and synchronization errors. |
|
- | 479 | \item[Vertical compliance] Also called \emph{substituability}. The goal is to check whether |
|
- | 480 | it is possible to replace a set of primitive components that are nested inside a composite |
|
- | 481 | component by the composite component itself. In other words, this compliance can answer the |
|
- | 482 | questions whether the architecture description of the system is sound with respect to the hierarchical |
|
- | 483 | composition of the components. |
|
- | 484 | \item[Compliance between the specification and the implementation] Using various means |
|
- | 485 | for generating artificial environments for an isolated component the checker is able to |
|
- | 486 | partially answer the question whether the implementation of the component is an instantiation |
|
- | 487 | of the component specification. |
|
- | 488 | \end{description} |
|
- | 489 | \item[Description as abstraction] Generating the behavior and architecture description from the |
|
- | 490 | source code by means of abstract interpretation can serve the purpose of verifying various |
|
- | 491 | properties of the implementation such as invariants, preconditions and postconditions. |
|
443 | compositions of primitive components) and the bindings between the respective interfaces of the |
492 | This is similar to static verification, but on the level of component interfaces. |
444 | components. |
493 | \end{description} |
445 | |
494 | |
446 | Unfortunately, the description is usually static, which is not quite suitable for the dynamic |
495 | Unfortunately, most of the behavior and architecture formalisms are static, which is not quite suitable |
447 | nature of HelenOS and other operating systems. This limitation can be circumvented by considering a relevant |
496 | for the dynamic of most OSes. This limitation can be circumvented by considering a relevant |
448 | snapshot of the dynamic run-time architecture. This snapshot fixed in time is equivalent to |
497 | snapshot of the dynamic run-time architecture. This snapshot fixed in time is equivalent to |
449 | a statically defined architecture. |
498 | a statically defined architecture. |
450 | |
499 | |
451 | Behavior Protocol checkers can target three main goals: |
500 | \medskip |
452 | |
501 | |
453 | \begin{itemize} |
- | |
454 | \item \emph{Horizontal compliance:} Also called \emph{compatibility}. The goal is to check |
502 | The key features of software systems with respect to behavior and architecture checkers are the granularity |
455 | whether the specifications of components that are bound together are semantically |
503 | of the individual primitive components, the level of isolation and complexity of the communication mechanism |
456 | compatible. All required interfaces need to be bound to provided interfaces and |
504 | between them. Large monolithic OSes created in sematic-poor C present a severe challenge because the |
457 | the communication between the components cannot lead to \emph{no activity} (a deadlock) |
505 | isolation of the individual components is vague and the communication between them is basically unlimited |
458 | or a \emph{bad activity} (a livelock). |
506 | (function calls, shared resources, etc.). |
459 | \item \emph{Vertical compliance:} Also called \emph{substituability}. The goal is to check whether |
- | |
- | 507 | |
|
460 | it is possible to replace a set of primitive components that are nested inside a composite |
508 | OSes with explicit component architecture and fine-grained components (such as microkernel multiserver |
461 | component by the composite component itself. In other words, this compliance can answer the |
- | |
462 | questions whether the architecture description of the system is sound with respect to the hierarchical |
- | |
463 | composition of the components. |
- | |
464 | \item \emph{Compliance between the specification and the implementation:} Using various means |
509 | systems) make the feasibility of the verification much easier, since the degrees of freedom (and thus |
465 | for generating artificial environments for an isolated component the checker is able to |
- | |
466 | partially answer the question whether the implementation of the component is an instantiation |
- | |
467 | of the component specification. |
510 | the state space) is limited. |
468 | \end{itemize} |
- | |
469 | |
511 | |
470 | Horizontal and vertical compliance checking can be done exhaustively. This is a fundamental property |
512 | Horizontal and vertical compliance checking can be done exhaustively. This is a fundamental property |
471 | which allows the reasoning about the dependability of the entire component-based operating system. |
513 | which allows the reasoning about the dependability of the entire component-based OS. |
472 | Assuming that the lower-level verification methods (described in Sections \ref{clang} to \ref{modelcheck}) |
514 | Assuming that the lower-level verification methods (described in Sections \ref{clang} to \ref{modelcheck}) |
473 | prove some specific properties of the primitive components, we can be sure that the composition of |
515 | prove some specific properties of the primitive components, we can be sure that the composition of |
474 | the primitive components into composite components and ultimately into the whole operating system |
516 | the primitive components into composite components and ultimately into the whole OS |
475 | does not break these properties. |
517 | does not break these properties. |
476 | |
518 | |
477 | The feasibility of many lower-level verification methods from Sections \ref{clang} to \ref{modelcheck} |
519 | The feasibility of many lower-level verification methods from Sections \ref{clang} to \ref{modelcheck} |
478 | depends largely on the size and complexity of the code under verification. If the entire operating |
520 | depends largely on the size and complexity of the code under verification. If the entire OS |
479 | system is decomposed into primitive components with a reasonable granularity, it is more likely that the |
521 | is decomposed into primitive components with a fine granularity, it is more likely that the |
480 | individual primitive components can be verified against a large number of properties. Thanks to the |
522 | individual primitive components can be verified against a large number of properties. Thanks to the |
481 | recursive component composition we can then be sure that these properties also hold for the entire system. |
523 | recursive component composition we can then be sure that these properties also hold for the entire system. |
482 | In monolithic operating systems without a clear decomposition we cannot do this inference and we |
- | |
483 | need to verify the lower-level properties over a much larger code base (e.g. the whole monolithic kernel). |
- | |
484 | |
524 | |
485 | \medskip |
525 | \medskip |
486 | |
526 | |
487 | The compliance between the behavior specification and the actual behavior of the implementation is, unfortunately, |
527 | The compliance between the behavior specification and the actual behavior of the implementation is, unfortunately, |
488 | the missing link in the chain. This compliance cannot be easily verified in an exhaustive manner. If there is |
528 | the missing link in the chain. This compliance cannot be easily verified in an exhaustive manner. If there is |
Line 501... | Line 541... | ||
501 | specification. Although the comparison might not provide clean-cut results, it can still be |
541 | specification. Although the comparison might not provide clean-cut results, it can still be |
502 | helpful to confront the more-or-less informal user expectations on the system with the exact formal description. |
542 | helpful to confront the more-or-less informal user expectations on the system with the exact formal description. |
503 | |
543 | |
504 | \subsection{Summary} |
544 | \subsection{Summary} |
505 | \label{missing} |
545 | \label{missing} |
506 | So far, we have proposed a compact combination of formal methods which start at the level of C programming |
546 | So far, we have proposed a compact combination of engineering, semi-formal and formal methods which |
507 | language, offer the possibility to check for the presence of various common antipatterns, to check for generic |
547 | start at the level of C programming language, offer the possibility to check for the presence of various |
508 | algorithm-related properties, consistency constrains, interface enforcements and conclude with a framework |
548 | common antipatterns, check for generic algorithm-related properties, consistency constrains, interface |
509 | to make these properties hold even in the case of a large operating system composed from many |
549 | enforcements and conclude with a framework to make these properties hold even in the case of a large |
510 | components of compliant behavior. |
550 | OS composed from many components of compliant behavior. |
511 | |
551 | |
512 | We are also able to fill in some of the missing pieces by other software engineering approaches |
552 | We do not claim that there are no missing pieces in the big picture or that the semi-formal verifications |
- | 553 | might provide more guarantees in this setup. However, state-of-the-art OS design guidelines can push |
|
- | 554 | further the boundaries of practical feasibility of the presented methods. The limited guarantees |
|
- | 555 | of the low-level methods hold even in the composition and the high-level formal methods do not have |
|
513 | such as regression and unit testing and instrumentation. |
556 | to deal with unlimited degrees of freedom of the primitive component implementation. |
514 | |
557 | |
515 | \medskip |
558 | \medskip |
516 | |
559 | |
517 | We have spoken only about the functional properties. In general, we cannot apply the same formalisms |
560 | We have spoken only about the functional properties. In general, we cannot apply the same formalisms |
518 | and methods on extra-functional properties (e.g. timing properties, performance properties, etc.). |
561 | and methods on extra-functional properties (e.g. timing properties, performance properties, etc.). |
Line 525... | Line 568... | ||
525 | \label{evaluation} |
568 | \label{evaluation} |
526 | This section copies the structure of the previous Section \ref{analysis} and adds HelenOS-specific |
569 | This section copies the structure of the previous Section \ref{analysis} and adds HelenOS-specific |
527 | evaluation of the the proposed formalisms and tools. As this is still largely a work-in-progress, |
570 | evaluation of the the proposed formalisms and tools. As this is still largely a work-in-progress, |
528 | in many cases just the initial observations can be made. |
571 | in many cases just the initial observations can be made. |
529 | |
572 | |
- | 573 | The choice of the specific methods, tools and formalisms in this initial phase is mostly motivated |
|
- | 574 | by their perceived commonality and author's claims about fitness for the given purpose. An important |
|
- | 575 | part of further evaluation would certainly be to compare multiple particular approaches, tools |
|
- | 576 | and formalisms to find the optimal combination. |
|
- | 577 | |
|
530 | \subsection{Verifying C Language Compiler and Continuous Integration Tool} |
578 | \subsection{Verifying C Language Compiler and Continuous Integration Tool} |
531 | The primary C compiler used by HelenOS is \emph{GNU GCC 4.4.3} (all platforms)~\cite{gcc} and \emph{Clang 2.6.0} |
579 | The primary C compiler used by HelenOS is \emph{GNU GCC 4.4.3} (all platforms)~\cite{gcc} and \emph{Clang 2.6.0} |
532 | (IA-32)~\cite{clang}. We have taken some effort to support also \emph{ICC} and \emph{Sun Studio} C compilers, |
580 | (IA-32)~\cite{clang}. We have taken some effort to support also \emph{ICC} and \emph{Sun Studio} C compilers, |
533 | but the compatibility with these compilers in not guaranteed. |
581 | but the compatibility with these compilers in not guaranteed. |
534 | |
582 | |
Line 541... | Line 589... | ||
541 | as fatal errors, thus the code base must pass without any warnings. |
589 | as fatal errors, thus the code base must pass without any warnings. |
542 | |
590 | |
543 | We also turn on several more specific and strict checks. These checks helped to discover several |
591 | We also turn on several more specific and strict checks. These checks helped to discover several |
544 | latent bugs in the source code: |
592 | latent bugs in the source code: |
545 | |
593 | |
546 | \begin{itemize} |
594 | \begin{description} |
547 | \item \texttt{-Wfloat-equal} Check for exact equality comparison between floating point values. The |
595 | \item[\texttt{-Wfloat-equal}] Check for exact equality comparison between floating point values. The |
548 | usage of equal comparator on floats is usually misguided due to the inherent computational errors |
596 | usage of equal comparator on floats is usually misguided due to the inherent computational errors |
549 | of floats. |
597 | of floats. |
550 | \item \texttt{-Wcast-align} Check for code which casts pointers to a type with a stricter alignment |
598 | \item[\texttt{-Wcast-align}] Check for code which casts pointers to a type with a stricter alignment |
551 | requirement. On many RISC-based platforms this can cause run-time unaligned access exceptions. |
599 | requirement. On many RISC-based platforms this can cause run-time unaligned access exceptions. |
552 | \item \texttt{-Wconversion} Check for code where the implicit type conversion (e.g. from float to integer, |
600 | \item[\texttt{-Wconversion}] Check for code where the implicit type conversion (e.g. from float to integer, |
553 | between signed and unsigned integers or between integers with different number of bits) can |
601 | between signed and unsigned integers or between integers with different number of bits) can |
554 | cause the actual value to change. |
602 | cause the actual value to change. |
555 | \end{itemize} |
603 | \end{description} |
556 | |
604 | |
557 | To enhance the semantic information in the source code, we use GCC-specific language extensions to annotate |
605 | To enhance the semantic information in the source code, we use GCC-specific language extensions to annotate |
558 | some particular kernel and core library routines: |
606 | some particular kernel and core library routines: |
559 | |
607 | |
560 | \begin{itemize} |
608 | \begin{description} |
561 | \item \texttt{\_\_attribute\_\_((noreturn))} Functions marked in this way never finish from the point of view |
609 | \item[\texttt{\_\_attribute\_\_((noreturn))}] Functions marked in this way never finish from the point of view |
562 | of the current sequential execution flow. The most common case are the routines which restore previously saved |
610 | of the current sequential execution flow. The most common case are the routines which restore previously saved |
563 | execution context. |
611 | execution context. |
564 | \item \texttt{\_\_attribute\_\_((returns\_twice))} Functions marked in this way may return multiple times from |
612 | \item[\texttt{\_\_attribute\_\_((returns\_twice))}] Functions marked in this way may return multiple times from |
565 | the point of view of the current sequential execution flow. This is the case of routines which save the current |
613 | the point of view of the current sequential execution flow. This is the case of routines which save the current |
566 | execution context (first the function returns as usual, but the function can eventually ``return again'' |
614 | execution context (first the function returns as usual, but the function can eventually ``return again'' |
567 | when the context is being restored). |
615 | when the context is being restored). |
568 | \end{itemize} |
616 | \end{description} |
569 | |
617 | |
570 | The use of these extensions has pointed out to several hard-to-debug bugs on the IA-64 platform. |
618 | The use of these extensions has pointed out to several hard-to-debug bugs on the IA-64 platform. |
571 | |
619 | |
572 | \medskip |
620 | \medskip |
573 | |
621 | |
Line 586... | Line 634... | ||
586 | and prone to human omissions. An automated approach is definitively going to be very helpful. |
634 | and prone to human omissions. An automated approach is definitively going to be very helpful. |
587 | |
635 | |
588 | \subsection{Instrumentation} |
636 | \subsection{Instrumentation} |
589 | We are in the process of implementing our own code instrumentation framework which is motivated mainly |
637 | We are in the process of implementing our own code instrumentation framework which is motivated mainly |
590 | by the need to support MMU-less architectures in the future. But this framework might be also very helpful |
638 | by the need to support MMU-less architectures in the future. But this framework might be also very helpful |
591 | in detecting memory and generic resource leaks. We have not tried \emph{Valgrind}~\cite{valgrind} or any similar existing tool |
639 | in detecting memory and generic resource leaks. We have not tried \emph{Valgrind}~\cite{valgrind} or any similar |
592 | because of the estimated complexity to adopt it for the usage in HelenOS. |
640 | existing tool because of the estimated complexity to adopt it for the usage in HelenOS. |
593 | |
641 | |
594 | HelenOS was also scanned by \emph{Coverity}~\cite{coverity} in 2006 when no errors were detected. However, since that |
642 | HelenOS was also scanned by \emph{Coverity}~\cite{coverity} in 2006 when no errors were detected. However, since |
595 | time the code base has not been analyzed by Coverity. |
643 | that time the code base has not been analyzed by Coverity. |
596 | |
644 | |
597 | \subsection{Static Analyzer} |
645 | \subsection{Static Analyzer} |
598 | The integration of various static analyzers into the HelenOS continuous integration process is underway. |
646 | The integration of various static analyzers into the HelenOS continuous integration process is underway. |
599 | For the initial evaluation we have used \emph{Stanse}~\cite{stanse} and \emph{Clang Analyzer}~\cite{clanganalyzer}. |
647 | For the initial evaluation we have used \emph{Stanse}~\cite{stanse} and \emph{Clang Analyzer}~\cite{clanganalyzer}. |
600 | Both of them showed to be moderately helpful to point out instances of unreachable dead code, use of language |
648 | Both of them showed to be moderately helpful to point out instances of unreachable dead code, use of language |
Line 620... | Line 668... | ||
620 | primitive) and futexes (basic user space thread synchronization primitive) using \emph{Promela} and |
668 | primitive) and futexes (basic user space thread synchronization primitive) using \emph{Promela} and |
621 | verify several formal properties (deadlock freedom, fairness) in \emph{Spin}~\cite{spin}. As both the Promela language |
669 | verify several formal properties (deadlock freedom, fairness) in \emph{Spin}~\cite{spin}. As both the Promela language |
622 | and the Spin model checker are mature and commonly used tools for such purposes, we expect no major problems |
670 | and the Spin model checker are mature and commonly used tools for such purposes, we expect no major problems |
623 | with this approach. Because both synchronization primitives are relatively complex, utilizing a model checker |
671 | with this approach. Because both synchronization primitives are relatively complex, utilizing a model checker |
624 | should provide a much more trustworthy proof of the required properties than ``paper and pencil''. |
672 | should provide a much more trustworthy proof of the required properties than ``paper and pencil''. |
- | 673 | |
|
- | 674 | The initial choice of Spin is motivated by its suitability to model threads, their interaction and verify |
|
- | 675 | properties related to race conditions and deadlocks (which is the common sources of OS-related bugs). Other |
|
- | 676 | modeling formalisms might be more suitable for different goals. |
|
625 | |
677 | |
626 | \subsection{Behavior Checker} |
678 | \subsection{Architecture and Behavior Checker} |
627 | We have created an architecture description in ADL language similar to SOFA ADL~\cite{adl} for the majority of the |
679 | We have created an architecture description in ADL language derived from \emph{SOFA ADL}~\cite{adl} for the |
628 | HelenOS components and created the Behavior Protocol specification of these components. The architecture |
680 | majority of the HelenOS components and created the Behavior Protocol specification of these components. |
- | 681 | Both descriptions were created independently, not by reverse-engineering the existing source code. |
|
629 | is a snapshot of the dynamic architecture just after a successful bootstrap. |
682 | The architecture is a snapshot of the dynamic architecture just after a successful bootstrap of HelenOS. |
630 | |
683 | |
631 | Both the architecture and behavior description is readily available as part of the source code repository |
684 | Both the architecture and behavior description is readily available as part of the source code repository |
632 | of HelenOS, including tools which can preprocess the Behavior Protocols according to the architecture description |
685 | of HelenOS, including tools which can preprocess the Behavior Protocols according to the architecture description |
633 | and create an output suitable for \emph{bp2promela} checker~\cite{bp}. |
686 | and create an output suitable for \emph{bp2promela} checker~\cite{bp}. |
634 | |
687 | |
635 | As the resulting complexity of the description is larger than any of the previously published case studies |
688 | As the resulting complexity of the description is larger than any of the previously published case studies |
636 | on Behavior Protocols (compare to~\cite{cocome}), our current work-in-progress is to optimize and fine-tune the bp2promela |
689 | on Behavior Protocols (compare to~\cite{cocome}), our current work-in-progress is to optimize and fine-tune |
637 | checker to process the input. |
690 | the bp2promela checker to process the input. |
638 | |
691 | |
639 | \medskip |
692 | \medskip |
640 | |
693 | |
641 | We have not started to generate code from the architecture description so far because of time constrains. However, |
694 | We have not started to generate code from the architecture description so far because of time constrains. |
642 | we believe that this is a very promising way to go. |
695 | However, we believe that this is a very promising way to go and provide reasonable guarantees about |
- | 696 | the compliance between the specification and the implementation. |
|
643 | |
697 | |
644 | \subsection{Behavior Description Generator} |
698 | \subsection{Behavior Description Generator} |
645 | We have not tackled the issue of behavior description generation yet, although tools such as \emph{Procasor}~\cite{procasor} are readily |
699 | We have not tackled the issue of behavior description generation yet, although tools such as |
646 | available. We do not consider it our priority at this time. |
700 | \emph{Procasor}~\cite{procasor} are readily available. We do not consider it our priority at this time. |
647 | |
701 | |
648 | \section{Conclusion} |
702 | \section{Conclusion} |
649 | \label{conclusion} |
703 | \label{conclusion} |
650 | In this paper we propose a complex combination of various formal verification methods and tools |
704 | In this paper we propose a complex combination of various verification methods and tools |
651 | to achieve the verification of an entire general-purpose operating system. The proposed approach generally follows |
705 | to achieve the verification of an entire general-purpose operating system. The proposed |
652 | a bottom-to-top path, starting with low-level checks using state-of-the-art verifying C language compilers, |
706 | approach generally follows a bottom-up route, starting with low-level checks using state-of-the-art |
653 | following by static analyzers and static verifiers. In specific contexts regression and unit tests, |
707 | verifying C language compilers, following by static analyzers and static verifiers. |
654 | code instrumentation and model checkers for the sake of verification of key algorithms |
708 | In specific contexts regression and unit tests, code instrumentation and model checkers |
655 | are utilized. |
709 | for the sake of verification of key algorithms are utilized. |
656 | |
710 | |
657 | Thanks to the properties of state-of-the-art microkernel multiserver operating |
711 | Thanks to the properties of state-of-the-art microkernel multiserver operating |
658 | system design (e.g. software component encapsulation and composition), we demonstrate |
712 | system design (e.g. software component encapsulation and composition, fine-grained isolated |
659 | that it should be feasible to successfully verify larger and more complex operating systems than |
713 | components), we demonstrate that it should be feasible to successfully verify larger and more |
660 | in the case of monolithic designs. We use formal component architecture and behavior |
714 | complex operating systems than in the case of monolithic designs. We use formal component |
661 | description for the closure. The final goal -- a formally verified operating system -- is the |
715 | architecture and behavior description for the closure. The final goal -- a formally verified |
662 | emerging property of the combination of the various methods. |
716 | operating system -- is the emerging property of the combination of the various methods. |
663 | |
717 | |
664 | \medskip |
718 | \medskip |
665 | |
719 | |
666 | The contribution of this paper is the shift of focus from attempts to use a single ``silver-bullet'' |
720 | The contribution of this paper is the shift of focus from attempts to use a single |
667 | method for formal verification of an operating system to a combination of multiple methods supported |
721 | ``silver-bullet'' method for formal verification of an operating system to a combination |
668 | by a suitable architecture of the operating system. |
722 | of multiple methods supported by a suitable architecture of the operating system. |
- | 723 | The main benefit is a much larger coverage of the set of all hypothetical properties. |
|
669 | |
724 | |
670 | We also argue that the approach should be suitable for the mainstream |
725 | We also argue that the approach should be suitable for the mainstream |
671 | general-purpose operating systems in the near future, because they are gradually |
726 | general-purpose operating systems in the near future, because they are gradually |
672 | incorporating more microkernel-based features and fine-grained software components. |
727 | incorporating more microkernel-based features and fine-grained software components. |
673 | |
728 | |
674 | Although the evaluation of the proposed approach on HelenOS is still work-in-progress, the |
729 | Although the evaluation of the proposed approach on HelenOS is still work-in-progress, the |
675 | preliminary results and estimates are promising. |
730 | preliminary results and estimates are promising. |
676 | |
731 | |
677 | \medskip |
732 | \medskip |
678 | |
733 | |
679 | \noindent\textbf{Acknowledgments.} The author would like to express his gratitude to all contributors of the HelenOS |
734 | \noindent\textbf{Acknowledgments.} The author would like to express his gratitude to all contributors of |
680 | project. Without their vision and dedication the work on this paper would be almost impossible |
735 | the HelenOS project. Without their vision and dedication the work on this paper would be almost impossible |
681 | |
736 | |
682 | This work was partially supported by the Ministry of Education of the Czech Republic |
737 | This work was partially supported by the Ministry of Education of the Czech Republic |
683 | (grant MSM\-0021620838). |
738 | (grant MSM\-0021620838). |
684 | |
739 | |
685 | \begin{thebibliography}{99} |
740 | \begin{thebibliography}{99} |