During the development of the HelenOS operating system, we came across
several types of software tools, programs, utilities and libraries.
Some of the tools were used to develop the system itself while other tools
were used to facilitate the development process. In some cases, we had a chance
to try out several versions of the same product. Sometimes the new versions
contained fixes for bugs we had discovered in previous versions thereof.
Another group of software has been integrated into HelenOS
to supply functionality that the original HelenOS code did
not provide itself.

There is simply too much third-party software related to
HelenOS to cover all of it. This chapter presents our experience
with the key software tools, programs and libraries.

Although the developers know each other in person, the development, with the
exception of kernel camps, has been largely independent of place
and time. In order to work effectively, we have established several communication
channels:
\begin{description}
\item [E-mail] --- We used this basic means of electronic communication for peer-to-peer
discussion in cases when the other person could not be reached on-line at
the time his advice or attention was needed. E-mail was also
used for contacting developers of third-party software that we needed to talk to.
\item [Mailing list] --- Like almost every open source project before us, we set up a
mailing list for technical discussion. The advantage of a mailing list is
that it enables multilateral discussion of several topics simultaneously,
without the need for all participants to be on-line or even in one place. We have kept
our first development mailing list closed to the public, so it seemed natural to us
to use Czech as the language of the list: Czech is, with one exception,
our native language and all of us speak it very well. Besides the advantages,
there are also disadvantages. First, communication over a mailing list tends to be rather
slow, compared for instance to ICQ. Second, because of its implicitly collective nature,
it is sometimes so slow that an answer to a given question never comes.
Apart from the internal development mailing list, we have also used another mailing list
for commit log messages, which proved handy in keeping developers informed about all changes in
the repository.
Finally, we have also established a public mailing list for communication
about general HelenOS topics in English.
\item [ICQ] --- Because we divided the whole project into smaller subprojects on which
at most two people out of six would work together, the need for communication
among all six people was significantly smaller than the need to communicate between the two
developers who cooperated closely on a specific task. For this reason, we made the most
use of ICQ.
\end{description}
\section{Concurrent versions systems}
At the very beginning, when the SPARTAN kernel was being developed solely
by \JJ, there was not much sense in using any software for management of
concurrent versions. However, when the number of developers increased to six,
we immediately started to think of available solutions.
We began with CVS because it is probably the best known concurrent
versions system. We even kept a HelenOS repository in CVS for a short time,
but when we learned about its weaknesses, we sought another solution. Two
weaknesses prevented us from staying with CVS:
\begin{itemize}
\item it is merely a file concurrent versions system (i.e. CVS is
good at managing versions of each separate file in the repository
but has no notion of the project's directory tree as a whole;
specifically, renaming a file while preserving its revision history
is next to impossible),
\item it lacks atomic commits (i.e. should your commit conflict with
another recent commit of another developer, CVS would not abort the whole operation
but render the repository inconsistent instead).
\end{itemize}
Being aware of these limitations, we decided to go with Subversion. Subversion
is, simply put, a redesigned CVS with these limitations fixed. We were
already familiar with CVS, so the switch to Subversion was nearly seamless.

As for Subversion itself, it has worked well for us and has met all our
expectations. Despite all its advantages, a serious problem
occurred sometime in the middle of the development process. Because of some locking
issues related to the default database backend (i.e. {\tt Berkeley DB}),
our Subversion repository put itself into a peculiar state in which it became
effectively inaccessible by any means of standard usage or administration.
To resolve this problem, we had to manually delete orphaned file locks
and switch to the {\tt fsfs} backend, which does not suffer from this
problem.
Other than that, we are happy users of Subversion. The ability to switch
the entire working copy to a particular revision is a great feature
for debugging. Once we tracked a bug three months into the past by
moving through revisions until we found the change that caused the bug.
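
One systematic way to carry out such a search is bisection over revision numbers. The following sketch is only an illustration and is not taken from our tooling: the names {\tt first\_bad\_revision} and {\tt is\_bad} are hypothetical, and the callback is assumed to switch the working copy to the given revision (for instance with {\tt svn update -r}), rebuild the system and report whether the bug shows. It also assumes that every revision after the offending change exhibits the bug.

```python
from typing import Callable


def first_bad_revision(good: int, bad: int, is_bad: Callable[[int], bool]) -> int:
    """Return the lowest revision in (good, bad] for which is_bad() is true.

    Assumptions (not stated in the text): revision `good` is known to work,
    revision `bad` is known to fail, and all revisions after the culprit fail.
    """
    if good >= bad:
        raise ValueError("the good revision must precede the bad one")
    while bad - good > 1:
        mid = (good + bad) // 2
        # Hypothetical callback: check out `mid`, rebuild, run the test.
        if is_bad(mid):
            bad = mid      # the culprit is `mid` or an earlier revision
        else:
            good = mid     # the culprit was committed after `mid`
    return bad
```

With this approach the number of rebuilds grows only logarithmically with the number of revisions in the searched range.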
On our project website\cite{helenos}, we provided links to several
web utilities that gave access to our Subversion repository
or mailing list, or provided other services:
\begin{description}
\item [Chora] is a part of the Horde framework and can be used to comfortably
browse a Subversion repository from the web. We altered it slightly to also
show the number of commits per developer on our homepage.
\item [Whups] is another component of the Horde framework. It provides
feature request and bug tracking. However, being a rather
closed group of people, we used this tool only seldom. On the other hand,
any potential beta tester of our operating system has had a chance to
submit bug reports.
\item [Mailman] is a web interface to the mailing lists we used. It allows users
to manage subscriptions and search the mailing list archives on-line.
\end{description}
\section{Third party components of HelenOS}
HelenOS itself contains third-party software. First, the amd64 and ia32 architectures
make use of the GNU Grub boot loader. It replaced the original, limited boot loader
after Kernel Camp 2005, when {\MD} made HelenOS compliant with the Multiboot specification. Thanks to
Grub, HelenOS can be booted from several types of devices. More importantly, we use
Grub to load HelenOS userspace modules as well.
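
As a rough illustration of what Multiboot compliance requires of a kernel image, the sketch below builds and locates a Multiboot (version 1) header: three 32-bit words (magic, flags, checksum) that sum to zero modulo $2^{32}$ and lie, 4-byte aligned, within the first 8192 bytes of the image. The function names are hypothetical and the code is not taken from HelenOS; little-endian byte order is assumed because the specification targets ia32.

```python
import struct
from typing import Optional

MULTIBOOT_HEADER_MAGIC = 0x1BADB002  # magic value from the Multiboot specification
MULTIBOOT_SEARCH = 8192              # the header must lie within this many bytes
MASK32 = 0xFFFFFFFF


def build_header(flags: int) -> bytes:
    """Pack the three mandatory header words (hypothetical helper)."""
    checksum = (-(MULTIBOOT_HEADER_MAGIC + flags)) & MASK32
    return struct.pack("<III", MULTIBOOT_HEADER_MAGIC, flags, checksum)


def find_header(image: bytes) -> Optional[int]:
    """Return the offset of a valid header, or None if there is none."""
    limit = min(len(image), MULTIBOOT_SEARCH)
    # The 12 mandatory bytes must fit entirely below `limit`.
    for offset in range(0, limit - 11, 4):
        magic, flags, checksum = struct.unpack_from("<III", image, offset)
        if magic == MULTIBOOT_HEADER_MAGIC and (magic + flags + checksum) & MASK32 == 0:
            return offset
    return None
```

A boot loader performs a search of this kind before it decides how to load the image; the optional address fields that may follow the three mandatory words are left out of the sketch.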
Another third-party piece of the HelenOS operating system is the userspace {\tt malloc()}.
Rather than porting our kernel slab allocator to userspace, we have chosen Doug Lea's public
domain {\tt dlmalloc} instead. This allocator could be easily integrated into our uspace tree
and has proven itself in other projects as well. Its derivative, {\tt ptmalloc}, has been part of the
GNU C library for some time. However, the version we are using is not optimized for SMP and multithreading.
We plan to eventually replace it with another allocator.
Next, the {\tt pci} userspace task uses the {\tt libpci} library. The
library was simplified and ported to HelenOS. Even though filesystem
calls were removed from the library, it still heavily depends on {\tt libc}.
By porting {\tt libpci} to HelenOS, we demonstrated that applications and libraries
are, given enough effort, portable to HelenOS.
Finally, we demonstrated the idea presented in the previous paragraph by porting
the more than 13 years old BSD game {\tt tetris} to HelenOS. This particular version
of tetris looks almost the same on HelenOS as it does on other operating systems.
Similar to {\tt libpci}, {\tt tetris} had to be modified in order to compile and run.
Both the filesystem calls and the terminal I/O calls were removed or replaced.

The assembler, linker and compiler are by all means the focal point
of every operating system project. The quality of these tools influences
operating system performance and, more importantly, stability. HelenOS has
been tailored to build with GNU {\tt binutils}\cite{binutils} (i.e. the assembler and linker) and GNU~{\tt gcc}\cite{gcc}
(i.e. the compiler). There is little chance that it could be compiled and
linked using other tools unless those tools are compatible with the GNU build tools.
As our project declares support for five different processor architectures,
we needed to have five different flavors of the build utilities installed.
Interestingly, the flavors of {\tt binutils} and {\tt gcc} for particular architectures
are not equal from the point of view of cross-binutils and cross-compiler installation.
All platforms except ia64 require only the {\tt binutils} package and the {\tt gcc} package
for the cross-tools to be built. On the other hand, ia64 also requires some excerpts from
the ia64-specific part of {\tt glibc}.
Formerly, the project could be compiled with almost any version of {\tt binutils} starting with 2.15
and {\tt gcc} starting with 2.95. However, especially since we added partial thread-local storage
support to our userspace layer, some architectures (e.g. mips32) do not compile even with {\tt gcc} 4.0.1
and require {\tt gcc} 4.1.0 or newer.
As for the mips32 cross-compiler, {\OP} discovered a bug in {\tt gcc} (ticket \#23824) which caused {\tt gcc} to
incorrectly generate unaligned data access instructions (i.e. {\tt lwl}, {\tt lwr}, {\tt swl} and {\tt swr}).
As for the mips32 cross-binutils\footnote{It remains uninvestigated whether this problem also shows with other cross-tools.},
we observed that undefined symbols are not reported when we do not link using the standard target. We are still not
sure whether this was a bug; the {\tt binutils} developers simply told us to link using the standard target and then use
{\tt objcopy} to convert the ELF binary into the requested output format.

After the build tools, simulators, emulators and virtualizers were the second focal point
of our project. These invaluable programs really sped up the code-compile-test cycle.
In some cases, they were, and still are, the only option to actually run HelenOS on certain
processor architectures, because real hardware was not available to us. Developing our system
in a virtual environment gave us a deterministic environment in which troubleshooting is much easier.
Moreover, some of the simulators featured integrated debugging facilities.
Without them, a lot of bugs would have remained unresolved or even gone unnoticed.
Using several virtual environments for testing one architecture is well justified by the
fact that sometimes HelenOS would run on two of them and crash on a third, or vice versa. Sometimes
we found that it ran on real hardware but failed in a simulator. The opposite case was,
however, more common. Simply put, the more configurations, whether real or virtual,
the better.
From one point of view, we have tested our system on eight different virtual environments:
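\begin{itemize}
\item Bochs,
\item GXemul,
\item msim,
\item PearPC,
\item QEMU,
\item Simics,
\item Ski,
\item VMware.
\end{itemize}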
From another point of view, we have tested these programs with our operating system.
Because of the scope and uniqueness of this testing, and because we did find some issues,
we want to dedicate some more space to what we have found.
\subsection{Bochs}
Bochs\cite{bochs} has been used to develop the SPARTAN kernel since its beginning in 2001.
It is capable of emulating an ia32 machine and, for some time now, also amd64.
Bochs is an emulator and thus the slowest of the virtual environments capable
of simulating the same category of hardware. On the other hand, it is extremely
portable compared to the much faster virtualizers and emulators that use dynamic translation
of instructions. Lately, plans to develop or port dynamic translation
to Bochs have been brewing in its developer community.

The biggest virtue of Bochs is that it has traditionally supported SMP. For some time, Bochs
was our only environment in which we could develop and test SMP code. Unfortunately,
the quality of SMP support in Bochs varied from version to version. Because of SMP
breakage in Bochs, we had to avoid some of its versions. So far, Bochs versions 2.2.1 and 2.2.6
have been the best in this regard.

Our project has not merely used Bochs. We also helped to identify some SMP-related problems,
and {\OP} from our team discovered and fixed a bug in FXSAVE and FXRSTOR emulation
(patch \#1282033).

Bochs has some debugging facilities, but those have been very impractical and broken
in SMP mode. It is also possible to use the GNU debugger {\tt gdb} to connect to a running
simulation, but this has not proven very useful either, as we often needed to debug
problems that existed only in multiprocessor configurations, which {\tt gdb}
does not understand.
\subsection{GXemul}
GXemul\cite{gxemul} is an emulator of several processor architectures. Nevertheless, we have
used it only for mips32 emulation in both little-endian and big-endian modes.
It seems to be feature-rich and evolving, but we do not use all of its functionality.
GXemul is very user friendly and has debugging features. It is more realistic
than msim. However, our newly introduced TLS support triggered a bug in the emulation of the {\tt rdhwr}
instruction, while msim functioned as expected. Fortunately, the author
of GXemul is very cooperative and has fixed the problem for future versions as well as
provided a quick hack for the old version.
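
For readers unfamiliar with {\tt rdhwr}, the sketch below shows how an emulator or a trap handler might recognise the instruction word. It follows the MIPS32 Release 2 encoding (major opcode SPECIAL3, function code {\tt 0x3b}); the function name is hypothetical, the code comes from neither GXemul nor HelenOS, and bits 10 to 6 of the instruction word are deliberately not checked. Hardware register 29 is the one commonly used to hold the thread pointer, but the text above does not say which register our code reads, so treat that constant as an assumption.

```python
from typing import Optional, Tuple

OPCODE_SPECIAL3 = 0x1F   # major opcode shared by rdhwr and related instructions
FUNCT_RDHWR = 0x3B       # function field that selects rdhwr
HWR_USERLOCAL = 29       # assumption: register commonly used for the TLS pointer


def decode_rdhwr(insn: int) -> Optional[Tuple[int, int]]:
    """Return (rt, rd) if `insn` encodes rdhwr, otherwise None.

    rt is the destination general-purpose register, rd the hardware
    register number being read.
    """
    opcode = (insn >> 26) & 0x3F
    rs = (insn >> 21) & 0x1F
    rt = (insn >> 16) & 0x1F
    rd = (insn >> 11) & 0x1F
    funct = insn & 0x3F
    if opcode != OPCODE_SPECIAL3 or funct != FUNCT_RDHWR or rs != 0:
        return None
    return rt, rd
```

For example, the word {\tt 0x7c03e83b} decodes to general-purpose register 3 and hardware register 29.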
\subsection{msim}
msim\cite{msim} has been our first mips32 simulator. It simulates the 32-bit side of the R4000 processor.
Its simulated environment is not very realistic, but the processor simulation
is good enough for operating system development. In this regard, the simulator is
comparable to HP's ia64 simulator Ski. Another aspect the two share is
a relatively strong debugger.

msim has been developed at the same alma mater as our own project.
All members of our team know this program from operating system courses.
Curiously, this simulator contained the largest number of defects and inaccuracies
that we have ever discovered in a simulator. Fortunately, all of them have
eventually been fixed.
\subsection{PearPC}
PearPC\cite{pearpc} is the only emulator on which we have run the ppc32 port of HelenOS. It has
no debugging features, but fortunately its sources are available under
an open source license. This enabled {\OP} and {\MD} to modify the sources
so that the modified version allowed some basic debugging.
\subsection{QEMU}
QEMU\cite{qemu} emulates several processor architectures. We have used it to emulate
ia32 and amd64. Like Bochs, it can simulate SMP, but unlike Bochs it uses dynamic
translation of emulated instructions and therefore performs much better.

This emulator seemed to emulate the {\tt hlt} instruction realistically,
which was nice for those of us who use notebooks as their development
machines.

Similar to Bochs, QEMU simulation can be aided by {\tt gdb}. Debugging
with {\tt gdb} can be fairly comfortable\footnote{Especially when the kernel is
compiled with {\tt -g3}.} until one needs to debug an SMP kernel running on multiple
processors.
\subsection{Simics}
Virtutech's Simics\cite{simics} simulator can be compared to a Swiss Army knife for operating system debugging.
This proprietary piece of software was available to us free of charge under an academic license.
Simics can be set up to simulate many different configurations of many different machines.
It has the most advanced debugging features we have ever seen. To highlight one, its
memory access tracing ability has been really helpful to us. During device driver
development, we appreciated the possibility to set device logging to a specified
verbosity.

We used it to test and develop the amd64 and ia32 architectures in SMP mode and the mips32 architecture in UP mode.
On the MIPS architecture, Simics emulates the 4Kc processor.
Unfortunately, this processor does not have the Reserved Instruction exception, which
makes it unusable in an environment with programs using thread-local storage.

Regardless of its invaluable qualities, Simics still contained bugs. One of the most
serious was the bug with ticket \#3351: {\OP} discovered that its BIOS overwrites kernel memory
during application processor start. Other bugs we found were related to amd64 and mips32.
As for amd64, Simics did not report a general protection fault when {\tt EFER.NXE} was 0 and a non-executable
page was found (\#4214). As for mips32, Simics misemulated the {\tt MSUB} and {\tt MSUBU} instructions.
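
To make clear what a correct emulation of these two instructions has to compute, the following reference model follows the MIPS32 definition: the 64-bit accumulator formed by the {\tt HI} and {\tt LO} registers is decreased by the product of two 32-bit operands, taken as signed for {\tt MSUB} and as unsigned for {\tt MSUBU}. The function names are hypothetical and the model says nothing about the actual nature of the Simics defect, which the text above does not describe.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of `value` as a two's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def _subtract(hi: int, lo: int, product: int) -> Tuple[int, int]:
    """Subtract `product` from the HI:LO accumulator, wrapping at 64 bits."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc - product) & MASK64
    return acc >> 32, acc & MASK32


def msub(hi: int, lo: int, rs: int, rt: int) -> Tuple[int, int]:
    """MSUB: (HI, LO) <- (HI, LO) - signed(rs) * signed(rt)."""
    return _subtract(hi, lo, to_signed32(rs) * to_signed32(rt))


def msubu(hi: int, lo: int, rs: int, rt: int) -> Tuple[int, int]:
    """MSUBU: (HI, LO) <- (HI, LO) - unsigned(rs) * unsigned(rt)."""
    return _subtract(hi, lo, (rs & MASK32) * (rt & MASK32))
```

The two variants differ only in how the operands are extended before the multiplication, which is why an operand such as {\tt 0xffffffff} yields different results for each of them.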
\subsection{Ski}
The ia64 port of HelenOS has been developed and debugged on HP's IA-64 Ski\cite{ski} simulator.
Ski is just an Itanium processor simulator and as such does not simulate a real machine. In fact,
no firmware and no configuration tables (e.g. a memory map) are present in Ski. On the other hand, the missing parts can be supplied externally\footnote{This
is actually how Linux runs in this simulator.}. The simulator provides means of interaction with
host system devices via Simulator SystemCalls (SSC). The simulator itself has a graphical interface
with fairly powerful debugging facilities, although not as good as those of Simics.

Ski is a proprietary program with no source code available. Its binaries are available
free of charge under a non-free license. It comes with insufficient documentation,
which makes development rather problematic. For instance, there is no public documentation
of all the SSCs. All one can do is look into the Linux/ia64-Ski port, which was written by the
same people as Ski, and use it as a reference. We had to look into Linux once more when our kernel
started to fail in some memory-intensive stress tests. The problem turned out to be that the tests
hit the IA-32 legacy video RAM area. In the absence of any memory map, we fixed the problem by blacklisting
this piece of memory in our frame allocator.
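
The idea of blacklisting can be modelled in a few lines. The sketch below is a toy model rather than our kernel code: the class and method names are hypothetical, and the address range used in the usage note is the conventional legacy VGA window, which we take here as an assumption because the text above gives no addresses.

```python
from typing import List, Optional


class FrameAllocator:
    """Toy first-fit allocator of fixed-size physical frames."""

    def __init__(self, total_frames: int, frame_size: int) -> None:
        self.frame_size = frame_size
        self.free: List[bool] = [True] * total_frames

    def mark_unavailable(self, addr: int, size: int) -> None:
        """Blacklist every frame that overlaps [addr, addr + size)."""
        if size <= 0:
            return
        first = addr // self.frame_size
        last = min((addr + size - 1) // self.frame_size, len(self.free) - 1)
        for pfn in range(first, last + 1):
            self.free[pfn] = False

    def alloc(self) -> Optional[int]:
        """Return the physical address of a free frame, or None if exhausted."""
        for pfn, is_free in enumerate(self.free):
            if is_free:
                self.free[pfn] = False
                return pfn * self.frame_size
        return None
```

Calling {\tt mark\_unavailable(0xA0000, 0x20000)} once during initialisation, before any allocation takes place, guarantees that no later {\tt alloc()} call hands out a frame from that window.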

HelenOS is booted on Ski by simply loading its ELF image
and jumping to it. Each ELF program header contains two fields describing where and how to load the program image into memory:
VMA and LMA. VMA\footnote{Virtual Memory Address} is the address where the program's segment gets mapped in virtual memory.
LMA\footnote{Load Memory Address} is the physical address where the segment is loaded in memory. {\JV} discovered
that Ski confuses VMA and LMA. This, which we believe to be a bug in Ski, does not show in Linux since Linux always has
LMA equal to VMA. People from the Ski mailing list tried to help us, but our repeated problem report did not
make it far enough for HP to fix or at least clarify the issue. Finally, we adopted a workaround implemented by {\JJ}
that simply swaps LMA and the program entry point in the kernel ELF image.
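
To show where these values live in the file, the following sketch lists the entry point and the (VMA, LMA) pair of every loadable segment of an ELF image. In ELF terms, VMA and LMA are the {\tt p\_vaddr} and {\tt p\_paddr} fields of a program header. The function name is hypothetical, only little-endian 64-bit images are handled, and the sketch merely inspects the image; it does not reproduce the workaround itself.

```python
import struct
from typing import List, Tuple

PT_LOAD = 1                                  # program header type: loadable segment
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")    # Elf64_Ehdr layout, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")            # Elf64_Phdr layout, 56 bytes


def load_addresses(image: bytes) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (entry point, [(VMA, LMA), ...]) for all PT_LOAD segments."""
    if len(image) < EHDR.size or image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if image[4] != 2 or image[5] != 1:
        raise ValueError("only little-endian ELF64 is handled in this sketch")
    fields = EHDR.unpack_from(image, 0)
    e_entry, e_phoff = fields[4], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]
    segments = []
    for index in range(e_phnum):
        p_type, _flags, _offset, p_vaddr, p_paddr, *_rest = PHDR.unpack_from(
            image, e_phoff + index * e_phentsize
        )
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_paddr))
    return e_entry, segments
```

An image in which the two addresses of a segment differ is exactly the case in which a loader that mixes them up places the segment at the wrong location.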
\subsection{VMware}
VMware\cite{vmware} is the only virtualizer we have used in
HelenOS development. It virtualizes the ia32 host machine. Since VMware
version 5.5, we have made use of its ability to run the guest system
(i.e. HelenOS) on multiple processors. VMware has no support for
debugging but is very useful for compatibility and regression testing
because it is the closest to real hardware. VMware, being a virtualizer,
is also the fastest of all the virtual environments we have utilized.