\chapter{Software}
\label{tools}

During the development of the HelenOS operating system, we came across
several types of software tools, programs, utilities and libraries.
Some of the tools were used to develop the system itself while others
were used to facilitate the development process. In some cases, we had a chance
to try out several versions of the same product. Sometimes the new versions
contained fixes for bugs we had discovered in earlier versions.

Another group of software we have used has been integrated into HelenOS
to fill gaps in functionality that the original HelenOS code did
not provide itself.

There is simply too much third-party software somehow related to
HelenOS to cover it all. This chapter presents our experience
with the key software tools, programs and libraries.

\section{Communication tools}
Although the developers know each other in person, the development, with the
exception of kernel camps, has been largely independent of place
and time. In order to work effectively, we established several communication
channels:

\begin{description}
\item [E-mail] We used this basic means of electronic communication for one-to-one
discussion when the other person could not be reached on-line at
the time their advice or attention was needed. E-mail was also
used for contacting developers of third-party software that we needed to talk to.

\item [Mailing list] Like almost every open source project before us, we set up
a mailing list for technical discussion. The advantage of a mailing list is
that it enables multilateral discussions on several topics simultaneously,
without the need for all the participants to be on-line or even in one place. We have kept
our first development mailing list closed to the public, so it seemed natural
to use Czech as the language of the list: Czech is, with one exception,
our native language and all of us speak it very well. Besides the advantages,
there are also disadvantages. First, communication over a mailing list tends to be rather
slow compared to, for instance, ICQ. Second, because of its implicitly collective nature,
a question posted to the list sometimes never receives an answer.

Apart from the internal development mailing list, we have also used another mailing list
for commit log messages, which proved handy in keeping developers informed about all changes in
the repository.

Finally, we have also established a public mailing list for communication
about general HelenOS topics in English.

\item [ICQ] Because we divided the whole project into smaller subprojects on which
at most two people out of six would work together, the need for communication
among all six people was significantly smaller than the need to communicate between the two
developers who cooperated closely on a specific task. For this reason, ICQ was the channel
we used the most.
\end{description}

\section{Concurrent versions systems}
At the very beginning, when the SPARTAN kernel was being developed solely
by \JJ, there was not much sense in using any software for management of
concurrent versions. However, when the number of developers increased to six,
we immediately started to look at the available solutions.

We began with CVS because it is probably the best known concurrent
versions system. We even kept the HelenOS repository in CVS for a short time,
but when we learned about its weaknesses we sought another solution. Two
weaknesses in particular led us to abandon CVS:

\begin{itemize}
\item it is merely a file concurrent versions system (i.e. CVS is
good at managing versions of each separate file in the repository
but has no notion of the project's directory tree as a whole;
specifically, renaming a file while preserving its revision history
is next to impossible),

\item it lacks atomic commits (i.e. should your commit conflict with
a recent commit by another developer, CVS would not abort the whole operation
but leave the repository in an inconsistent state instead).
\end{itemize}

Being aware of these limitations, we decided to go with Subversion. Subversion
is, simply put, a redesigned CVS with these limitations fixed. We were
already familiar with CVS so the switch to Subversion was pretty seamless.

As for Subversion itself, it has worked well for us and has met all our
expectations. Despite all its pros, there was one serious problem that
occurred sometime in the middle of the development process. Because of some locking
issues related to the default database backend (i.e. {\tt Berkeley DB}),
our Subversion repository got into a peculiar state in which it became
effectively inaccessible by any means of standard usage or administration.
To recover from this problem, we had to manually delete orphaned file locks
and switch to the backend called {\tt fsfs}, which does not suffer from this
problem.

Other than that, we are happy users of Subversion. The ability to switch
the entire working copy to a particular revision is a great feature
for debugging. Once we tracked a bug three months into the past by
moving through revisions until we found the change that caused the bug.

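Moving through revisions to find the change that introduced a bug can be organized as a bisection over revision numbers. The following is a minimal sketch and not necessarily the procedure we followed: the function name {\tt first\_bad\_revision} and the {\tt is\_bad} callback are hypothetical, and it assumes that the bug, once introduced, is present in every later revision. In practice, the callback would switch the working copy with {\tt svn update -r}, rebuild the system and run the failing test.

```python
def first_bad_revision(good, bad, is_bad):
    """Return the lowest revision in (good, bad] for which is_bad() is true.

    Assumes revision `good` works, revision `bad` is broken, and that the
    bug, once introduced, stays present in every later revision.
    """
    if good >= bad:
        raise ValueError("good revision must precede bad revision")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid      # bug already present: look at earlier revisions
        else:
            good = mid     # still fine: look at later revisions
    return bad


# Hypothetical usage: pretend the bug was introduced in revision 1234.
tested = []


def is_bad(rev):
    # In practice: switch the working copy to `rev`, rebuild and test.
    tested.append(rev)
    return rev >= 1234


culprit = first_bad_revision(1000, 2000, is_bad)
```

With a thousand candidate revisions, the sketch needs at most ten builds instead of up to a thousand.
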
\section{Web tools}
On our project website\cite{helenos}, we provided links to several
web utilities that either gave access to our Subversion repository
or mailing list or provided other services:

\begin{description}
\item [Chora] is a part of the Horde framework and can be used to comfortably
browse a Subversion repository from the web. We altered it slightly to also
show the number of commits per developer on our homepage.

\item [Whups] is another component of the Horde framework. It provides
feature request and bug tracking. However, being a rather
closed group of people, we used this tool only seldom. On the other hand,
any prospective beta tester of our operating system has had a chance to
submit bug reports.

\item [Mailman] is a web interface to the mailing list we used. It allows
users to manage subscriptions and search mailing list archives on-line.
\end{description}

\section{Third party components of HelenOS}
HelenOS itself contains third-party software. First of all, the amd64 and ia32 architectures
make use of the GNU GRUB boot loader. This software replaced the original limited boot loader
after the Kernel Camp 2005, when {\MD} had made HelenOS compliant with the Multiboot specification. Thanks to
GRUB, HelenOS can be booted from several types of devices. More importantly, we use
GRUB to load HelenOS userspace modules as well.

Another third-party piece of the HelenOS operating system is the userspace {\tt malloc()}.
Rather than porting our kernel slab allocator to userspace, we chose Doug Lea's public
domain {\tt dlmalloc}. This allocator was easy to integrate into our uspace tree
and has proven itself in other projects as well. Its derivative, {\tt ptmalloc}, has been part of the
GNU C library for some time. However, the version we are using is not optimized for SMP and multithreading.
We plan to eventually replace it with another allocator.

Finally, the {\tt pci} userspace task uses the {\tt libpci} library. The
library was simplified and ported to HelenOS. Even though filesystem
calls were removed from the library, it still depends heavily on {\tt libc}.
By porting {\tt libpci} to HelenOS, we demonstrated that applications and libraries
are, given enough effort, portable to HelenOS.

\section{Build tools}
The assembler, linker and compiler are the focal point of attention
of every operating system project. The quality of these tools influences
operating system performance and, more importantly, stability. HelenOS has
been tailored to build with GNU {\tt binutils} (i.e. the assembler and linker) and GNU~{\tt gcc}
(i.e. the compiler). There is only a small chance that it could be compiled and
linked using other tools unless those tools are compatible with the GNU build tools.

As our project declares support for five different processor architectures,
we needed to have five different flavors of the build utilities installed.
Interestingly, the flavors of {\tt binutils} and {\tt gcc} for the individual architectures
do not have the same requirements when it comes to installing the cross-binutils and cross-compiler.
All platforms except ia64 require only the {\tt binutils} package and the {\tt gcc} package
for the cross-tools to be built. On the other hand, ia64 also requires some excerpts from
the ia64-specific part of {\tt glibc}.

Formerly, the project could be compiled with almost any version of {\tt binutils} starting with 2.15
and {\tt gcc} starting with 2.95. However, especially since we added partial thread local storage
support to our userspace layer, some architectures (e.g. mips32) will not compile even with {\tt gcc} 4.0.1
and demand {\tt gcc} 4.1.0. Curiously, ia64 will not link when compiled with {\tt gcc} 4.1.0.

As for the mips32 cross-compiler, {\OP} discovered a bug in {\tt gcc} (ticket \#23824) which caused {\tt gcc} to
incorrectly generate unaligned data access instructions (i.e. {\tt lwl}, {\tt lwr}, {\tt swl} and {\tt swr}).

As for the mips32 cross-binutils\footnote{It remains uninvestigated whether this problem also shows with other cross-tools.},
we observed that undefined symbols are not reported when we do not link using the standard target. We are still not
sure whether this was a bug; the {\tt binutils} developers just told us to use the standard target and then use
{\tt objcopy} to convert the ELF binary into the requested output format.

\section{Virtual environments}
After the build tools, simulators, emulators and virtualizers were the second focal point
of our project. These invaluable programs greatly sped up the code-compile-test cycle.
In some cases, they were, and still are, the only way to actually run HelenOS on certain
processor architectures, because real hardware was not available to us. Using a virtual environment
for developing our system provided us with a deterministic environment in which troubleshooting is much
easier. Moreover, some of the simulators featured integrated debugging facilities.
Without them, a lot of bugs would have remained unresolved or even gone unnoticed.

Using several virtual environments for testing one architecture is well justified by the
fact that sometimes HelenOS would run on two and crash on a third or vice versa. Sometimes
we found that it ran on real hardware but failed in a simulator. The opposite case was,
however, more common. Simply put, the more configurations, no matter whether real or virtual,
the better.

On the one hand, we have tested our system on eight different virtual environments:

\begin{itemize}
\item Bochs,
\item GXemul,
\item msim,
\item PearPC,
\item QEMU,
\item Simics,
\item Ski,
\item VMware.
\end{itemize}

On the other hand, we have tested these programs with our operating system.
Because of the scope and uniqueness of this testing and because we did find some issues,
we want to dedicate some more space to what we have found.

\subsection{Bochs}
Bochs has been used to develop the SPARTAN kernel since its beginning in 2001.
It is capable of emulating an ia32 machine and, for some time now, also amd64.
Bochs is an emulator and thus the slowest of the virtual environments capable
of simulating the same category of hardware. On the other hand, it is extremely
portable compared to the much faster virtualizers and emulators that use dynamic translation
of instructions. Lately, plans to develop or port dynamic translation
to Bochs have been brewing in its developer community.

The biggest virtue of Bochs is that it has traditionally supported SMP. For some time, Bochs
was the only environment in which we could develop and test SMP code. Unfortunately,
the quality of SMP support in Bochs varied from version to version. Because of SMP
breakage in Bochs, we had to avoid some of its versions. So far, Bochs versions 2.2.1 and 2.2.6
have been the best in this regard.

Our project has not only used Bochs. We also helped to identify some SMP-related problems,
and {\OP} from our team discovered and fixed a bug in the FXSAVE and FXRSTOR emulation
(patch \#1282033).

Bochs has some debugging facilities but those have been very impractical and broken
in SMP mode. Moreover, it is possible to use the GNU debugger {\tt gdb} to connect to a running
simulation, but this has also proven not very useful as we often needed to debug
problems that existed only in multiprocessor configurations, which {\tt gdb}
does not understand.

\subsection{GXemul}
GXemul is an emulator of several processor architectures. Nevertheless, we have
used it only for mips32 emulation in both little-endian and big-endian modes.
It seems to be feature-rich and still evolving, but we do not use all its functionality.
GXemul is very user friendly and has debugging features. It is more realistic
than msim. However, our newly introduced TLS support triggered a bug in its emulation of the {\tt rdhwr}
instruction, while msim functioned as expected. Fortunately, the author
of GXemul is very cooperative and has fixed the problem for future versions as well as
provided a quick workaround for the old version.

\subsection{msim}
msim was our first mips32 simulator. It simulates the 32-bit side of the R4000 processor.
Its simulated environment is not very realistic, but the processor simulation
is good enough for operating system development. In this regard, the simulator is
comparable to HP's ia64 simulator Ski. Another aspect the two have in common is
a relatively strong debugger.

msim has been developed at the same alma mater as our own project.
All members of our team know this program from operating system courses.
Curiously, this simulator contained the largest number of defects and inaccuracies
that we have ever discovered in a simulator. Fortunately, all of them have
eventually been fixed.

\subsection{PearPC}
PearPC is the only emulator on which we have run the ppc32 port of HelenOS. It has
no debugging features, but fortunately its sources are available under
an open source license. This enabled {\OP} and {\MD} to modify its sources
so that the modified version allowed some basic debugging.

\subsection{QEMU}
QEMU emulates several processor architectures. We have used it to emulate
ia32 and amd64. Like Bochs, it can simulate SMP, but unlike Bochs, it uses dynamic
translation of emulated instructions and performs much better because of
that.

This emulator seemed to emulate the {\tt hlt} instruction realistically,
which was nice for those of us who use notebooks as their development
machines.

As with Bochs, a QEMU simulation can be debugged with {\tt gdb}. Debugging
with {\tt gdb} can be pretty comfortable\footnote{Especially when the kernel is
compiled with {\tt -g3}.} until one needs to debug an SMP kernel running on multiple
processors.

\subsection{Simics}
Virtutech's Simics simulator can be compared to a Swiss Army knife for operating system debugging.
This proprietary piece of software was available to us for free under an academic license.

Simics can be set up to simulate many different configurations of many different machines.
It has the most advanced debugging features we have ever seen. To highlight some, its
memory access tracing has been really helpful to us. During device driver
development, we appreciated the possibility to set the logging of individual devices to a specified
verbosity.

We used it to develop and test the amd64 and ia32 architectures in SMP mode and the mips32 architecture in UP mode.

Despite its invaluable qualities, it has still contained bugs. One of the most
serious was the bug with ticket \#3351: {\OP} discovered that its BIOS overwrites kernel memory
while the application processors are starting. Other bugs we found were related to amd64 and mips32.
As for amd64, Simics did not report a general protection fault when {\tt EFER.NXE} was 0 and a non-executable
page was found (\#4214). As for mips32, Simics misemulated the {\tt MSUB} and {\tt MSUBU} instructions.

\subsection{Ski}
The ia64 port of HelenOS has been developed and debugged on HP's IA-64 simulator Ski.
Ski is just an Itanium processor simulator and as such does not simulate a real machine. In fact, there
is no firmware and no configuration tables (e.g. memory map) present in Ski! On the other hand, the missing parts can be supplied externally\footnote{This
is actually how Linux runs in this simulator.}. The simulator provides means of interaction with
host system devices via Simulator System Calls (SSC). The simulator itself has a graphical interface
with pretty powerful debugging facilities, though not as good as those of Simics.

Ski is a proprietary program with no source code available. Its binaries are available
free of charge under a non-free license. It comes with insufficient documentation,
which makes development rather problematic. For instance, there is no public documentation
of all the SSCs. All one can do is look into the Linux/ia64-Ski port, which was written by the
same people as Ski, and use it as a reference. We had to look into Linux once more when our kernel
started to fail in some memory-intensive stress tests. The problem turned out to be that the tests
hit the IA-32 legacy video RAM area. In the absence of any memory map, we fixed the problem by blacklisting
this piece of memory in our frame allocator.

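Blacklisting a piece of memory can be pictured as withholding every frame that overlaps a reserved physical range from the allocator. The sketch below is illustrative only and is not the HelenOS frame allocator: the helper name {\tt usable\_frames} is hypothetical, and both the frame size and the address range used for the legacy video RAM window are assumptions chosen for the example.

```python
FRAME_SIZE = 0x4000  # assumed frame size (16 KiB), for illustration only

# Assumed location of the legacy video RAM window, as a [start, end) range.
VIDEO_RAM = (0xA0000, 0xC0000)


def usable_frames(mem_start, mem_end, blacklist):
    """Yield base addresses of frames in [mem_start, mem_end) that do not
    overlap any blacklisted [start, end) range."""
    base = mem_start
    while base + FRAME_SIZE <= mem_end:
        frame_end = base + FRAME_SIZE
        overlaps = any(base < b_end and b_start < frame_end
                       for b_start, b_end in blacklist)
        if not overlaps:
            yield base
        base = frame_end


# Frames of the first megabyte that may be handed out by the allocator.
frames = list(usable_frames(0x0, 0x100000, [VIDEO_RAM]))
```

Keeping the blacklist as a list of ranges makes it easy to exclude further areas later without touching the allocator itself.
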

HelenOS is booted on Ski by simply loading its ELF image
and jumping to it. For each segment, the ELF program headers contain two fields describing where to load it into memory:
VMA and LMA. VMA\footnote{Virtual Memory Address} is the address at which the program's segment gets mapped in virtual memory.
LMA\footnote{Load Memory Address} is the physical address at which the segment is loaded in memory. {\JV} discovered
that Ski confuses VMA and LMA. This, which we believe to be a bug in Ski, has not shown up in Linux since Linux always has
LMA equal to VMA. People from the Ski mailing list tried to help us, but our repeated problem reports did not
make it far enough for HP to fix or at least clarify the issue. Finally, we adopted a workaround implemented by {\JJ}
that simply swaps LMA and the program entry point in the kernel ELF image.

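In the ELF file format, the VMA and LMA of a segment are stored in the {\tt p\_vaddr} and {\tt p\_paddr} fields of its program header. The following sketch lists both addresses for every loadable segment of a 64-bit little-endian ELF image, which is one way to check whether the two differ. The function name {\tt load\_segments} is hypothetical, and the demonstration image is built by hand with made-up addresses rather than taken from HelenOS.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # ELF64 file header, little-endian
PHDR = struct.Struct("<IIQQQQQQ")           # ELF64 program header
PT_LOAD = 1                                 # loadable segment type


def load_segments(image):
    """Return (entry, [(vma, lma), ...]) for the PT_LOAD segments of image."""
    fields = EHDR.unpack_from(image, 0)
    ident, entry, phoff = fields[0], fields[4], fields[5]
    phentsize, phnum = fields[9], fields[10]
    # Check the magic number, 64-bit class and little-endian data encoding.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF image")
    segments = []
    for i in range(phnum):
        p_type, _flags, _offset, vaddr, paddr, *_ = PHDR.unpack_from(
            image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append((vaddr, paddr))
    return entry, segments


# Hand-built demonstration image: one loadable segment whose VMA and LMA
# differ. All addresses are made up for the example.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
ehdr = EHDR.pack(ident, 2, 50, 1, 0xE000000004400000, EHDR.size, 0, 0,
                 EHDR.size, PHDR.size, 1, 0, 0, 0)
phdr = PHDR.pack(PT_LOAD, 5, 0, 0xE000000004400000, 0x4400000, 0, 0, 0x4000)
entry, segments = load_segments(ehdr + phdr)
```

A loader that treats one of these fields as the other goes unnoticed as long as the two are equal in every segment, which is consistent with the problem not showing up in Linux.
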
\subsection{VMware}
VMware is the only virtualizer we have used in
HelenOS development. It virtualizes the ia32 host machine. Since VMware
version 5.5, we have made use of its ability to run the guest system
(i.e. HelenOS) on multiple processors. VMware has no support for
debugging but is very useful for compatibility and regression testing
because it is the closest to real hardware. VMware, being a virtualizer,
is also the fastest of all the virtual environments we have used.