\chapter{Software}
\label{tools}

During the development of the HelenOS operating system, we came across
several types of software tools, programs, utilities and libraries.
Some of the tools were used to develop the system itself, while others
were used to facilitate the development process. In some cases, we had a chance
to try out several versions of the same product. Sometimes the new versions
contained fixes for bugs we had discovered in previous versions.

Another group of software has been integrated into HelenOS
to provide functionality that the original HelenOS code did
not provide itself.

There is simply too much third-party software related in some way to
HelenOS to cover it all. This chapter attempts to present our experience
with the key software tools, programs and libraries.

\section{Communication tools}
Although the developers know each other in person, the development, with the
exception of kernel camps, has been largely independent of place
and time. In order to work effectively, we have established several communication
channels:

\begin{description}
\item [E-mail] We used this basic means of electronic communication for one-to-one
discussion in cases when the other person could not be reached on-line at
the time their advice or attention was needed. E-mail was also
used for contacting developers of third-party software that we needed to talk to.

\item [Mailing list] Like almost every open source project before us, we set up
a mailing list for technical discussion. The advantage of a mailing list is
that it enables multilateral discussion of several topics simultaneously,
without the need for all the participants to be on-line or even in one place. We have kept
our first development mailing list closed to the public, so it seemed natural to us
to use Czech as the communication language on the list: Czech is, with one exception,
our native language and all of us speak it very well. Besides the advantages,
there are also disadvantages. First, communication over a mailing list tends to be rather
slow compared, for instance, to ICQ. Second, because a message to the list is addressed
to everyone and to no one in particular, a question sometimes never receives an answer.

Apart from the internal development mailing list, we have also used another mailing list
for commit log messages, which proved handy in keeping developers informed about all changes in
the repository.

Finally, we have also established a public mailing list for communication
about general HelenOS topics in English.

\item [ICQ] Because we divided the whole project into smaller subprojects on which
at most two of the six developers would work together, the need for communication
among all six people was significantly smaller than the need to communicate between the two
developers who cooperated closely on a specific task. For this reason, ICQ was the
channel we used the most.
\end{description}

\section{Concurrent versions systems}
At the very beginning, when the SPARTAN kernel was being developed solely
by \JJ, there was not much sense in using any software for management of
concurrent versions. However, when the number of developers increased to six,
we immediately started to look at the available solutions.

We began with CVS because it is probably the best known concurrent
versions system. We even kept the HelenOS repository in CVS for a short time,
but when we learned about its weaknesses, we sought another solution. Two
weaknesses in particular turned us away from CVS:

\begin{itemize}
\item it is merely a file concurrent versions system (i.e. CVS is
good at managing versions of each separate file in the repository
but has no notion of the project's directory tree as a whole;
in particular, renaming a file while preserving its revision history
is next to impossible),

\item it lacks atomic commits (i.e. should your commit conflict with
a recent commit by another developer, CVS would not abort the whole operation
but would leave the repository in an inconsistent state instead).
\end{itemize}

Being aware of these limitations, we decided to go with Subversion. Subversion
is, simply put, a redesigned CVS with these limitations fixed. We were
already familiar with CVS, so the switch to Subversion was nearly seamless.

As for Subversion itself, it has worked well for us and has met all our
expectations. Nevertheless, a serious problem
occurred sometime in the middle of the development process. Because of some locking
issues related to the default database backend (i.e. {\tt Berkeley DB}),
our Subversion repository put itself into a peculiar state in which it became
effectively inaccessible by any means of standard usage or administration.
To resolve this problem, we had to manually delete orphaned file locks
and switch to the backend called {\tt fsfs}, which does not suffer from this
problem.

Other than that, we are happy users of Subversion. The ability to switch
the entire working copy to a particular revision is a great feature
for debugging. Once, we tracked a bug three months into the past by
moving through revisions until we found the change that caused the bug.
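
The text does not say how the revisions were stepped through. One systematic
way to perform such a search is bisection, which halves the range of suspect
revisions in every step. The following C fragment is only a sketch under that
assumption: the names {\tt first\_bad\_revision} and {\tt example\_is\_bad} are
hypothetical, and the predicate stands in for switching the working copy to
a revision, building it and running the failing test.

\begin{verbatim}
/* Predicate type: returns nonzero if the given revision shows the bug. */
typedef int (*rev_is_bad_fn)(unsigned rev);

/*
 * Return the first revision that shows the bug.
 * Assumptions: `good` does not show the bug, `bad` does, good < bad,
 * and every revision after the culprit shows the bug as well.
 */
unsigned first_bad_revision(unsigned good, unsigned bad,
                            rev_is_bad_fn is_bad)
{
    while (bad - good > 1) {
        unsigned mid = good + (bad - good) / 2;

        if (is_bad(mid))
            bad = mid;   /* culprit is at mid or earlier */
        else
            good = mid;  /* culprit is after mid */
    }
    return bad;
}

/*
 * Hypothetical stand-in for "check out revision rev, build, run test".
 * Here the bug is pretended to have been introduced in revision 57.
 */
int example_is_bad(unsigned rev)
{
    return rev >= 57;
}
\end{verbatim}

With this approach, the number of revisions that have to be built and tested
grows only logarithmically with the size of the searched range.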

\section{Web tools}
On our project website\cite{helenos}, we provided links to several
web utilities that either gave access to our Subversion repository
or mailing list, or provided other services:

\begin{description}
\item [Chora] is a part of the Horde framework and can be used to comfortably
browse a Subversion repository from the web. We altered it slightly to also
show the number of commits per developer on our homepage.

\item [Whups] is another component of the Horde framework. It provides
feature request and bug tracking. However, being a rather
closed group of people, we used this tool only seldom. On the other hand,
any prospective beta tester of our operating system has had a chance to
submit bug reports.

\item [Mailman] is a web interface to the mailing list we utilized. It allows users
to manage subscriptions and search mailing list archives on-line.
\end{description}

\section{Third party components of HelenOS}
HelenOS itself contains third-party software. In the first place, the amd64 and ia32 architectures
make use of the GNU Grub boot loader. This software replaced the original limited boot loader
after Kernel Camp 2005, when {\MD} made HelenOS compliant with the Multiboot specification. Thanks to
Grub, HelenOS can be booted from several types of devices. More importantly, we use
Grub to load HelenOS userspace modules as well.

Another third-party piece of the HelenOS operating system is the userspace {\tt malloc()}.
Rather than porting our kernel slab allocator to userspace, we chose Doug Lea's public
domain {\tt dlmalloc}. This allocator was easy to integrate into our uspace tree
and has proven itself in other projects as well. Its derivative, {\tt ptmalloc}, has been part of the
GNU C library for some time. However, the version we are using is not optimized for SMP and multithreading.
We plan to eventually replace it with another allocator.

Next, the {\tt pci} userspace task uses the {\tt libpci} library. The
library was simplified and ported to HelenOS. Even though filesystem
calls were removed from the library, it still depends heavily on {\tt libc}.
By porting {\tt libpci} to HelenOS, we demonstrated that applications and libraries
are, given enough effort, portable to HelenOS.

Finally, we demonstrated the same idea once more by porting
the BSD game {\tt tetris}, which is over 13 years old, to HelenOS. This particular version
of tetris looks almost the same both on other people's operating systems and on HelenOS.
Similar to {\tt libpci}, {\tt tetris} had to be modified in order to compile and run.
The filesystem calls, as well as the terminal I/O calls, were removed or replaced.

\section{Build tools}
The assembler, linker and compiler are the focal point of attention
of every operating system project. The quality of these tools influences
operating system performance and, more importantly, stability. HelenOS has
been tailored to build with GNU {\tt binutils}\cite{binutils} (i.e. the assembler and linker) and GNU~{\tt gcc}\cite{gcc}
(i.e. the compiler). There is little chance that it could be compiled and
linked using other tools unless those tools are compatible with the GNU build tools.

As our project declares support for five different processor architectures,
we needed to have five different flavors of the build utilities installed.
Interestingly, the flavors of {\tt binutils} and {\tt gcc} for particular architectures
are not equal from the point of view of cross-binutils and cross-compiler installation.
All platforms except ia64 require only the {\tt binutils} package and the {\tt gcc} package
for the cross-tools to be built. On the other hand, ia64 also requires some excerpts from
the ia64-specific part of {\tt glibc}.

Formerly, the project could be compiled with almost any version of {\tt binutils} starting with 2.15
and {\tt gcc} starting with 2.95. However, especially since we added partial thread local storage
support to our userspace layer, some architectures (e.g. mips32) do not compile even with {\tt gcc} 4.0.1
and demand {\tt gcc} 4.1.0 or newer.

As for the mips32 cross-compiler, {\OP} discovered a bug in {\tt gcc} (ticket \#23824) which caused {\tt gcc} to
incorrectly generate unaligned data access instructions (i.e. {\tt lwl}, {\tt lwr}, {\tt swl} and {\tt swr}).

As for the mips32 cross-binutils\footnote{It remains uninvestigated whether this problem also shows with other cross-tools.},
we observed that undefined symbols are not reported when we don't link using the standard target. We are still not
sure whether this was a bug; the {\tt binutils} developers just told us to use the standard target and then use
{\tt objcopy} to convert the ELF binary into the requested output format.

\section{Virtual environments}
After the build tools, simulators, emulators and virtualizers were the second focal point
of our project. These invaluable programs really sped up the code-compile-test cycle.
In some cases, they were, and still are, the only option to actually run HelenOS on certain
processor architectures, because real hardware was not available to us. Using a virtual environment
for developing our system provided us with a deterministic environment in which troubleshooting
is much easier. Moreover, some of the simulators featured integrated debugging facilities.
Without them, a lot of bugs would have remained unresolved or even gone unnoticed.

Using several virtual environments for testing one architecture is well justified by the
fact that sometimes HelenOS would run on two and crash on the third or vice versa. Sometimes
we found that it ran on real hardware but failed in a simulator. The opposite case was,
however, more common. Simply put, the more configurations, no matter whether real or virtual,
the better.

From one point of view, we have tested our system on eight different virtual environments:

\begin{itemize}
\item Bochs,
\item GXemul,
\item msim,
\item PearPC,
\item QEMU,
\item Simics,
\item Ski,
\item VMware.
\end{itemize}

From another point of view, we have tested these programs with our operating system.
Because of the scope and uniqueness of this testing and because we did find some issues,
we want to dedicate some more space to what we have found.

\subsection{Bochs}
Bochs\cite{bochs} has been used to develop the SPARTAN kernel since its beginning in 2001.
It is capable of emulating an ia32 machine and, for some time, also amd64.
Bochs is an emulator and thus the slowest of the virtual environments capable
of simulating the same category of hardware. On the other hand, it is extremely
portable compared to much faster virtualizers and emulators using dynamic translation
of instructions. Lately, there have been some plans brewing in its developer community
to develop or port dynamic translation to Bochs.

The biggest virtue of Bochs is that it has traditionally supported SMP. For some time, Bochs
was the only environment in which we could develop and test SMP code. Unfortunately,
the quality of SMP support in Bochs varied from version to version. Because of SMP
breakage in Bochs, we had to avoid some of its versions. So far, Bochs versions 2.2.1 and 2.2.6
have been the best in this regard.

Our project has not only used Bochs. We also helped to identify some SMP-related problems,
and {\OP} from our team discovered and fixed a bug in FXSAVE and FXRSTOR emulation
(patch \#1282033).

Bochs has some debugging facilities, but those have been very impractical and broken
in SMP mode. Moreover, it is possible to use the GNU debugger {\tt gdb} to connect to a running
simulation, but this has also proven not very useful as we often needed to debug
problems that existed only in multiprocessor configurations, which {\tt gdb}
does not understand.

\subsection{GXemul}
GXemul\cite{gxemul} is an emulator of several processor architectures. Nevertheless, we have
used it only for mips32 emulation in both little-endian and big-endian modes.
It seems to be feature-rich and still evolving, but we don't use all its functionality.
GXemul is very user friendly and has debugging features. It is more realistic
than msim. However, our newly introduced TLS support triggered a bug in the {\tt rdhwr}
instruction emulation while msim functioned as expected. Fortunately, the author
of GXemul is very cooperative and has fixed the problem for future versions as well as
provided a quick hack for the old version.

\subsection{msim}
msim\cite{msim} has been our first mips32 simulator. It simulates the 32-bit side of the R4000 processor.
Its simulated environment is not very realistic, but the processor simulation
is good enough for operating system development. In this regard, the simulator is
comparable to HP's ia64 simulator Ski. Another aspect the two have in common is
a relatively strong debugger.

Msim has been developed at the same alma mater as our own project.
All members of our team know this program from operating system courses.
Curiously, this simulator contained the biggest number of defects and inaccuracies
that we have ever discovered in a simulator. Fortunately, all of them have been
eventually fixed.

\subsection{PearPC}
PearPC\cite{pearpc} is the only emulator on which we have run the ppc32 port of HelenOS. It has
no debugging features, but fortunately its sources are available under
an open source license. This enabled {\OP} and {\MD} to modify the sources
so that the modified version allowed some basic debugging.

\subsection{QEMU}
QEMU\cite{qemu} emulates several processor architectures. We have used it to emulate
ia32 and amd64. It can simulate SMP and, unlike Bochs, it uses dynamic
translation of emulated instructions and performs much better because of
that.

This emulator seemed to realistically emulate the {\tt hlt} instruction,
which was nice for those of us who use notebooks as their development
machines.

Similar to Bochs, QEMU simulation can be aided by {\tt gdb}. Debugging
with {\tt gdb} can be pretty comfortable\footnote{Especially when the kernel is
compiled with {\tt -g3}.} until one needs to debug an SMP kernel running on multiple
processors.

\subsection{Simics}
Virtutech's Simics\cite{simics} simulator can be compared to a Swiss-army knife for operating system debugging.
This proprietary piece of software was available to us under an academic license for free.

Simics can be set to simulate many different configurations of many different machines.
It has the most advanced debugging features we have ever seen. To highlight some, its
memory access tracing ability has been really helpful to us. During device driver
development, we appreciated the possibility to set device logging to a specified
verbosity.

We used it to test and develop the amd64 and ia32 architectures in SMP mode and the mips32 architecture in UP mode. Simics emulates the 4Kc processor on the MIPS architecture.
Unfortunately, this processor does not have the Reserved Instruction exception, which
makes it unusable in an environment with programs using thread local storage.

Regardless of its invaluable qualities, it has still contained bugs. One of the most
serious was the bug with ticket \#3351: {\OP} discovered that its BIOS overwrites kernel memory
while the application processors are being started. Other bugs we found were related to amd64 and mips32.
As for amd64, Simics did not report a general protection fault when {\tt EFER.NXE} was 0 and a non-executable
page was found (\#4214). As for mips32, Simics misemulated the {\tt MSUB} and {\tt MSUBU} instructions.

\subsection{Ski}
The ia64 port of HelenOS has been developed and debugged on HP's IA-64 Ski\cite{ski} simulator.
Ski is just an Itanium processor simulator and as such does not simulate a real machine. In fact, there
is no firmware and no configuration tables (e.g. memory map) present in Ski! On the other hand, the missing parts can be supplied externally\footnote{This
is actually how Linux runs in this simulator.}. The simulator provides means of interaction with
host system devices via Simulator SystemCalls (SSC). The simulator itself has a graphical interface
with debugging facilities that are pretty powerful, though not as good as those of Simics.

Ski is a proprietary program with no source code available. Its binaries are available
for free under a non-free license. It comes packaged with insufficient documentation,
which makes development pretty problematic. For instance, there is no public documentation
of all the SSCs. All one can do is look into the Linux/ia64-Ski port, which was written by the
same people as Ski, and use it as a reference. We had to look into Linux once more when our kernel
started to fail in some memory-intensive stress tests. It turned out that the tests
hit the IA-32 legacy video RAM area. In the absence of any memory map, we fixed the problem by blacklisting
this piece of memory in our frame allocator.
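
The blacklisting idea can be sketched in C as follows. This is not the HelenOS
frame allocator: the function name, the frame size parameter and the address
range are assumptions made for illustration. The range used here is the
conventional PC legacy video memory window starting at {\tt 0xA0000} and ending
just below {\tt 0xC0000}; the text above does not state the exact range that
was blacklisted.

\begin{verbatim}
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds of the legacy video memory window (end is exclusive). */
#define VIDEORAM_BASE  UINT64_C(0xA0000)
#define VIDEORAM_END   UINT64_C(0xC0000)

/*
 * Return true if the physical frame [frame_addr, frame_addr + frame_size)
 * overlaps the blacklisted window and must never be handed out.
 */
bool frame_is_blacklisted(uint64_t frame_addr, uint64_t frame_size)
{
    return frame_addr < VIDEORAM_END &&
           frame_addr + frame_size > VIDEORAM_BASE;
}
\end{verbatim}

An allocator built along these lines would call such a check once, when it
initializes its free frame list, and simply leave the overlapping frames
marked as unavailable.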

HelenOS is booted on Ski by simply loading its ELF image
and jumping to it. Each program header in the ELF image contains two addresses describing where the segment is to be placed in memory:
VMA and LMA. VMA\footnote{Virtual Memory Address} is the address where the program's segment gets mapped in virtual memory.
LMA\footnote{Load Memory Address} is the physical address where the segment is loaded in memory. {\JV} discovered
that Ski confuses VMA and LMA. This behavior, which we believe to be a bug in Ski, has not shown up with Linux since Linux always has
LMA equal to VMA. People on the Ski mailing list tried to help us, but our repeated problem reports did not
make it far enough for HP to fix or at least clarify the issue. Finally, we adopted a workaround implemented by {\JJ}
that simply swaps LMA and the program entry point in the kernel ELF image.
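
In terms of the ELF format, VMA and LMA correspond to the {\tt p\_vaddr} and
{\tt p\_paddr} fields of a program header. The following C sketch uses
a reduced, hypothetical {\tt struct segment} holding just these two fields
instead of the real ELF structures. It counts the segments whose two addresses
differ, which is the situation in which a loader that confuses the two fields
would place a segment at the wrong address. It does not reproduce the actual
workaround.

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical view of an ELF64 program header. */
struct segment {
    uint64_t vaddr;  /* VMA, corresponds to p_vaddr */
    uint64_t paddr;  /* LMA, corresponds to p_paddr */
};

/* Count the segments whose load address differs from the virtual one. */
size_t count_lma_vma_mismatches(const struct segment *segs, size_t n)
{
    size_t mismatches = 0;

    for (size_t i = 0; i < n; i++) {
        if (segs[i].vaddr != segs[i].paddr)
            mismatches++;
    }
    return mismatches;
}
\end{verbatim}

For an image in which every segment has LMA equal to VMA, as the text says is
the case for Linux, the function returns zero and the confusion stays
invisible.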

\subsection{VMware}
VMware\cite{vmware} is the only virtualizer we have used in
HelenOS development. It virtualizes the ia32 host machine. Since VMware
version 5.5, we have made use of its ability to run the guest system
(i.e. HelenOS) on multiple processors. VMware has no support for
debugging but is very useful for compatibility and regression testing
because it is the closest to real hardware. VMware, being a virtualizer,
is also the fastest of all the virtual environments we have utilized.