\chapter{Software}
\label{tools}

During the development of the HelenOS operating system, we came across
several types of software tools, programs, utilities and libraries.
Some of the tools were used to develop the system itself while other tools
were used to facilitate the development process. In some cases, we had a chance
to try out several versions of the same product. Sometimes the new versions
contained fixes for bugs we had discovered in the previous ones.

Another group of software we have used has been integrated into HelenOS
to provide functionality that the genuine HelenOS code did
not provide itself.

There is simply too much third party software related to
HelenOS for all of it to be covered. This chapter presents our experience
with the key software tools, programs and libraries.

\section{Communication tools}
Although the developers know each other in person, the development, with the
exception of kernel camps, has been largely independent with respect to place
and time. In order to work effectively, we have established several communication
channels:

\begin{description}
\item [E-mail] We used this basic means of electronic communication for one-to-one
discussion in cases when the other person could not be reached on-line at
the time his advice or attention was needed. E-mail was also
used for contacting developers of third party software.

\item [Mailing list] Like almost every open source project before us, we opened
a mailing list for technical discussion. The advantage of a mailing list is
that it enables multilateral discussions on several topics at the same time,
without the need for all the participants to be on-line or even in one place. We have kept
our first development mailing list closed to the public, so it seemed natural to us
to use Czech as our communication language on the list since Czech, with one exception,
is our native language and all of us speak it very well. Besides the advantages,
there are also disadvantages. First, communication over a mailing list tends to be rather
slow, compared for instance to ICQ. Second, because of its implicitly collective nature,
it is sometimes so slow that a question never receives an answer.

Apart from the internal development mailing list, we have also used another mailing list
for commit log messages, which proved handy in keeping developers informed about all changes in
the repository.

Finally, we have also established a public mailing list for communication
about general HelenOS topics in English.

\item [ICQ] Because we divided the whole project into smaller subprojects on which
at most two people out of six would work together, the need for communication
among all six people was significantly smaller than the need to communicate between the two
developers who cooperated closely on a specific task. For this reason, we made the biggest
use of ICQ.
\end{description}

\section{Concurrent versions systems}
At the very beginning, when the SPARTAN kernel was being developed solely
by \JJ, there was not much sense in using any software for management of
concurrent versions. However, when the number of developers increased to six,
we immediately started to look for available solutions.

We began with CVS because it is probably the best known concurrent
versions system. We even kept a HelenOS repository in CVS for a short time,
but when we learned about its weaknesses we sought another solution. Two
weaknesses have prevented us from using CVS:

\begin{itemize}
\item it is merely a file concurrent versions system (i.e. CVS is
good at managing versions of each separate file in the repository
but has no notion of the project's directory tree as a whole;
specifically, renaming a file while preserving its revision history
is next to impossible),

\item it lacks atomic commits (i.e. should your commit conflict with
a recent commit of another developer, CVS would not abort the whole operation
but would leave the repository inconsistent instead).
\end{itemize}

Being aware of these limitations, we decided to go with Subversion. Subversion
is, simply put, a redesigned CVS with these limitations fixed. We were
already familiar with CVS so the switch to Subversion was pretty seamless.

As for Subversion itself, it has worked well for us and has met all our
expectations. Despite all its pros, a serious problem
occurred in the middle of the development process. Because of some locking
issues related to the default database backend (i.e. {\tt Berkeley DB}),
our Subversion repository put itself in a peculiar state in which it became
effectively inaccessible by any means of standard usage or administration.
To resolve this problem, we had to manually delete orphaned file locks
and switch to the backend called {\tt fsfs}, which does not suffer from this
problem.

Other than that, we are happy users of Subversion. The ability to switch
the entire working copy to a particular revision is a great feature
for debugging. Once we tracked a bug three months into the past by
moving through revisions until we found the change that caused it.

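The search through revisions mentioned above can be organized as a binary search.
The following C fragment is only a minimal sketch under stated assumptions; it is an
illustration, not a record of how the search was actually carried out. The names
{\tt first\_bad\_revision()}, {\tt is\_bad\_fn} and {\tt bad\_since()} are hypothetical, and the
predicate stands in for switching the working copy to a revision (e.g. with
{\tt svn update -r}), rebuilding and testing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callback: returns true when the bug is present in `rev`.
 * A real predicate would switch the working copy to `rev`, rebuild the
 * system and run a test; none of that is shown here. */
typedef bool (*is_bad_fn)(unsigned rev, void *ctx);

/*
 * Return the first revision in (good, bad] for which is_bad() is true.
 * Assumptions: `good` is known to work, `bad` is known to be broken, and
 * the bug stays present in every revision after it was introduced.
 */
unsigned first_bad_revision(unsigned good, unsigned bad,
    is_bad_fn is_bad, void *ctx)
{
    assert(good < bad);

    while (bad - good > 1) {
        unsigned mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;   /* the culprit is mid or earlier */
        else
            good = mid;  /* the culprit is after mid */
    }

    return bad;
}

/* Example predicate for illustration only: the bug is present in every
 * revision greater than or equal to the one stored in *ctx. */
bool bad_since(unsigned rev, void *ctx)
{
    return rev >= *(const unsigned *) ctx;
}
```

With one known good and one known bad revision, the number of builds needed grows only
logarithmically with the number of revisions between them.
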
\section{Web tools}
On our project website\cite{helenos}, we provided links to different
web utilities that either gave access to our Subversion repository
or mailing list or provided other services:

\begin{description}
\item [Chora] is a part of the Horde framework and can be used to comfortably
browse a Subversion repository from the web. We altered it a little bit to also
show the number of commits per developer on our homepage.

\item [Whups] is another component of the Horde framework. It provides
feature request and bug tracking. However, being a rather
closed group of people, we used this tool only seldom. On the other hand,
any potential beta tester of our operating system has had a chance to
submit bug reports.

\item [Mailman] is a web interface to the mailing list we utilized. It allows
users to manage subscriptions and search the mailing list archives on-line.
\end{description}

\section{Third party components of HelenOS}
HelenOS itself contains third party software. In the first place, the amd64 and ia32 architectures
make use of the GNU GRUB boot loader. This software replaced the original limited boot loader
after the Kernel Camp 2005, when {\MD} made HelenOS compliant with the Multiboot specification. Thanks to
GRUB, HelenOS can be booted from several types of devices. More importantly, we use
GRUB to load HelenOS userspace modules as well.

Another third-party piece of the HelenOS operating system is the userspace {\tt malloc()}.
Rather than porting our kernel slab allocator to userspace, we have chosen Doug Lea's public
domain {\tt dlmalloc}. This allocator was easy to integrate into our uspace tree
and has proven itself in other projects as well. Its derivative, {\tt ptmalloc}, has been part of the
GNU C library for some time. However, the version we are using is not optimized for SMP and multithreading.
We plan to eventually replace it with another allocator.

Next, the {\tt pci} userspace task uses the {\tt libpci} library. The
library was simplified and ported to HelenOS. Even though filesystem
calls were removed from the library, it still depends heavily on {\tt libc}.
By porting {\tt libpci} to HelenOS, we demonstrated that applications and libraries
are, given enough effort, portable to HelenOS.

Finally, we demonstrated the same idea once more by porting
the BSD game {\tt tetris}, which is over 13 years old, to HelenOS. This particular version
of tetris looks almost the same on other operating systems as it does on HelenOS.
Similarly to {\tt libpci}, {\tt tetris} had to be modified in order to compile and run.
The filesystem calls as well as the terminal I/O calls were removed or replaced.

\section{Build tools}
The assembler, linker and compiler are a focal point
of every operating system project. The quality of these tools influences
operating system performance and, more importantly, stability. HelenOS has
been tailored to build with GNU {\tt binutils}\cite{binutils} (i.e. the assembler and linker) and GNU~{\tt gcc}\cite{gcc}
(i.e. the compiler). There is little chance that it could be compiled and
linked using other tools unless those tools are compatible with the GNU build tools.

As our project declares support for five different processor architectures,
we needed to have five different flavors of the build utilities installed.
Interestingly, the flavors of {\tt binutils} and {\tt gcc} for the particular architectures
are not equal from the point of view of cross-binutils and cross-compiler installation.
All platforms except ia64 require only the {\tt binutils} package and the {\tt gcc} package
for the cross-tools to be built. On the other hand, ia64 also requires some excerpts from
the ia64-specific part of {\tt glibc}.

Formerly, the project could be compiled with almost any version of {\tt binutils} starting with 2.15
and {\tt gcc} starting with 2.95. However, especially since we added partial thread local storage
support to our userspace layer, some architectures (e.g. mips32) will not compile even with {\tt gcc} 4.0.1
and demand {\tt gcc} 4.1.0 or newer.

As for the mips32 cross-compiler, {\OP} discovered a bug in {\tt gcc} (ticket \#23824) which caused {\tt gcc} to
incorrectly generate unaligned data access instructions (i.e. {\tt lwl}, {\tt lwr}, {\tt swl} and {\tt swr}).

As for the mips32 cross-binutils\footnote{It remains uninvestigated whether this problem also shows with other cross-tools.},
we observed that undefined symbols are not reported when we do not link using the standard target. We are still not
sure whether this was a bug; the {\tt binutils} developers just told us to use the standard target and then use
{\tt objcopy} to convert the ELF binary into the requested output format.

\section{Virtual environments}
After the build tools, simulators, emulators and virtualizers were the second focal point
of our project. These invaluable programs really sped up the code-compile-test cycle.
In some cases, they were, and still are, the only option to actually run HelenOS on certain
processor architectures, because real hardware was not available to us. Using a virtual environment
for developing our system provided us with a deterministic environment in which troubleshooting
is much easier. Moreover, some of the simulators featured integrated debugging facilities.
Without them, a lot of bugs would have remained unresolved or even gone unnoticed.

Using several virtual environments for testing one architecture is well justified by the
fact that sometimes HelenOS would run on two of them and crash on the third, or vice versa. Sometimes
we found that it runs on real hardware but fails in a simulator. The opposite case was,
however, more common. Simply put, the more configurations, no matter whether real or virtual,
the better.

From one point of view, we have tested our system on eight different virtual environments:

\begin{itemize}
\item Bochs,
\item GXemul,
\item msim,
\item PearPC,
\item QEMU,
\item Simics,
\item Ski,
\item VMware.
\end{itemize}

From another point of view, we have tested these programs with our operating system.
Because of the scope and uniqueness of this testing and because we did find some issues,
we want to dedicate some more space to what we have found.

\subsection{Bochs}
Bochs\cite{bochs} has been used to develop the SPARTAN kernel since its beginning in 2001.
It is capable of emulating an ia32 machine and, for some time, also amd64.
Bochs is an emulator and thus the slowest of the virtual environments capable
of simulating the same category of hardware. On the other hand, it is extremely
portable, compared to much faster virtualizers and emulators using dynamic translation
of instructions. Lately, plans to develop or port dynamic translation
to Bochs have been brewing in its developer community.

The biggest virtue of Bochs is that it has traditionally supported SMP. For some time, Bochs
was the only environment on which we could develop and test SMP code. Unfortunately,
the quality of SMP support in Bochs differed from version to version. Because of SMP
breakage in Bochs, we had to avoid some of its versions. So far, Bochs versions 2.2.1 and 2.2.6
have been the best in this regard.

Our project has not only used Bochs. We also helped to identify some SMP related problems
and {\OP} from our team has discovered and also fixed a bug in FXSAVE and FXRSTOR emulation
(patch \#1282033).

Bochs has some debugging facilities but those have been very impractical and broken
in SMP mode. Moreover, it is possible to use the GNU debugger {\tt gdb} to connect to a running
simulation, but this has also proven not very useful as we often needed to debug
problems that existed only in multiprocessor configurations, which {\tt gdb}
does not understand.

\subsection{GXemul}
GXemul\cite{gxemul} is an emulator of several processor architectures. Nevertheless, we have
used it only for mips32 emulation in both little-endian and big-endian modes.
It seems to be feature-rich and evolving, but we do not use all its functionality.
GXemul is very user friendly and has debugging features. It is more realistic
than msim. However, our newly introduced TLS support triggered a bug in the {\tt rdhwr}
instruction emulation while msim functioned as expected. Fortunately, the author
of GXemul is very cooperative and has fixed the problem for future versions as well as
provided a quick hack for the old version.

\subsection{msim}
msim\cite{msim} has been our first mips32 simulator. It simulates the 32-bit side of the R4000 processor.
Its simulated environment is not very realistic, but the processor simulation
is good enough for operating system development. In this regard, the simulator is
comparable to HP's ia64 simulator Ski. Another aspect the two share is
a relatively strong debugger.

The msim simulator has been developed at the same alma mater as our own project.
All members of our team know this program from operating system courses.
Curiously, this simulator contained the biggest number of defects and inaccuracies
that we have ever discovered in a simulator. Fortunately, all of them have been
eventually fixed.

\subsection{PearPC}
PearPC\cite{pearpc} is the only emulator on which we have run the ppc32 port of HelenOS. It has
no debugging features, but fortunately its sources are available under
an open source license. This enabled {\OP} and {\MD} to modify the sources
so that the modified version allowed some basic debugging.

\subsection{QEMU}
QEMU\cite{qemu} emulates several processor architectures. We have used it to emulate
ia32 and amd64. It can simulate SMP and, in contrast to Bochs, it uses dynamic
translation of emulated instructions and performs much better because of
that.

This emulator seemed to realistically emulate the {\tt hlt} instruction,
which was nice for those of us who use notebooks as their development
machines.

Similarly to Bochs, QEMU simulation can be aided by {\tt gdb}. Debugging
with {\tt gdb} can be pretty comfortable\footnote{Especially when the kernel is
compiled with {\tt -g3}.} until one needs to debug an SMP kernel running on multiple
processors.

\subsection{Simics}
Virtutech's Simics\cite{simics} simulator can be compared to a Swiss-army knife for operating system debugging.
This proprietary piece of software was available to us under an academic license for free.

Simics can be set to simulate many different configurations of many different machines.
It has the most advanced debugging features we have ever seen. To highlight some, its
memory access tracing ability has been really helpful to us. During device driver
development, we appreciated the possibility to set the logging of the devices to a specified
verbosity.

We used it to test and develop the amd64 and ia32 architectures in SMP mode and the mips32 architecture in UP mode. Simics emulates the 4Kc processor on the MIPS architecture.
Unfortunately, this processor does not have the Reserved Instruction exception, which
makes it unusable in an environment with programs using thread local storage.

Regardless of its invaluable qualities, it has still contained bugs. One of the most
serious was the bug with ticket \#3351. {\OP} discovered that its BIOS rewrites kernel memory
during the start of application processors. Other bugs we found were related to amd64 and mips32.
As for amd64, Simics did not report a general protection fault when {\tt EFER.NXE} was 0 and a non-executable
page was found (\#4214). As for mips32, Simics misemulated the {\tt MSUB} and {\tt MSUBU} instructions.

\subsection{Ski}
The ia64 port of HelenOS has been developed and debugged on HP's IA-64 Ski\cite{ski} simulator.
Ski is just an Itanium processor simulator and as such does not simulate a real machine. In fact, there
is no firmware and there are no configuration tables (e.g. a memory map) in Ski! On the other hand, the missing parts can be supplied externally\footnote{This
is actually how Linux runs in this simulator.}. The simulator provides means of interaction with
host system devices via Simulator SystemCalls (SSC). The simulator itself has a graphical interface
with debugging facilities that are pretty powerful, although not as good as those of Simics.

Ski is a proprietary program with no source code available. Its binaries are available
for free under a non-free license. It comes packaged with insufficient documentation,
which makes the development pretty problematic. For instance, there is no public documentation
of all the SSCs. All one can do is to look into the Linux/ia64-Ski port, which was written by the
same people as Ski, and use it as a reference. We had to look into Linux once more when our kernel
started to fail in some memory-intensive stress tests. In fact, the problem was that the tests
hit the IA-32 legacy videoram area. In the absence of any memory map, we fixed the problem by blacklisting
this piece of memory in our frame allocator.

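Blacklisting a memory range in a frame allocator amounts to an overlap test applied
before a frame is made available. The fragment below is a simplified sketch and not
the HelenOS code: the names {\tt frame\_is\_blacklisted()} and {\tt count\_usable\_frames()}
are hypothetical, and the range from {\tt 0xA0000} to {\tt 0xBFFFF} (the conventional PC legacy
video window) is an assumption about which addresses are affected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed range: the conventional PC legacy video window.
 * The range that actually matters in Ski may differ. */
#define VRAM_BASE  UINT64_C(0xA0000)
#define VRAM_SIZE  UINT64_C(0x20000)

/* Return true if [addr, addr + frame_size) overlaps the blacklisted range. */
bool frame_is_blacklisted(uint64_t addr, uint64_t frame_size)
{
    assert(frame_size > 0);
    return addr < VRAM_BASE + VRAM_SIZE && addr + frame_size > VRAM_BASE;
}

/* Count the frames of a zone that may be handed over to the allocator. */
size_t count_usable_frames(uint64_t zone_base, size_t frames,
    uint64_t frame_size)
{
    size_t usable = 0;

    for (size_t i = 0; i < frames; i++) {
        /* Skip every frame that touches the blacklisted range. */
        if (!frame_is_blacklisted(zone_base + i * frame_size, frame_size))
            usable++;
    }

    return usable;
}
```

A real allocator would mark such frames as permanently busy during zone initialization
rather than count them, but the overlap test would be the same.
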
HelenOS is booted on Ski by simply loading its ELF image
and jumping to it. Each program header of the ELF image contains two fields describing where to load the corresponding segment into memory:
VMA and LMA. VMA\footnote{Virtual Memory Address} is the address where the program's segment gets mapped in virtual memory.
LMA\footnote{Load Memory Address} is the physical address where the segment is loaded in memory. {\JV} discovered
that Ski confuses VMA and LMA. This behavior, which we believe to be a bug in Ski, has not shown in Linux since Linux always has
LMA equal to VMA. People from the Ski mailing list tried to help us, but our repeated problem report did not
make it far enough for HP to fix or at least clarify the issue. Finally, we adopted a workaround implemented by {\JJ}
that simply swaps LMA and the program entry point in the kernel ELF image.

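In ELF terms, the VMA and LMA of a segment correspond to the {\tt p\_vaddr} and {\tt p\_paddr}
fields of its program header, and the entry point is the {\tt e\_entry} field of the ELF header.
The workaround can be pictured as exchanging two of these values. The fragment below is a
hedged sketch and not the actual HelenOS tool: {\tt struct kernel\_image\_fields} and
{\tt swap\_lma\_and\_entry()} are hypothetical, and the structure holds only the three relevant
64-bit values instead of the real on-disk layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant ELF64 fields of a kernel image. */
struct kernel_image_fields {
    uint64_t e_entry;  /* program entry point (ELF header) */
    uint64_t p_vaddr;  /* VMA of the loadable segment (program header) */
    uint64_t p_paddr;  /* LMA of the loadable segment (program header) */
};

/* Exchange the LMA and the entry point; the VMA is left untouched. */
void swap_lma_and_entry(struct kernel_image_fields *img)
{
    assert(img != NULL);

    uint64_t tmp = img->p_paddr;
    img->p_paddr = img->e_entry;
    img->e_entry = tmp;
}
```
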
\subsection{VMware}
VMware\cite{vmware} is the only virtualizer we have used in
HelenOS development. It virtualizes the ia32 host machine. Since VMware
version 5.5, we have made use of its ability to run the guest system
(i.e. HelenOS) on multiple processors. VMware has no support for
debugging but is very useful for compatibility and regression testing
because it is the closest to real hardware. VMware, being a virtualizer,
is also the fastest of all the virtual environments we have utilized.