Subversion Repositories HelenOS-doc

Compare Revisions

Rev 130 → Rev 131

/design/trunk/src/ch_ds.xml
56,7 → 56,7
</section>
 
<section>
<title>FIFO queues</title>
<title>FIFO Queues</title>
 
<indexterm>
<primary>FIFO queue</primary>
96,7 → 96,7
</section>
 
<section id="hashtables">
<title>Hash tables</title>
<title>Hash Tables</title>
 
<indexterm>
<primary>hash table</primary>
127,7 → 127,6
 
<title>Generic hash table.</title>
</figure>
 
</section>
 
<section>
147,7 → 146,8
 
<indexterm>
<primary>B-trees</primary>
<secondary> - B+tree</secondary>
 
<secondary>- B+tree</secondary>
</indexterm>
 
<para>HelenOS makes use of a variant of B-tree called B+tree. B+trees, in
/design/trunk/src/ch_intro.xml
15,15 → 15,29
look for interface details there.</para>
 
<section>
<title>How to read this book</title>
<title>How to Read This Book</title>
 
<para><xref linkend="architecture" /> contains overview of the overall HelenOS architecture.</para>
<para><xref linkend="ds" /> describes essential data structures used both in the kernel and in the userspace.</para>
<para><xref linkend="time" /> focuses on time management in the kernel and scheds some light on the userspace source of time.</para>
<para><xref linkend="scheduling" /> is dedicated to threads and the scheduling subsystem.</para>
<para><xref linkend="mm" /> describes memory management of physical and virtual memory.</para>
<para><xref linkend="architecture" /> contains overview of the overall
HelenOS architecture.</para>
 
<para><xref linkend="ds" /> describes essential data structures used both
in the kernel and in the userspace.</para>
 
<para><xref linkend="time" /> focuses on time management in the kernel and
sheds some light on the userspace source of time.</para>
 
<para><xref linkend="scheduling" /> is dedicated to threads and the
scheduling subsystem.</para>
 
<para><xref linkend="mm" /> describes memory management of physical and
virtual memory.</para>
 
<para><xref linkend="ipc" /> deals with the IPC subsystem.</para>
<para><xref linkend="hardware" /> describes facilities that a userspace task can use in order to become a device driver.</para>
<para><xref linkend="archspecs" /> presents some architecture specific issues.</para>
 
<para><xref linkend="hardware" /> describes facilities that a userspace
task can use in order to become a device driver.</para>
 
<para><xref linkend="archspecs" /> presents some architecture specific
issues.</para>
</section>
</chapter>
/design/trunk/src/ch_hardware.xml
2,11 → 2,11
<chapter id="hardware">
<?dbhtml filename="hardware.html"?>
 
<title>Device drivers</title>
<title>Device Drivers</title>
 
<para>Since HelenOS is a microkernel, a framework for supporting userspace
device drivers has been implemented. A typical userspace task acting as a
device driver might need to: </para>
device driver might need to:</para>
 
<itemizedlist>
<listitem>
27,7 → 27,7
</itemizedlist>
 
<section>
<title>Interrupt notifications</title>
<title>Interrupt Notifications</title>
 
<para>Userspace tasks that hold the
<constant>CAP_IRQ_REG</constant> capability can register themselves via
53,7 → 53,7
</section>
 
<section>
<title>Accessing memory and I/O space</title>
<title>Accessing Memory and I/O Space</title>
 
<para>When a task has the <constant>CAP_MEM_MANAGER</constant> capability,
it can use the <constant>SYS_MAP_PHYSMEM</constant> to map regions of
68,7 → 68,7
</section>
 
<section>
<title>Disabling preemption</title>
<title>Disabling Preemption</title>
 
<para>It might be desirable for a device driver to temporarily disable
preemption. Tasks that can do this are required to have the
/design/trunk/src/ch_time.xml
2,7 → 2,7
<chapter id="time">
<?dbhtml filename="time.html"?>
 
<title>Time management</title>
<title>Time Management</title>
 
<para>Time is one of the dimensions in which the kernel, as well as the whole
system, operates. It is of special importance to many kernel subsystems.
13,7 → 13,7
to implement timeouting versions of synchronization primitives.</para>
 
<section>
<title>System clock</title>
<title>System Clock</title>
 
<para>Every hardware architecture supported by HelenOS must support some
kind of a device that can be programmed to yield periodic time signals
84,7 → 84,7
<para>Kernel subsystems can register a callback function to be executed
with a specified delay. Such a registration is represented by a kernel
structure called <classname>timeout</classname>. Timeouts are registered
via the <code>timeout_register()</code> function. This function takes a pointer
via the <code>timeout_register</code> function. This function takes a pointer
to a timeout structure, a callback function, a parameter of the callback
function and a delay in microseconds as parameters. After the structure is
initialized with all these values, it is sorted into the processor's list
91,8 → 91,8
of active timeouts, according to the number of clock interrupts remaining
to their expiration and relatively to already listed timeouts.</para>
 
<para>Timeouts can be unregistered via <code>timeout_unregister()</code>.
This function can, as opposed to <code>timeout_register()</code>, fail when
<para>Timeouts can be unregistered via <code>timeout_unregister</code>.
This function can, as opposed to <code>timeout_register</code>, fail when
it is too late to remove the timeout from the list of active
timeouts.</para>
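
<para>The following fragment is only an illustration of the interface
described above: the parameter order follows the preceding paragraphs,
and the watchdog names as well as the boolean return convention of
<code>timeout_unregister</code> are assumptions, not taken from the
sources.</para>

<programlisting>/* Illustrative sketch of timeout registration; everything except
 * timeout_register/timeout_unregister is a placeholder. */

static void watchdog_expired(void *arg)
{
        /* Runs from the clock interrupt path once the delay elapses. */
        (void) arg;
}

void arm_watchdog(timeout_t *t)
{
        /* Structure, callback, callback argument, delay in microseconds. */
        timeout_register(t, watchdog_expired, NULL, 500000);
}

void disarm_watchdog(timeout_t *t)
{
        /* Unlike registration, unregistration can fail when it is already
         * too late to remove the timeout from the list of active timeouts. */
        if (!timeout_unregister(t)) {
                /* The callback has fired or is about to fire. */
        }
}</programlisting>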
 
107,11 → 107,11
</section>
 
<section>
<title>Generic clock interrupt handler</title>
<title>Generic Clock Interrupt Handler</title>
 
<para>On each clock interrupt, the architecture specific part of the clock
interrupt handler makes a call to the generic clock interrupt handler
implemented by the <code>clock()</code> function. The generic handler takes
implemented by the <code>clock</code> function. The generic handler takes
care of several mission critical goals:</para>
 
<itemizedlist>
128,7 → 128,7
</listitem>
</itemizedlist>
 
<para>The <code>clock()</code> function checks for expired timeouts and
<para>The <code>clock</code> function checks for expired timeouts and
decrements unexpired timeout expiration counters exactly one time more
than the number of missed clock signals (i.e. at least once and
possibly more times, depending on the missed clock signals counter). The
141,7 → 141,7
</section>
 
<section>
<title>Time source for userspace</title>
<title>Time Source for Userspace</title>
 
<para>In HelenOS, userspace tasks don't communicate with the kernel in
order to read the system time. Instead, a mechanism that shares kernel
/design/trunk/src/ch_synchronization.xml
17,7 → 17,7
</section>
 
<section>
<title>Active kernel primitives</title>
<title>Active Kernel Primitives</title>
 
<section>
<indexterm>
72,13 → 72,13
a result, the processor needs to be explicitly told about each
occurrence of such a dependency. Therefore, HelenOS adds
architecture-specific hooks to all <code>spinlock_lock()</code>,
<code>spinlock_trylock()</code> and <code>spinlock_unlock()</code> functions
to prevent the instructions inside the critical section from permeating
out. On some architectures, these hooks can be void because the
dependencies are implicitly there because of the special properties of
locking and unlocking instructions. However, other architectures need to
instrument these hooks with different memory barriers, depending on what
operations could permeate out.</para>
<code>spinlock_trylock()</code> and <code>spinlock_unlock()</code>
functions to prevent the instructions inside the critical section from
permeating out. On some architectures, these hooks can be void because
the dependencies are implicitly there because of the special properties
of locking and unlocking instructions. However, other architectures need
to instrument these hooks with different memory barriers, depending on
what operations could permeate out.</para>
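
<para>A minimal sketch of where such hooks sit follows. The barrier macro
names and the atomic primitives are placeholders chosen for illustration;
only the <code>spinlock_lock()</code> and <code>spinlock_unlock()</code>
names come from the text above.</para>

<programlisting>/* Illustrative only: placement of the architecture-specific hooks
 * around the critical section; the primitives are placeholders. */

void spinlock_lock(spinlock_t *sl)
{
        while (test_and_set(sl))
                ;                   /* busy-wait until the lock word is clear */
        CS_ENTER_BARRIER();         /* arch hook: possibly empty, possibly a memory barrier */
}

void spinlock_unlock(spinlock_t *sl)
{
        CS_LEAVE_BARRIER();         /* arch hook: keep critical-section accesses from leaking out */
        spinlock_release(sl);       /* placeholder: clear the lock word */
}</programlisting>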
 
<para>Spinlocks have one significant drawback: when held for longer time
periods, they harm both parallelism and concurrency. The processor
102,7 → 102,7
</section>
 
<section>
<title>Passive kernel synchronization</title>
<title>Passive Kernel Synchronization</title>
 
<section>
<indexterm>
111,7 → 111,7
<secondary>- wait queue</secondary>
</indexterm>
 
<title>Wait queues</title>
<title>Wait Queues</title>
 
<para>A wait queue is the basic passive synchronization primitive on
which all other passive synchronization primitives are built. Simply
124,8 → 124,8
queue are protected by a spinlock.</para>
 
<para>The thread that wants to wait for a wait queue event uses the
<code>waitq_sleep_timeout()</code> function. The algorithm then checks the
wait queue's counter of missed wakeups and if there are any missed
<code>waitq_sleep_timeout()</code> function. The algorithm then checks
the wait queue's counter of missed wakeups and if there are any missed
wakeups, the call returns immediately. The call also returns immediately
if only a conditional wait was requested. Otherwise the thread is
enqueued in the wait queue's list of sleeping threads and its state is
134,14 → 134,14
 
<orderedlist>
<listitem>
<para>another thread calls <code>waitq_wakeup()</code> and the thread
is the first thread in the wait queue's list of sleeping
<para>another thread calls <code>waitq_wakeup()</code> and the
thread is the first thread in the wait queue's list of sleeping
threads;</para>
</listitem>
 
<listitem>
<para>another thread calls <code>waitq_interrupt_sleep()</code> on the
sleeping thread;</para>
<para>another thread calls <code>waitq_interrupt_sleep()</code> on
the sleeping thread;</para>
</listitem>
 
<listitem>
153,16 → 153,16
 
<para>All five possibilities (immediate return on success, immediate
return on failure, wakeup after sleep, interruption and timeout) are
distinguishable by the return value of <code>waitq_sleep_timeout()</code>.
Being able to interrupt a sleeping thread is essential for externally
initiated thread termination. The ability to wait only for a certain
amount of time is used, for instance, to passively delay thread
execution by several microseconds or even seconds in the
<code>thread_sleep()</code> function. Due to the fact that all other
passive kernel synchronization primitives are based on wait queues, they
also have the option of being interrupted and, more importantly, can
timeout. All of them also implement the conditional operation.
Furthermore, this very fundamental interface reaches up to the
distinguishable by the return value of
<code>waitq_sleep_timeout()</code>. Being able to interrupt a sleeping
thread is essential for externally initiated thread termination. The
ability to wait only for a certain amount of time is used, for instance,
to passively delay thread execution by several microseconds or even
seconds in the <code>thread_sleep()</code> function. Due to the fact that
all other passive kernel synchronization primitives are based on wait
queues, they also have the option of being interrupted and, more
importantly, can timeout. All of them also implement the conditional
operation. Furthermore, this very fundamental interface reaches up to the
implementation of futexes, a userspace synchronization primitive, which
makes it possible for a userspace thread to request a synchronization
operation with a timeout or a conditional operation.</para>
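
<para>The following sketch shows how a caller might distinguish the five
outcomes. The constant names are assumptions that merely mirror the five
cases enumerated above; the timeout value and the flag are likewise
illustrative.</para>

<programlisting>/* Illustrative handling of the outcomes of waitq_sleep_timeout();
 * the ESYNCH_* constants and the flag are placeholders. */

void wait_for_event(waitq_t *wq)
{
        int rc = waitq_sleep_timeout(wq, 1000000, SYNCH_FLAGS_INTERRUPTIBLE);

        switch (rc) {
        case ESYNCH_OK_ATOMIC:     /* a missed wakeup was pending - immediate success */
        case ESYNCH_OK_BLOCKED:    /* slept and was woken by waitq_wakeup() */
                /* Direct hand-off: the event has occurred, no re-checking needed. */
                break;
        case ESYNCH_WOULD_BLOCK:   /* conditional wait requested, nothing pending */
                break;
        case ESYNCH_TIMEOUT:       /* the delay elapsed before any wakeup */
                break;
        case ESYNCH_INTERRUPTED:   /* woken by waitq_interrupt_sleep() */
                break;
        }
}</programlisting>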
169,8 → 169,8
 
<para>From the description above, it should be apparent that when a
sleeping thread is woken by <code>waitq_wakeup()</code> or when
<code>waitq_sleep_timeout()</code> succeeds immediately, the thread can be
sure that the event has occurred. The thread need not and should not
<code>waitq_sleep_timeout()</code> succeeds immediately, the thread can
be sure that the event has occurred. The thread need not and should not
verify this fact. This approach is called direct hand-off and is
characteristic of all passive HelenOS synchronization primitives, with
the exception described below.</para>
187,14 → 187,15
 
<para>The interesting point about wait queues is that the number of
missed wakeups is equal to the number of threads that will not block in
<code>waitq_sleep_timeout()</code> and would immediately succeed instead.
On the other hand, semaphores are synchronization primitives that will
let a predefined number of threads into their critical section and block
any other threads above this count. However, these two cases are exactly
the same. Semaphores in HelenOS are therefore implemented as wait queues
with a single semantic change: their wait queue is initialized to have
as many missed wakeups as the number of threads that the semaphore
intends to let into its critical section simultaneously.</para>
<code>waitq_sleep_timeout()</code> and would immediately succeed
instead. On the other hand, semaphores are synchronization primitives
that will let a predefined number of threads into their critical section
and block any other threads above this count. However, these two cases
are exactly the same. Semaphores in HelenOS are therefore implemented as
wait queues with a single semantic change: their wait queue is
initialized to have as many missed wakeups as the number of threads
that the semaphore intends to let into its critical section
simultaneously.</para>
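
<para>A sketch of this semantic change follows; the field and helper
names are placeholders, the only point being that the initial value of
the semaphore becomes the wait queue's count of missed wakeups.</para>

<programlisting>/* Illustrative semaphore built on a wait queue; names are placeholders. */

typedef waitq_t semaphore_t;

void semaphore_initialize(semaphore_t *s, int value)
{
        waitq_initialize(s);
        s->missed_wakeups = value;   /* let 'value' threads pass without blocking */
}

void semaphore_down(semaphore_t *s)
{
        /* Consumes a missed wakeup if one is pending, otherwise blocks. */
        waitq_sleep_timeout(s, SYNCH_NO_TIMEOUT, SYNCH_FLAGS_NONE);
}

void semaphore_up(semaphore_t *s)
{
        /* Wakes the first sleeper, or records another missed wakeup. */
        waitq_wakeup(s, WAKEUP_FIRST);
}</programlisting>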
 
<para>In the semaphore language, the wait queue operation
<code>waitq_sleep_timeout()</code> corresponds to semaphore
228,7 → 229,7
</section>
 
<section>
<title>Reader/writer locks</title>
<title>Reader/Writer Locks</title>
 
<indexterm>
<primary>synchronization</primary>
281,8 → 282,8
<para>The implementation of rwlocks, as already described, makes
use of a single wait queue for both readers and writers, thus avoiding
any possibility of starvation. In fact, rwlocks use a mutex rather than
a bare wait queue. This mutex is called <emphasis>exclusive</emphasis> and is
used to synchronize writers. The writer's lock operation,
a bare wait queue. This mutex is called <emphasis>exclusive</emphasis>
and is used to synchronize writers. The writer's lock operation,
<code>rwlock_write_lock_timeout()</code>, simply tries to acquire the
exclusive mutex. If it succeeds, the writer is granted the rwlock.
However, if the operation fails (e.g. times out), the writer must check
292,8 → 293,8
 
<para>The exclusive mutex plays an important role in reader
synchronization as well. However, a reader doing the reader's lock
operation, <code>rwlock_read_lock_timeout()</code>, may bypass this mutex
when it detects that:</para>
operation, <code>rwlock_read_lock_timeout()</code>, may bypass this
mutex when it detects that:</para>
 
<orderedlist>
<listitem>
315,7 → 316,7
</section>
 
<section>
<title>Condition variables</title>
<title>Condition Variables</title>
 
<indexterm>
<primary>synchronization</primary>
352,22 → 353,23
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting>
</example>
 
<para>The wait operation, <code>condvar_wait_timeout()</code>, always puts
the calling thread to sleep. The thread then sleeps until another thread
invokes <code>condvar_broadcast()</code> on the same condition variable or
until it is woken up by <code>condvar_signal()</code>. The
<code>condvar_signal()</code> operation unblocks the first thread blocking
on the condition variable while the <code>condvar_broadcast()</code>
operation unblocks all threads blocking there. If there are no blocking
threads, these two operations have no effect.</para>
<para>The wait operation, <code>condvar_wait_timeout()</code>, always
puts the calling thread to sleep. The thread then sleeps until another
thread invokes <code>condvar_broadcast()</code> on the same condition
variable or until it is woken up by <code>condvar_signal()</code>. The
<code>condvar_signal()</code> operation unblocks the first thread
blocking on the condition variable while the
<code>condvar_broadcast()</code> operation unblocks all threads blocking
there. If there are no blocking threads, these two operations have no
effect.</para>
 
<para>Note that the threads must synchronize over a dedicated mutex. To
prevent a race condition between <code>condvar_wait_timeout()</code> and
<code>condvar_signal()</code> or <code>condvar_broadcast()</code>, the mutex
is passed to <code>condvar_wait_timeout()</code> which then atomically
puts the calling thread to sleep and unlocks the mutex. When the thread
eventually wakes up, <code>condvar_wait()</code> regains the mutex and
returns.</para>
<code>condvar_signal()</code> or <code>condvar_broadcast()</code>, the
mutex is passed to <code>condvar_wait_timeout()</code> which then
atomically puts the calling thread to sleep and unlocks the mutex. When
the thread eventually wakes up, <code>condvar_wait()</code> regains the
mutex and returns.</para>
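
<para>The waiter/signaller interplay described above can be sketched as
follows. The shared flag and the structure of the two routines are made
up for illustration; only the function names come from the text.</para>

<programlisting>/* Illustrative condition variable usage; 'ready' is a made-up predicate. */

void waiter(mutex_t *mtx, condvar_t *cv, bool *ready)
{
        mutex_lock(mtx);
        while (!*ready)                 /* re-check the predicate after every wakeup */
                condvar_wait_timeout(cv, mtx, SYNCH_NO_TIMEOUT);
        /* ... consume the event while still holding the mutex ... */
        mutex_unlock(mtx);
}

void signaller(mutex_t *mtx, condvar_t *cv, bool *ready)
{
        mutex_lock(mtx);
        *ready = true;                  /* change the shared state under the mutex */
        condvar_signal(cv);             /* or condvar_broadcast(cv) to wake all waiters */
        mutex_unlock(mtx);
}</programlisting>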
 
<para>Also note that there is no conditional operation for condition
variables. Such an operation would make no sense since condition
400,8 → 402,8
 
<orderedlist>
<listitem>
<para><code>waitq_sleep_prepare()</code> prepares the thread to go to
sleep by, among other things, locking the wait queue;</para>
<para><code>waitq_sleep_prepare()</code> prepares the thread to go
to sleep by, among other things, locking the wait queue;</para>
</listitem>
 
<listitem>
416,9 → 418,9
</listitem>
</orderedlist>
 
<para>The stock <code>waitq_sleep_timeout()</code> is then a mere wrapper
that calls these three functions. It is provided for convenience in
cases where the caller doesn't require such low-level control.
<para>The stock <code>waitq_sleep_timeout()</code> is then a mere
wrapper that calls these three functions. It is provided for convenience
in cases where the caller doesn't require such low-level control.
However, the implementation of <code>condvar_wait_timeout()</code> does
need this finer-grained control because it has to interleave calls to
these functions by other actions. It carries its operations out in the
454,7 → 456,7
</section>
 
<section>
<title>Userspace synchronization</title>
<title>Userspace Synchronization</title>
 
<section>
<title>Futexes</title>
516,11 → 518,11
If the counter decreased to 0, then there was no contention and the
thread can enter the critical section protected by the futex. When the
thread later leaves that critical section, it, using the library function
<code>futex_up()</code>, atomically increments the counter. If the counter
value increased to one, then there again was no contention and no action
needs to be done. However, if it increased to zero or even a smaller
number, then there are sleeping threads waiting for the futex to become
available. In that case, the library has to use the
<code>futex_up()</code>, atomically increments the counter. If the
counter value increased to one, then there again was no contention and
no action needs to be done. However, if it increased to zero or even a
smaller number, then there are sleeping threads waiting for the futex to
become available. In that case, the library has to use the
<constant>SYS_FUTEX_WAKEUP</constant> syscall to wake one sleeping
thread.</para>
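
<para>The userspace side of this protocol might look like the following
sketch. The atomic primitives, the <code>futex_t</code> layout and the
sleep syscall wrapper are placeholders; only <code>futex_up</code> and
<constant>SYS_FUTEX_WAKEUP</constant> are taken from the text.</para>

<programlisting>/* Illustrative futex wrappers; primitives and syscall wrappers are
 * placeholders mirroring the description above. */

void futex_down(futex_t *futex)
{
        /* Atomically decrement the counter; a new value of 0 means the
         * futex was uncontended and the caller may proceed. */
        if (atomic_predec(futex) != 0)
                sys_futex_sleep(futex);     /* contended: sleep in the kernel */
}

void futex_up(futex_t *futex)
{
        /* Atomically increment the counter; a new value of 1 means that
         * no thread was waiting. */
        if (atomic_preinc(futex) != 1)
                sys_futex_wakeup(futex);    /* SYS_FUTEX_WAKEUP: wake one sleeper */
}</programlisting>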
 
/design/trunk/src/ch_arch_overview.xml
2,7 → 2,7
<chapter id="architecture">
<?dbhtml filename="arch.html"?>
 
<title>Architecture overview</title>
<title>Architecture Overview</title>
 
<para>The HelenOS operating system is designed as a relatively small
microkernel assisted by a set of userspace drivers and server tasks.
76,7 → 76,7
</section>
 
<section>
<title>Memory management</title>
<title>Memory Management</title>
 
<para>Memory management is another large subsystem in HelenOS. It serves
the kernel to satisfy its own memory allocation requests, provides
/design/trunk/src/ch_scheduling.xml
73,7 → 73,7
spaces on hardware level (i.e. ASIDs and page tables pointers).</para>
 
<section>
<title>Synchronous context switches</title>
<title>Synchronous Context Switches</title>
 
<para>The scheduler, as well as other parts of the kernel, makes heavy use
of synchronous context switches, because it is a natural vehicle not
122,7 → 122,7
possible.</para>
 
<formalpara>
<title>Thread states</title>
<title>Thread States</title>
 
<para>At any given moment, a thread exists in one of six possible thread
states. When the thread is created and first readied into the
146,8 → 146,7
 
<mediaobject id="thread_states" xreflabel="">
<imageobject role="pdf">
<imagedata fileref="images/thread_states.pdf"
format="PDF" />
<imagedata fileref="images/thread_states.pdf" format="PDF" />
</imageobject>
 
<imageobject role="html">
155,8 → 154,7
</imageobject>
 
<imageobject role="fop">
<imagedata fileref="images/thread_states.svg"
format="SVG" />
<imagedata fileref="images/thread_states.svg" format="SVG" />
</imageobject>
</mediaobject>
</figure></para>
163,7 → 161,7
</formalpara>
 
<formalpara>
<title>Pseudo threads</title>
<title>Pseudo Threads</title>
 
<para>The HelenOS userspace layer supports even smaller units of execution.
Each userspace thread can make use of an arbitrary number of pseudo
194,7 → 192,7
<title>Scheduler</title>
 
<section>
<title>Run queues</title>
<title>Run Queues</title>
 
<para>There is an array of several run queues on each processor. The
current version of HelenOS uses 16 run queues implemented by 16 doubly
206,7 → 204,7
</section>
 
<section>
<title>Scheduler operation</title>
<title>Scheduler Operation</title>
 
<para>The scheduler is invoked either explicitly when a thread calls the
<code>scheduler</code> function (e.g. goes to sleep or merely wants to
255,7 → 253,7
</section>
 
<section>
<title>Processor load balancing</title>
<title>Processor Load Balancing</title>
 
<para>Normally, for the sake of cache locality, threads are scheduled on
one of the processors and don't leave it. Nevertheless, a situation in