41,19 → 41,18 |
remains unavailable until this operation called on the respective |
spinlock returns zero. HelenOS builds two functions on top of the |
test-and-set operation. The first function is the unconditional attempt |
to acquire the spinlock and is called |
<emphasis>spinlock_lock</emphasis>. It simply loops until the |
test-and-set returns a zero value. The other function, |
<emphasis>spinlock_trylock</emphasis>, is the conditional lock operation |
and calls the test-and-set only once to find out whether it managed to |
acquire the spinlock or not. The conditional operation is useful in |
situations in which an algorithm cannot acquire more spinlocks in the |
proper order and a deadlock cannot be avoided. In such a case, the |
algorithm would detect the danger and instead of possibly deadlocking |
the system it would simply release some spinlocks it already holds and |
retry the whole operation with the hope that it will succeed next time. |
The unlock function, <emphasis>spinlock_unlock</emphasis>, is quite easy |
- it merely clears the spinlock variable.</para> |
to acquire the spinlock and is called <code>spinlock_lock</code>. It |
simply loops until the test-and-set returns a zero value. The other |
function, <code>spinlock_trylock</code>, is the conditional lock |
operation and calls the test-and-set only once to find out whether it |
managed to acquire the spinlock or not. The conditional operation is |
useful in situations in which an algorithm cannot acquire more spinlocks |
in the proper order and a deadlock cannot be avoided. In such a case, |
the algorithm would detect the danger and instead of possibly |
deadlocking the system it would simply release some spinlocks it already |
holds and retry the whole operation with the hope that it will succeed |
next time. The unlock function, <code>spinlock_unlock</code>, is quite |
easy - it merely clears the spinlock variable.</para> |
|
<para>Nevertheless, there is a special issue related to hardware |
optimizations that modern processors implement. Particularly problematic |
66,20 → 65,19 |
respective spinlock is not implicit on some processor architectures. As |
a result, the processor needs to be explicitly told about each |
occurrence of such a dependency. Therefore, HelenOS adds |
architecture-specific hooks to all <emphasis>spinlock_lock</emphasis>, |
<emphasis>spinlock_trylock</emphasis> and |
<emphasis>spinlock_unlock</emphasis> functions to prevent the |
instructions inside the critical section from permeating out. On some |
architectures, these hooks can be void because the dependencies are |
implicitly there because of the special properties of locking and |
unlocking instructions. However, other architectures need to instrument |
these hooks with different memory barriers, depending on what operations |
could permeate out.</para> |
architecture-specific hooks to all <code>spinlock_lock</code>, |
<code>spinlock_trylock</code> and <code>spinlock_unlock</code> functions |
to prevent the instructions inside the critical section from permeating |
out. On some architectures, these hooks can be void because the |
dependencies are implicitly there because of the special properties of |
locking and unlocking instructions. However, other architectures need to |
instrument these hooks with different memory barriers, depending on what |
operations could permeate out.</para> |
|
<para>Spinlocks have one significant drawback: when held for longer time |
periods, they harm both parallelism and concurrency. The processor |
executing <emphasis>spinlock_lock</emphasis> does not do any fruitful |
work and is effectively halted until it can grab the lock and proceed. |
executing <code>spinlock_lock</code> does not do any fruitful work and |
is effectively halted until it can grab the lock and proceed. |
Similarily, other execution flows cannot execute on the processor that |
holds the spinlock because the kernel disables preemption on that |
processor when a spinlock is held. The reason behind disabling |
114,25 → 112,24 |
queue are protected by a spinlock.</para> |
|
<para>The thread that wants to wait for a wait queue event uses the |
<emphasis>waitq_sleep_timeout</emphasis> function. The algorithm then |
checks the wait queue's counter of missed wakeups and if there are any |
missed wakeups, the call returns immediately. The call also returns |
immediately if only a conditional wait was requested. Otherwise the |
thread is enqueued in the wait queue's list of sleeping threads and its |
state is changed to <emphasis>Sleeping</emphasis>. It then sleeps until |
one of the following events happens:</para> |
<code>waitq_sleep_timeout</code> function. The algorithm then checks the |
wait queue's counter of missed wakeups and if there are any missed |
wakeups, the call returns immediately. The call also returns immediately |
if only a conditional wait was requested. Otherwise the thread is |
enqueued in the wait queue's list of sleeping threads and its state is |
changed to <constant>Sleeping</constant>. It then sleeps until one of |
the following events happens:</para> |
|
<orderedlist> |
<listitem> |
<para>another thread calls <emphasis>waitq_wakeup</emphasis> and the |
thread is the first thread in the wait queue's list of sleeping |
<para>another thread calls <code>waitq_wakeup</code> and the thread |
is the first thread in the wait queue's list of sleeping |
threads;</para> |
</listitem> |
|
<listitem> |
<para>another thread calls |
<emphasis>waitq_interrupt_sleep</emphasis> on the sleeping |
thread;</para> |
<para>another thread calls <code>waitq_interrupt_sleep</code> on the |
sleeping thread;</para> |
</listitem> |
|
<listitem> |
144,28 → 141,27 |
|
<para>All five possibilities (immediate return on success, immediate |
return on failure, wakeup after sleep, interruption and timeout) are |
distinguishable by the return value of |
<emphasis>waitq_sleep_timeout</emphasis>. Being able to interrupt a |
sleeping thread is essential for externally initiated thread |
termination. The ability to wait only for a certain amount of time is |
used, for instance, to passively delay thread execution by several |
microseconds or even seconds in <emphasis>thread_sleep</emphasis> |
function. Due to the fact that all other passive kernel synchronization |
primitives are based on wait queues, they also have the option of being |
interrutped and, more importantly, can timeout. All of them also |
implement the conditional operation. Furthemore, this very fundamental |
interface reaches up to the implementation of futexes - userspace |
synchronization primitive, which makes it possible for a userspace |
thread to request a synchronization operation with a timeout or a |
conditional operation.</para> |
distinguishable by the return value of <code>waitq_sleep_timeout</code>. |
Being able to interrupt a sleeping thread is essential for externally |
initiated thread termination. The ability to wait only for a certain |
amount of time is used, for instance, to passively delay thread |
execution by several microseconds or even seconds in |
<code>thread_sleep</code> function. Due to the fact that all other |
passive kernel synchronization primitives are based on wait queues, they |
also have the option of being interrutped and, more importantly, can |
timeout. All of them also implement the conditional operation. |
Furthemore, this very fundamental interface reaches up to the |
implementation of futexes - userspace synchronization primitive, which |
makes it possible for a userspace thread to request a synchronization |
operation with a timeout or a conditional operation.</para> |
|
<para>From the description above, it should be apparent, that when a |
sleeping thread is woken by <emphasis>waitq_wakeup</emphasis> or when |
<emphasis>waitq_sleep_timeout</emphasis> succeeds immediately, the |
thread can be sure that the event has occurred. The thread need not and |
should not verify this fact. This approach is called direct hand-off and |
is characteristic for all passive HelenOS synchronization primitives, |
with the exception as described below.</para> |
sleeping thread is woken by <code>waitq_wakeup</code> or when |
<code>waitq_sleep_timeout</code> succeeds immediately, the thread can be |
sure that the event has occurred. The thread need not and should not |
verify this fact. This approach is called direct hand-off and is |
characteristic for all passive HelenOS synchronization primitives, with |
the exception as described below.</para> |
</section> |
|
<section> |
173,24 → 169,23 |
|
<para>The interesting point about wait queues is that the number of |
missed wakeups is equal to the number of threads that will not block in |
<emphasis>watiq_sleep_timeout</emphasis> and would immediately succeed |
instead. On the other hand, semaphores are synchronization primitives |
that will let predefined amount of threads into their critical section |
and block any other threads above this count. However, these two cases |
are exactly the same. Semaphores in HelenOS are therefore implemented as |
wait queues with a single semantic change: their wait queue is |
initialized to have so many missed wakeups as is the number of threads |
that the semphore intends to let into its critical section |
simultaneously.</para> |
<code>watiq_sleep_timeout</code> and would immediately succeed instead. |
On the other hand, semaphores are synchronization primitives that will |
let predefined amount of threads into their critical section and block |
any other threads above this count. However, these two cases are exactly |
the same. Semaphores in HelenOS are therefore implemented as wait queues |
with a single semantic change: their wait queue is initialized to have |
so many missed wakeups as is the number of threads that the semphore |
intends to let into its critical section simultaneously.</para> |
|
<para>In the semaphore language, the wait queue operation |
<emphasis>waitq_sleep_timeout</emphasis> corresponds to |
<emphasis><emphasis>semaphore</emphasis> down</emphasis> operation, |
represented by the function <emphasis>semaphore_down_timeout</emphasis> |
and by way of similitude the wait queue operation waitq_wakeup |
corresponds to semaphore <emphasis>up</emphasis> operation, represented |
by the function <emphasis>sempafore_up</emphasis>. The conditional down |
operation is called <emphasis>semaphore_trydown</emphasis>.</para> |
<code>waitq_sleep_timeout</code> corresponds to semaphore |
<code>down</code> operation, represented by the function |
<code>semaphore_down_timeout</code> and by way of similitude the wait |
queue operation waitq_wakeup corresponds to semaphore <code>up</code> |
operation, represented by the function <code>sempafore_up</code>. The |
conditional down operation is called |
<code>semaphore_trydown</code>.</para> |
</section> |
|
<section> |
203,9 → 198,9 |
view, they can be viewed as spinlocks without busy waiting. Their |
semaphore heritage provides good basics for both conditional operation |
and operation with timeout. The locking operation is called |
<emphasis>mutex_lock</emphasis>, the conditional locking operation is |
called <emphasis>mutex_trylock</emphasis> and the unlocking operation is |
called <emphasis>mutex_unlock</emphasis>.</para> |
<code>mutex_lock</code>, the conditional locking operation is called |
<code>mutex_trylock</code> and the unlocking operation is called |
<code>mutex_unlock</code>.</para> |
</section> |
|
<section> |
256,10 → 251,10 |
<para>The implementation of rwlocks as it has been already put, makes |
use of one single wait queue for both readers and writers, thus avoiding |
any possibility of starvation. In fact, rwlocks use a mutex rather than |
a bare wait queue. This mutex is called <emphasis>exclusive</emphasis> |
and is used to synchronize writers. The writer's lock operation, |
<emphasis>rwlock_write_lock_timeout</emphasis>, simply tries to acquire |
the exclusive mutex. If it succeeds, the writer is granted the rwlock. |
a bare wait queue. This mutex is called <code>exclusive</code> and is |
used to synchronize writers. The writer's lock operation, |
<code>rwlock_write_lock_timeout</code>, simply tries to acquire the |
exclusive mutex. If it succeeds, the writer is granted the rwlock. |
However, if the operation fails (e.g. times out), the writer must check |
for potential readers at the head of the list of sleeping threads |
associated with the mutex's wait queue and then proceed according to the |
267,8 → 262,8 |
|
<para>The exclusive mutex plays an important role in reader |
synchronization as well. However, a reader doing the reader's lock |
operation, <emphasis>rwlock_read_lock_timeout</emphasis>, may bypass |
this mutex when it detects that:</para> |
operation, <code>rwlock_read_lock_timeout</code>, may bypass this mutex |
when it detects that:</para> |
|
<orderedlist> |
<listitem> |
314,31 → 309,28 |
<function>condvar_signal</function>(<varname>cv</varname>); /* <remark>condvar_broadcast(cv);</remark> */ |
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting></para> |
|
<para>The wait operation, <emphasis>condvar_wait_timeout</emphasis>, |
always puts the calling thread to sleep. The thread then sleeps until |
another thread invokes <emphasis>condvar_broadcast</emphasis> on the |
same condition variable or until it is woken up by |
<emphasis>condvar_signal</emphasis>. The |
<emphasis>condvar_signal</emphasis> operation unblocks the first thread |
blocking on the condition variable while the |
<emphasis>condvar_broadcast</emphasis> operation unblocks all threads |
blocking there. If there are no blocking threads, these two operations |
have no efect.</para> |
<para>The wait operation, <code>condvar_wait_timeout</code>, always puts |
the calling thread to sleep. The thread then sleeps until another thread |
invokes <code>condvar_broadcast</code> on the same condition variable or |
until it is woken up by <code>condvar_signal</code>. The |
<code>condvar_signal</code> operation unblocks the first thread blocking |
on the condition variable while the <code>condvar_broadcast</code> |
operation unblocks all threads blocking there. If there are no blocking |
threads, these two operations have no efect.</para> |
|
<para>Note that the threads must synchronize over a dedicated mutex. To |
prevent race condition between <emphasis>condvar_wait_timeout</emphasis> |
and <emphasis>condvar_signal</emphasis> or |
<emphasis>condvar_broadcast</emphasis>, the mutex is passed to |
<emphasis>condvar_wait_timeout</emphasis> which then atomically puts the |
calling thread asleep and unlocks the mutex. When the thread eventually |
wakes up, <emphasis>condvar_wait</emphasis> regains the mutex and |
prevent race condition between <code>condvar_wait_timeout</code> and |
<code>condvar_signal</code> or <code>condvar_broadcast</code>, the mutex |
is passed to <code>condvar_wait_timeout</code> which then atomically |
puts the calling thread asleep and unlocks the mutex. When the thread |
eventually wakes up, <code>condvar_wait</code> regains the mutex and |
returns.</para> |
|
<para>Also note, that there is no conditional operation for condition |
variables. Such an operation would make no sence since condition |
variables are defined to forget events for which there is no waiting |
thread and because <emphasis>condvar_wait</emphasis> must always go to |
sleep. The operation with timeout is supported as usually.</para> |
thread and because <code>condvar_wait</code> must always go to sleep. |
The operation with timeout is supported as usually.</para> |
|
<para>In HelenOS, condition variables are based on wait queues. As it is |
already mentioned above, wait queues remember missed events while |
346,55 → 338,53 |
condition variables are designed for scenarios in which an event might |
occur very many times without being picked up by any waiting thread. On |
the other hand, wait queues would remember any event that had not been |
picked up by a call to <emphasis>waitq_sleep_timeout</emphasis>. |
Therefore, if wait queues were used directly and without any changes to |
implement condition variables, the missed_wakeup counter would hurt |
performance of the implementation: the <code>while</code> loop in |
<emphasis>condvar_wait_timeout</emphasis> would effectively do busy |
waiting until all missed wakeups were discarded.</para> |
picked up by a call to <code>waitq_sleep_timeout</code>. Therefore, if |
wait queues were used directly and without any changes to implement |
condition variables, the missed_wakeup counter would hurt performance of |
the implementation: the <code>while</code> loop in |
<code>condvar_wait_timeout</code> would effectively do busy waiting |
until all missed wakeups were discarded.</para> |
|
<para>The requirement on the wait operation to atomically put the caller |
to sleep and release the mutex poses an interesting problem on |
<emphasis>condvar_wait_timeout</emphasis>. More precisely, the thread |
should sleep in the condvar's wait queue prior to releasing the mutex, |
but it must not hold the mutex when it is sleeping.</para> |
<code>condvar_wait_timeout</code>. More precisely, the thread should |
sleep in the condvar's wait queue prior to releasing the mutex, but it |
must not hold the mutex when it is sleeping.</para> |
|
<para>Problems described in the two previous paragraphs are addressed in |
HelenOS by dividing the <emphasis>waitq_sleep_timeout</emphasis> |
function into three pieces:</para> |
HelenOS by dividing the <code>waitq_sleep_timeout</code> function into |
three pieces:</para> |
|
<orderedlist> |
<listitem> |
<para><emphasis>waitq_sleep_prepare</emphasis> prepares the thread |
to go to sleep by, among other things, locking the wait |
queue;</para> |
<para><code>waitq_sleep_prepare</code> prepares the thread to go to |
sleep by, among other things, locking the wait queue;</para> |
</listitem> |
|
<listitem> |
<para><emphasis>waitq_sleep_timeout_unsafe</emphasis> implements the |
core blocking logic;</para> |
<para><code>waitq_sleep_timeout_unsafe</code> implements the core |
blocking logic;</para> |
</listitem> |
|
<listitem> |
<para><emphasis>waitq_sleep_finish</emphasis> performs cleanup after |
<emphasis>waitq_sleep_timeout_unsafe</emphasis>; after this call, |
the wait queue spinlock is guaranteed to be unlocked by the |
caller</para> |
<para><code>waitq_sleep_finish</code> performs cleanup after |
<code>waitq_sleep_timeout_unsafe</code>; after this call, the wait |
queue spinlock is guaranteed to be unlocked by the caller</para> |
</listitem> |
</orderedlist> |
|
<para>The stock <emphasis>waitq_sleep_timeout</emphasis> is then a mere |
wrapper that calls these three functions. It is provided for convenience |
in cases where the caller doesn't require such a low level control. |
However, the implementation of <emphasis>condvar_wait_timeout</emphasis> |
does need this finer-grained control because it has to interleave calls |
to these functions by other actions. It carries its operations out in |
the following order:</para> |
<para>The stock <code>waitq_sleep_timeout</code> is then a mere wrapper |
that calls these three functions. It is provided for convenience in |
cases where the caller doesn't require such a low level control. |
However, the implementation of <code>condvar_wait_timeout</code> does |
need this finer-grained control because it has to interleave calls to |
these functions by other actions. It carries its operations out in the |
following order:</para> |
|
<orderedlist> |
<listitem> |
<para>calls <emphasis>waitq_sleep_prepare</emphasis> in order to |
lock the condition variable's wait queue,</para> |
<para>calls <code>waitq_sleep_prepare</code> in order to lock the |
condition variable's wait queue,</para> |
</listitem> |
|
<listitem> |
406,7 → 396,7 |
</listitem> |
|
<listitem> |
<para>calls <emphasis>waitq_sleep_timeout_unsafe</emphasis>,</para> |
<para>calls <code>waitq_sleep_timeout_unsafe</code>,</para> |
</listitem> |
|
<listitem> |
414,7 → 404,7 |
</listitem> |
|
<listitem> |
<para>calls <emphasis>waitq_sleep_finish</emphasis>.</para> |
<para>calls <code>waitq_sleep_finish</code>.</para> |
</listitem> |
</orderedlist> |
</section> |