<?xml version="1.0" encoding="UTF-8"?>
<chapter id="sync">
  <?dbhtml filename="sync.html"?>

  <title>Mutual exclusion and synchronization</title>

  <section>
    <title>Introduction</title>

    <para>The HelenOS operating system is designed to make use of the
    parallelism offered by the hardware and to exploit concurrency of both the
    kernel and userspace tasks. This is achieved through multiprocessor
    support and several levels of multiprogramming such as multitasking,
    multithreading and also through userspace pseudo threads. However, such a
    highly concurrent environment needs safe and efficient ways to handle
    mutual exclusion and synchronization of many execution flows.</para>
  </section>

  <section>
    <title>Active kernel primitives</title>

    <section>
      <title>Spinlocks</title>

      <para>The basic mutual exclusion primitive is the spinlock. The spinlock
      implements active waiting for the availability of a memory lock (i.e. a
      simple variable) in a multiprocessor-safe manner. This safety is
      achieved through the use of a specialized, architecture-dependent,
      atomic test-and-set operation which either locks the spinlock (i.e. sets
      the variable) or, provided that it is already locked, leaves it
      unaltered. In any case, the test-and-set operation returns a value, thus
      signalling either success (i.e. zero return value) or failure (i.e.
      non-zero value) in acquiring the lock. Note that this is the
      fundamental difference between the naive algorithm that doesn't use the
      atomic operation and the spinlock algorithm. While the naive algorithm
      is prone to race conditions on SMP configurations and thus is completely
      SMP-unsafe, the spinlock algorithm eliminates the possibility of race
      conditions and is suitable for mutual exclusion use.</para>

      <para>The semantics of the test-and-set operation is that the spinlock
      remains unavailable until this operation called on the respective
      spinlock returns zero. HelenOS builds two functions on top of the
      test-and-set operation. The first function is the unconditional attempt
      to acquire the spinlock and is called
      <emphasis>spinlock_lock</emphasis>. It simply loops until the
      test-and-set returns a zero value. The other function,
      <emphasis>spinlock_trylock</emphasis>, is the conditional lock operation
      and calls the test-and-set only once to find out whether it managed to
      acquire the spinlock or not. The conditional operation is useful in
      situations in which an algorithm cannot acquire more spinlocks in the
      proper order and a deadlock cannot be avoided. In such a case, the
      algorithm would detect the danger and instead of possibly deadlocking
      the system it would simply release some spinlocks it already holds and
      retry the whole operation with the hope that it will succeed next time.
      The unlock function, <emphasis>spinlock_unlock</emphasis>, is quite
      easy: it merely clears the spinlock variable.</para>
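
      <para>The following is a minimal sketch of how the three functions can
      be built on top of an atomic test-and-set operation. It is an
      illustration only, not the actual HelenOS code; the GCC atomic builtins
      stand in for the architecture-specific test-and-set
      instruction:</para>

      <programlisting><![CDATA[typedef struct {
        volatile unsigned int locked;   /* 0 = free, 1 = held */
} spinlock_t;

/* Unconditional lock: loop until the test-and-set returns zero. */
void spinlock_lock(spinlock_t *sl)
{
        /* __sync_lock_test_and_set() atomically stores 1 and returns the
         * previous value; a zero return value means the lock was free. */
        while (__sync_lock_test_and_set(&sl->locked, 1) != 0)
                ;       /* active waiting */
}

/* Conditional lock: a single attempt; returns non-zero on success. */
int spinlock_trylock(spinlock_t *sl)
{
        return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
}

/* Unlock: merely clear the spinlock variable. */
void spinlock_unlock(spinlock_t *sl)
{
        __sync_lock_release(&sl->locked);
}]]></programlisting>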

      <para>Nevertheless, there is a special issue related to hardware
      optimizations that modern processors implement. Particularly problematic
      is the out-of-order execution of instructions within the critical
      section protected by a spinlock. The processors are always
      self-consistent so that they can carry out speculatively executed
      instructions in the right order with regard to dependencies among those
      instructions. However, the dependency between instructions inside the
      critical section and those that implement locking and unlocking of the
      respective spinlock is not implicit on some processor architectures. As
      a result, the processor needs to be explicitly told about each
      occurrence of such a dependency. Therefore, HelenOS adds
      architecture-specific hooks to all <emphasis>spinlock_lock</emphasis>,
      <emphasis>spinlock_trylock</emphasis> and
      <emphasis>spinlock_unlock</emphasis> functions to prevent the
      instructions inside the critical section from permeating out. On some
      architectures, these hooks can be void because the dependencies are
      implicitly there because of the special properties of locking and
      unlocking instructions. However, other architectures need to instrument
      these hooks with different memory barriers, depending on what operations
      could permeate out.</para>
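
      <para>A sketch of such hooks is shown below; the macro names are
      illustrative. On a weakly ordered architecture the hooks expand to real
      memory barriers, approximated here by the GCC full barrier builtin,
      while on a strongly ordered architecture they can expand to nothing at
      all:</para>

      <programlisting><![CDATA[#ifdef WEAKLY_ORDERED_ARCHITECTURE
#define CS_ENTER_BARRIER()      __sync_synchronize()
#define CS_LEAVE_BARRIER()      __sync_synchronize()
#else
#define CS_ENTER_BARRIER()      /* ordering implied by the lock itself */
#define CS_LEAVE_BARRIER()
#endif

/* spinlock_lock() and spinlock_trylock() end with CS_ENTER_BARRIER()
 * so that no instruction from the critical section is executed before
 * the lock is held; spinlock_unlock() starts with CS_LEAVE_BARRIER()
 * so that no such instruction is delayed past the release. */]]></programlisting>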

      <para>Spinlocks have one significant drawback: when held for longer time
      periods, they harm both parallelism and concurrency. The processor
      executing <emphasis>spinlock_lock</emphasis> does not do any fruitful
      work and is effectively halted until it can grab the lock and proceed.
      Similarly, other execution flows cannot execute on the processor that
      holds the spinlock because the kernel disables preemption on that
      processor when a spinlock is held. The reason behind disabling
      preemption is avoidance of the priority inversion problem. For the same
      reason, threads are strongly discouraged from sleeping when they hold a
      spinlock.</para>

      <para>To summarize, spinlocks represent a very simple and essential
      mutual exclusion primitive for SMP systems. On the other hand, spinlocks
      scale poorly because of the active loop they are based on. Therefore,
      spinlocks are used in HelenOS only for short-time mutual exclusion and
      in cases where the mutual exclusion is required out of thread context.
      Lastly, spinlocks are used in the construction of passive
      synchronization primitives.</para>
    </section>
  </section>

  <section>
    <title>Passive kernel synchronization</title>

    <section>
      <title>Wait queues</title>

      <para>A wait queue is the basic passive synchronization primitive on
      which all other passive synchronization primitives are built. Simply
      put, it allows a thread to sleep until an event associated with the
      particular wait queue occurs. Multiple threads are notified about
      incoming events in a first come, first served fashion. Moreover, should
      the event come before any thread waits for it, it is recorded in the
      wait queue as a missed wakeup and later forwarded to the first thread
      that decides to wait in the queue. The inner structures of the wait
      queue are protected by a spinlock.</para>
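
      <para>Conceptually, a wait queue can therefore be pictured as the
      following structure. This is a simplified sketch rather than the exact
      HelenOS definition, and the field names are only illustrative:</para>

      <programlisting><![CDATA[typedef struct {
        spinlock_t lock;        /* protects the inner structures */
        int missed_wakeups;     /* events that arrived with no waiter */
        list_t sleepers;        /* FIFO list of sleeping threads */
} waitq_t;]]></programlisting>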

      <para>The thread that wants to wait for a wait queue event uses the
      <emphasis>waitq_sleep_timeout</emphasis> function. The algorithm then
      checks the wait queue's counter of missed wakeups and if there are any
      missed wakeups, the call returns immediately. The call also returns
      immediately if only a conditional wait was requested. Otherwise the
      thread is enqueued in the wait queue's list of sleeping threads and its
      state is changed to <emphasis>Sleeping</emphasis>. It then sleeps until
      one of the following events happens:</para>

      <orderedlist>
        <listitem>
          <para>another thread calls <emphasis>waitq_wakeup</emphasis> and the
          thread is the first thread in the wait queue's list of sleeping
          threads;</para>
        </listitem>

        <listitem>
          <para>another thread calls
          <emphasis>waitq_interrupt_sleep</emphasis> on the sleeping
          thread;</para>
        </listitem>

        <listitem>
          <para>the sleep times out provided that none of the previous
          events occurred within a specified time limit; the limit can be
          infinity.</para>
        </listitem>
      </orderedlist>
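
      <para>A condensed sketch of this algorithm follows. It is an
      illustration rather than the actual HelenOS code: the return value
      names are made up for the example, and
      <emphasis>enqueue_and_sleep</emphasis> is a placeholder for the
      enqueueing, the timer setup and the context switch:</para>

      <programlisting><![CDATA[typedef enum {
        OK_ATOMIC,      /* immediate success: a missed wakeup was consumed */
        WOULD_BLOCK,    /* immediate failure: conditional wait refused */
        OK_BLOCKED,     /* woken up by waitq_wakeup() */
        INTERRUPTED,    /* woken up by waitq_interrupt_sleep() */
        TIMEOUT         /* the time limit expired */
} sleep_result_t;

static sleep_result_t enqueue_and_sleep(waitq_t *wq, uint32_t usec);

sleep_result_t waitq_sleep_timeout(waitq_t *wq, uint32_t usec,
    bool nonblocking)
{
        spinlock_lock(&wq->lock);

        /* The event has already occurred: consume a missed wakeup. */
        if (wq->missed_wakeups > 0) {
                wq->missed_wakeups--;
                spinlock_unlock(&wq->lock);
                return OK_ATOMIC;
        }

        /* Only a conditional wait was requested: fail immediately. */
        if (nonblocking) {
                spinlock_unlock(&wq->lock);
                return WOULD_BLOCK;
        }

        /* Enqueue the thread, change its state to Sleeping and switch
         * away; the sleep ends with a wakeup, an interruption or a
         * timeout, and the spinlock is released along the way. */
        return enqueue_and_sleep(wq, usec);
}]]></programlisting>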

      <para>All five possibilities (immediate return on success, immediate
      return on failure, wakeup after sleep, interruption and timeout) are
      distinguishable by the return value of
      <emphasis>waitq_sleep_timeout</emphasis>. Being able to interrupt a
      sleeping thread is essential for externally initiated thread
      termination. The ability to wait only for a certain amount of time is
      used, for instance, to passively delay thread execution by several
      microseconds or even seconds in the <emphasis>thread_sleep</emphasis>
      function. Due to the fact that all other passive kernel synchronization
      primitives are based on wait queues, they also have the option of being
      interrupted and, more importantly, can time out. All of them also
      implement the conditional operation. Furthermore, this very fundamental
      interface reaches up to the implementation of futexes, a userspace
      synchronization primitive, which makes it possible for a userspace
      thread to request a synchronization operation with a timeout or a
      conditional operation.</para>
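
      <para>For illustration, these modes of operation map onto the interface
      sketched above as follows. A zero time limit is taken here to mean
      infinity, which is an assumption of the sketch, and
      <emphasis>do_something_else</emphasis> is a hypothetical
      placeholder:</para>

      <programlisting><![CDATA[/* Wait for the event with no time limit. */
(void) waitq_sleep_timeout(&wq, 0, false);

/* Wait, but give up after 500 milliseconds. */
if (waitq_sleep_timeout(&wq, 500000, false) == TIMEOUT)
        do_something_else();

/* Conditional operation: poll and continue immediately either way. */
if (waitq_sleep_timeout(&wq, 0, true) == WOULD_BLOCK)
        do_something_else();]]></programlisting>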

      <para>From the description above, it should be apparent that when a
      sleeping thread is woken by <emphasis>waitq_wakeup</emphasis> or when
      <emphasis>waitq_sleep_timeout</emphasis> succeeds immediately, the
      thread can be sure that the event has occurred. The thread need not and
      should not verify this fact. This approach is called direct hand-off and
      is characteristic of all passive HelenOS synchronization primitives,
      with the exception described below.</para>
    </section>

    <section>
      <title>Semaphores</title>

      <para>The interesting point about wait queues is that the number of
      missed wakeups is equal to the number of threads that will not block in
      <emphasis>waitq_sleep_timeout</emphasis> and would immediately succeed
      instead. On the other hand, semaphores are synchronization primitives
      that will let a predefined number of threads into their critical
      section and block any other threads above this count. However, these
      two concepts are exactly the same. Semaphores in HelenOS are therefore
      implemented as wait queues with a single semantic change: their wait
      queue is initialized to have as many missed wakeups as the number of
      threads that the semaphore intends to let into its critical section
      simultaneously.</para>

      <para>In the semaphore language, the wait queue operation
      <emphasis>waitq_sleep_timeout</emphasis> corresponds to the semaphore
      <emphasis>down</emphasis> operation, represented by the function
      <emphasis>semaphore_down_timeout</emphasis>, and similarly the wait
      queue operation <emphasis>waitq_wakeup</emphasis> corresponds to the
      semaphore <emphasis>up</emphasis> operation, represented by the
      function <emphasis>semaphore_up</emphasis>. The conditional down
      operation is called <emphasis>semaphore_trydown</emphasis>.</para>
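
      <para>Given the wait queue interface sketched earlier, plus an assumed
      <emphasis>waitq_initialize</emphasis> and
      <emphasis>waitq_wakeup</emphasis> pair, the implementation can be as
      thin as the following illustration suggests (not the exact HelenOS
      code):</para>

      <programlisting><![CDATA[typedef struct {
        waitq_t wq;
} semaphore_t;

void semaphore_initialize(semaphore_t *s, int count)
{
        /* Pre-recorded missed wakeups let the first 'count' threads in. */
        waitq_initialize(&s->wq);
        s->wq.missed_wakeups = count;
}

sleep_result_t semaphore_down_timeout(semaphore_t *s, uint32_t usec)
{
        return waitq_sleep_timeout(&s->wq, usec, false);
}

sleep_result_t semaphore_trydown(semaphore_t *s)
{
        return waitq_sleep_timeout(&s->wq, 0, true);
}

void semaphore_up(semaphore_t *s)
{
        waitq_wakeup(&s->wq);
}]]></programlisting>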
    </section>

    <section>
      <title>Mutexes</title>

      <para>Mutexes are sometimes referred to as binary semaphores. That means
      that mutexes are like semaphores that allow only one thread in their
      critical section. Indeed, mutexes in HelenOS are implemented exactly in
      this way: they are built on top of semaphores. From another point of
      view, they can be viewed as spinlocks without busy waiting. Their
      semaphore heritage provides a good basis for both the conditional
      operation and the operation with a timeout. The locking operation is
      called <emphasis>mutex_lock</emphasis>, the conditional locking
      operation is called <emphasis>mutex_trylock</emphasis> and the unlocking
      operation is called <emphasis>mutex_unlock</emphasis>.</para>
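
      <para>Given the semaphore sketch above, the mutex operations reduce to
      trivial wrappers around a semaphore with a count of one (again an
      illustrative sketch, not the actual HelenOS code):</para>

      <programlisting><![CDATA[typedef struct {
        semaphore_t sem;
} mutex_t;

void mutex_initialize(mutex_t *m)
{
        /* A binary semaphore: exactly one thread may hold the mutex. */
        semaphore_initialize(&m->sem, 1);
}

void mutex_lock(mutex_t *m)
{
        /* A zero timeout is taken here to mean no time limit. */
        (void) semaphore_down_timeout(&m->sem, 0);
}

sleep_result_t mutex_trylock(mutex_t *m)
{
        return semaphore_trydown(&m->sem);
}

void mutex_unlock(mutex_t *m)
{
        semaphore_up(&m->sem);
}]]></programlisting>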
    </section>

    <section>
      <title>Reader/writer locks</title>

      <para>Reader/writer locks, or rwlocks, are by far the most complicated
      synchronization primitive within the kernel. The goal of these locks is
      to improve the concurrency of applications in which threads need to
      synchronize their access to a shared resource and in which that access
      can be partitioned into a read-only mode and a write mode. Reader/writer
      locks should make it possible for several, possibly many, readers to
      enter, or be present in, the critical section at the same time, provided
      that no writer is currently in the critical section. Writers are
      allowed to enter the critical section only individually, provided that
      no reader is in the critical section already. Applications in which the
      majority of operations can be done in the read-only mode can benefit
      from the increased concurrency created by reader/writer locks.</para>

      <para>During reader/writer lock construction, a decision should be made
      whether readers will be preferred over writers or whether writers will
      be preferred over readers in cases when the lock is not currently held
      and both a reader and a writer want to gain the lock. Some operating
      systems prefer one group over the other, thus creating the possibility
      of starving the unpreferred group. In the HelenOS operating system,
      neither of the two groups is preferred. The lock is granted on a first
      come, first served basis with the additional note that readers are
      granted the lock in the largest possible batches.</para>

      <para>With this policy and the timeout modes of operation, the direct
      hand-off becomes much more complicated. For instance, a writer leaving
      the critical section must wake up all leading readers in the rwlock's
      wait queue or one leading writer or no-one if no thread is waiting.
      Similarly, the last reader leaving the critical section must wake up
      the sleeping writer if there are any sleeping threads left at all. As
      another example, if a writer at the beginning of the rwlock's wait
      queue times out and the lock is held by at least one reader, the writer
      which has timed out must first wake up all readers that follow it in
      the queue prior to signalling the timeout itself and giving up.</para>
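
      <para>To illustrate the first of these rules, a heavily simplified
      sketch of the hand-off performed by a leaving writer follows. The
      thread fields and list helpers used here are assumptions made for the
      illustration; the point is merely that the leading batch of sleepers is
      examined and, if it consists of readers, woken as a whole:</para>

      <programlisting><![CDATA[/* Assumed helpers: list_empty(), list_head() and list_next() walk the
 * wait queue's list of sleeping threads; t->rwlock_access records
 * whether a sleeping thread t demands read or write access. Called
 * with the wait queue's spinlock held. */
static void rwlock_let_others_in(rwlock_t *rwl)
{
        waitq_t *wq = &rwl->exclusive.sem.wq;

        if (list_empty(&wq->sleepers))
                return;                         /* no-one is waiting */

        if (list_head(&wq->sleepers)->rwlock_access == RWLOCK_WRITER) {
                waitq_wakeup(wq);               /* hand off to one writer */
                return;
        }

        /* Wake all leading readers in one batch. */
        thread_t *t = list_head(&wq->sleepers);
        while (t != NULL && t->rwlock_access == RWLOCK_READER) {
                thread_t *next = list_next(&wq->sleepers, t);
                rwl->readers_in++;
                waitq_wakeup(wq);               /* wakes the head, i.e. t */
                t = next;
        }
}]]></programlisting>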

      <para>Due to the issues mentioned in the previous paragraph, the
      reader/writer lock implementation needs to walk the rwlock wait queue's
      list of sleeping threads directly in order to find out the type of
      access that the queueing threads demand. This makes the code difficult
      to understand and dependent on the internal implementation of the wait
      queue. Nevertheless, it remains unclear to the authors of HelenOS
      whether a simpler but equivalently fair solution exists.</para>

      <para>The implementation of rwlocks, as already outlined, makes use of
      one single wait queue for both readers and writers, thus avoiding any
      possibility of starvation. In fact, rwlocks use a mutex rather than a
      bare wait queue. This mutex is called <emphasis>exclusive</emphasis>
      and is used to synchronize writers. The writer's lock operation,
      <emphasis>rwlock_write_lock_timeout</emphasis>, simply tries to acquire
      the exclusive mutex. If it succeeds, the writer is granted the rwlock.
      However, if the operation fails (e.g. times out), the writer must check
      for potential readers at the head of the list of sleeping threads
      associated with the mutex's wait queue and then proceed according to
      the procedure outlined above.</para>

      <para>The exclusive mutex plays an important role in reader
      synchronization as well. However, a reader doing the reader's lock
      operation, <emphasis>rwlock_read_lock_timeout</emphasis>, may bypass
      this mutex when it detects that:</para>

      <orderedlist>
        <listitem>
          <para>there are other readers in the critical section and</para>
        </listitem>

        <listitem>
          <para>there are no sleeping threads waiting for the exclusive
          mutex.</para>
        </listitem>
      </orderedlist>

      <para>If both conditions are true, the reader will bypass the mutex,
      increment the number of readers in the critical section and then enter
      the critical section. Note that if there are any sleeping threads at the
      beginning of the wait queue, the first must be a writer. If the
      conditions are not fulfilled, the reader normally waits until the
      exclusive mutex is granted to it.</para>
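
      <para>A condensed sketch of this fast path follows. As before, the
      field names and the <emphasis>mutex_lock_timeout</emphasis> helper are
      assumptions made for the illustration; the wait queue's spinlock
      brackets the check so that both conditions are evaluated
      atomically:</para>

      <programlisting><![CDATA[sleep_result_t rwlock_read_lock_timeout(rwlock_t *rwl, uint32_t usec)
{
        waitq_t *wq = &rwl->exclusive.sem.wq;

        spinlock_lock(&wq->lock);
        if (rwl->readers_in > 0 && list_empty(&wq->sleepers)) {
                /* Bypass the exclusive mutex and join the readers that
                 * are already in the critical section. */
                rwl->readers_in++;
                spinlock_unlock(&wq->lock);
                return OK_ATOMIC;
        }
        spinlock_unlock(&wq->lock);

        /* Otherwise wait normally until the exclusive mutex is granted. */
        return mutex_lock_timeout(&rwl->exclusive, usec);
}]]></programlisting>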
    </section>

    <section>
      <title>Condition variables</title>

      <para>Condvars explanation</para>
    </section>
  </section>

  <section>
    <title>Userspace synchronization</title>

    <section>
      <title>Futexes</title>

      <para></para>
    </section>
  </section>
</chapter>