<?xml version="1.0" encoding="UTF-8"?>
<chapter id="sync">
  <?dbhtml filename="sync.html"?>

  <title>Mutual exclusion and synchronization</title>

  <section>
    <title>Introduction</title>

    <para>The HelenOS operating system is designed to make use of the
    parallelism offered by the hardware and to exploit concurrency of both the
    kernel and userspace tasks. This is achieved through multiprocessor
    support and several levels of multiprogramming such as multitasking and
    multithreading, and also through userspace pseudo threads. However, such a
    highly concurrent environment needs safe and efficient ways to handle
    mutual exclusion and synchronization of many execution flows.</para>
  </section>

  <section>
    <title>Active kernel primitives</title>

    <section>
      <indexterm>
        <primary>synchronization</primary>

        <secondary>- spinlock</secondary>
      </indexterm>

      <title>Spinlocks</title>

      <para>The basic mutual exclusion primitive is the spinlock. The spinlock
      implements active waiting for the availability of a memory lock (i.e. a
      simple variable) in a multiprocessor-safe manner. This safety is
      achieved through the use of a specialized, architecture-dependent,
      atomic test-and-set operation which either locks the spinlock (i.e. sets
      the variable) or, provided that it is already locked, leaves it
      unaltered. In either case, the test-and-set operation returns a value,
      thus signalling either success (i.e. zero return value) or failure (i.e.
      non-zero value) in acquiring the lock. Note that this makes a
      fundamental difference between the naive algorithm that doesn't use the
      atomic operation and the spinlock algorithm. While the naive algorithm
      is prone to race conditions on SMP configurations and thus is completely
      SMP-unsafe, the spinlock algorithm eliminates the possibility of race
      conditions and is suitable for mutual exclusion use.</para>

      <para>The semantics of the test-and-set operation is that the spinlock
      remains unavailable until this operation called on the respective
      spinlock returns zero. HelenOS builds two functions on top of the
      test-and-set operation. The first function is the unconditional attempt
      to acquire the spinlock and is called <code>spinlock_lock</code>. It
      simply loops until the test-and-set returns a zero value. The other
      function, <code>spinlock_trylock</code>, is the conditional lock
      operation and calls the test-and-set only once to find out whether it
      managed to acquire the spinlock or not. The conditional operation is
      useful in situations in which an algorithm cannot acquire more spinlocks
      in the proper order and a deadlock cannot be avoided. In such a case,
      the algorithm would detect the danger and, instead of possibly
      deadlocking the system, it would simply release some spinlocks it
      already holds and retry the whole operation with the hope that it will
      succeed next time. The unlock function, <code>spinlock_unlock</code>,
      is quite simple: it merely clears the spinlock variable.</para>
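
      <para>The conditional operation makes the back-off scheme described
      above easy to express. The following fragment is only an illustrative
      sketch: the spinlocks <code>A</code> and <code>B</code> are hypothetical
      and it is assumed that <code>spinlock_trylock</code> reports success
      with a non-zero return value.</para>

      <example>
        <title>Avoiding deadlock with <code>spinlock_trylock</code>.</title>

        <programlisting language="C">spinlock_lock(&amp;A);
while (!spinlock_trylock(&amp;B)) {
        /* B is busy; release A to avoid deadlock and start over. */
        spinlock_unlock(&amp;A);
        spinlock_lock(&amp;A);
}
/* Both A and B are held at this point. */</programlisting>
      </example>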

      <para>Nevertheless, there is a special issue related to hardware
      optimizations that modern processors implement. Particularly problematic
      is the out-of-order execution of instructions within the critical
      section protected by a spinlock. The processors are always
      self-consistent so that they can carry out speculatively executed
      instructions in the right order with regard to dependencies among those
      instructions. However, the dependency between instructions inside the
      critical section and those that implement locking and unlocking of the
      respective spinlock is not implicit on some processor architectures. As
      a result, the processor needs to be explicitly told about each
      occurrence of such a dependency. Therefore, HelenOS adds
      architecture-specific hooks to all <code>spinlock_lock</code>,
      <code>spinlock_trylock</code> and <code>spinlock_unlock</code> functions
      to prevent the instructions inside the critical section from permeating
      out. On some architectures, these hooks can be void because the
      dependencies are implicitly there because of the special properties of
      locking and unlocking instructions. However, other architectures need to
      instrument these hooks with different memory barriers, depending on what
      operations could permeate out.</para>
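
      <para>The following sketch shows where these hooks sit with respect to
      the test-and-set loop. The hook names (<code>CS_ENTER_BARRIER</code>,
      <code>CS_LEAVE_BARRIER</code>) and the helpers
      <code>test_and_set</code> and <code>atomic_set</code> are used here for
      illustration only; preemption handling is omitted.</para>

      <example>
        <title>Barrier hooks in the spinlock operations.</title>

        <programlisting language="C">void spinlock_lock(spinlock_t *sl)
{
        /* Spin until the atomic test-and-set reports success (zero). */
        while (test_and_set(&amp;sl->val))
                ;

        /*
         * Hook: keep loads and stores from the critical section from being
         * reordered before the lock is actually held.
         */
        CS_ENTER_BARRIER();
}

void spinlock_unlock(spinlock_t *sl)
{
        /*
         * Hook: make everything done inside the critical section visible
         * before the lock is released.
         */
        CS_LEAVE_BARRIER();

        atomic_set(&amp;sl->val, 0);
}</programlisting>
      </example>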

      <para>Spinlocks have one significant drawback: when held for longer time
      periods, they harm both parallelism and concurrency. The processor
      executing <code>spinlock_lock</code> does not do any fruitful work and
      is effectively halted until it can grab the lock and proceed.
      Similarly, other execution flows cannot execute on the processor that
      holds the spinlock because the kernel disables preemption on that
      processor when a spinlock is held. The reason behind disabling
      preemption is to avoid the priority inversion problem. For the same
      reason, threads are strongly discouraged from sleeping when they hold a
      spinlock.</para>

      <para>To summarize, spinlocks represent a very simple and essential
      mutual exclusion primitive for SMP systems. On the other hand, spinlocks
      scale poorly because of the active loop they are based on. Therefore,
      spinlocks are used in HelenOS only for short-time mutual exclusion and
      in cases where the mutual exclusion is required out of thread context.
      Lastly, spinlocks are used in the construction of passive
      synchronization primitives.</para>
    </section>
  </section>

  <section>
    <title>Passive kernel synchronization</title>

    <section>
      <indexterm>
        <primary>synchronization</primary>

        <secondary>- wait queues</secondary>
      </indexterm>

      <title>Wait queues</title>

      <para>A wait queue is the basic passive synchronization primitive on
      which all other passive synchronization primitives are built. Simply
      put, it allows a thread to sleep until an event associated with the
      particular wait queue occurs. Multiple threads are notified about
      incoming events in a first come, first served fashion. Moreover, should
      the event come before any thread waits for it, it is recorded in the
      wait queue as a missed wakeup and later forwarded to the first thread
      that decides to wait in the queue. The inner structures of the wait
      queue are protected by a spinlock.</para>

      <para>The thread that wants to wait for a wait queue event uses the
      <code>waitq_sleep_timeout</code> function. The algorithm then checks the
      wait queue's counter of missed wakeups and if there are any missed
      wakeups, the call returns immediately. The call also returns immediately
      if only a conditional wait was requested. Otherwise the thread is
      enqueued in the wait queue's list of sleeping threads and its state is
      changed to <constant>Sleeping</constant>. It then sleeps until one of
      the following events happens:</para>

      <orderedlist>
        <listitem>
          <para>another thread calls <code>waitq_wakeup</code> and the thread
          is the first thread in the wait queue's list of sleeping
          threads;</para>
        </listitem>

        <listitem>
          <para>another thread calls <code>waitq_interrupt_sleep</code> on the
          sleeping thread;</para>
        </listitem>

        <listitem>
          <para>the sleep times out provided that none of the previous events
          occurred within a specified time limit; the limit can be
          infinity.</para>
        </listitem>
      </orderedlist>

      <para>All five possibilities (immediate return on success, immediate
      return on failure, wakeup after sleep, interruption and timeout) are
      distinguishable by the return value of <code>waitq_sleep_timeout</code>.
      Being able to interrupt a sleeping thread is essential for externally
      initiated thread termination. The ability to wait only for a certain
      amount of time is used, for instance, to passively delay thread
      execution by several microseconds or even seconds in the
      <code>thread_sleep</code> function. Because all other passive kernel
      synchronization primitives are based on wait queues, they also have the
      option of being interrupted and, more importantly, can time out. All of
      them also implement the conditional operation. Furthermore, this very
      fundamental interface reaches up to the implementation of futexes, a
      userspace synchronization primitive, which makes it possible for a
      userspace thread to request a synchronization operation with a timeout
      or a conditional operation.</para>
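
      <para>A sketch of how a caller might distinguish these outcomes is shown
      below. The exact signature of <code>waitq_sleep_timeout</code>, the flag
      name and the names of the return codes are assumptions made for the sake
      of the example; only the five outcomes themselves are given by the
      interface described above.</para>

      <example>
        <title>Consuming the return value of <code>waitq_sleep_timeout</code>.</title>

        <programlisting language="C">/* Wait for an event on wq for at most 1000 microseconds. */
int rc = waitq_sleep_timeout(&amp;wq, 1000, SYNCH_FLAGS_NONE);

switch (rc) {
case ESYNCH_OK_ATOMIC:   /* a missed wakeup was consumed, no sleep at all */
case ESYNCH_OK_BLOCKED:  /* the thread slept and was woken by waitq_wakeup */
        /* Direct hand-off: the event has occurred, no need to recheck. */
        break;
case ESYNCH_WOULD_BLOCK: /* conditional wait requested, no missed wakeup */
case ESYNCH_TIMEOUT:     /* the time limit expired first */
case ESYNCH_INTERRUPTED: /* waitq_interrupt_sleep was called on the thread */
        /* The event has not occurred. */
        break;
}</programlisting>
      </example>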

      <para>From the description above, it should be apparent that when a
      sleeping thread is woken by <code>waitq_wakeup</code> or when
      <code>waitq_sleep_timeout</code> succeeds immediately, the thread can be
      sure that the event has occurred. The thread need not and should not
      verify this fact. This approach is called direct hand-off and is
      characteristic of all passive HelenOS synchronization primitives, with
      one exception described below.</para>
    </section>

    <section>
      <indexterm>
        <primary>synchronization</primary>

        <secondary>- semaphores</secondary>
      </indexterm>

      <title>Semaphores</title>

      <para>The interesting point about wait queues is that the number of
      missed wakeups is equal to the number of threads that will not block in
      <code>waitq_sleep_timeout</code> and will immediately succeed instead.
      On the other hand, semaphores are synchronization primitives that will
      let a predefined number of threads into their critical section and block
      any other threads above this count. However, these two cases are exactly
      the same. Semaphores in HelenOS are therefore implemented as wait queues
      with a single semantic change: their wait queue is initialized to have
      as many missed wakeups as the number of threads that the semaphore
      intends to let into its critical section simultaneously.</para>

      <para>In the semaphore language, the wait queue operation
      <code>waitq_sleep_timeout</code> corresponds to the semaphore
      <code>down</code> operation, represented by the function
      <code>semaphore_down_timeout</code>, and analogously the wait queue
      operation <code>waitq_wakeup</code> corresponds to the semaphore
      <code>up</code> operation, represented by the function
      <code>semaphore_up</code>. The conditional down operation is called
      <code>semaphore_trydown</code>.</para>
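
      <para>The mapping can be sketched in a few lines of code. The structure
      layout and the helper names <code>waitq_initialize</code>,
      <code>waitq_count_set</code>, <code>SYNCH_FLAGS_NONE</code> and
      <code>WAKEUP_FIRST</code> are assumptions made for this illustration;
      the <code>semaphore_*</code> and <code>waitq_*</code> operations are the
      ones discussed in the text.</para>

      <example>
        <title>A semaphore as a wait queue with pre-recorded wakeups.</title>

        <programlisting language="C">typedef struct {
        waitq_t wq;
} semaphore_t;

void semaphore_initialize(semaphore_t *s, int val)
{
        waitq_initialize(&amp;s->wq);
        /* Pretend val wakeups were already missed: val threads won't block. */
        waitq_count_set(&amp;s->wq, val);
}

int semaphore_down_timeout(semaphore_t *s, uint32_t usec)
{
        return waitq_sleep_timeout(&amp;s->wq, usec, SYNCH_FLAGS_NONE);
}

void semaphore_up(semaphore_t *s)
{
        waitq_wakeup(&amp;s->wq, WAKEUP_FIRST);
}</programlisting>
      </example>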
    </section>

    <section>
      <title>Mutexes</title>

      <indexterm>
        <primary>synchronization</primary>

        <secondary>- mutex</secondary>
      </indexterm>

      <para>Mutexes are sometimes referred to as binary semaphores. That means
      that mutexes are like semaphores that allow only one thread in their
      critical section. Indeed, mutexes in HelenOS are implemented exactly in
      this way: they are built on top of semaphores. From another point of
      view, they can be viewed as spinlocks without busy waiting. Their
      semaphore heritage provides a good basis for both the conditional
      operation and the operation with a timeout. The locking operation is
      called <code>mutex_lock</code>, the conditional locking operation is
      called <code>mutex_trylock</code> and the unlocking operation is called
      <code>mutex_unlock</code>.</para>
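
      <para>A minimal sketch of this construction follows. The function
      <code>mutex_initialize</code> and the plain <code>semaphore_down</code>
      variant without a timeout are assumptions made for the
      illustration.</para>

      <example>
        <title>A mutex as a binary semaphore.</title>

        <programlisting language="C">typedef struct {
        semaphore_t sem;
} mutex_t;

void mutex_initialize(mutex_t *mtx)
{
        /* Exactly one thread may be inside the critical section. */
        semaphore_initialize(&amp;mtx->sem, 1);
}

void mutex_lock(mutex_t *mtx)
{
        semaphore_down(&amp;mtx->sem);
}

int mutex_trylock(mutex_t *mtx)
{
        return semaphore_trydown(&amp;mtx->sem);
}

void mutex_unlock(mutex_t *mtx)
{
        semaphore_up(&amp;mtx->sem);
}</programlisting>
      </example>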
    </section>

    <section>
      <title>Reader/writer locks</title>

      <indexterm>
        <primary>synchronization</primary>

        <secondary>- read/write lock</secondary>
      </indexterm>

      <para>Reader/writer locks, or rwlocks, are by far the most complicated
      synchronization primitive within the kernel. The goal of these locks is
      to improve concurrency of applications in which threads need to
      synchronize access to a shared resource and that access can be
      partitioned into a read-only mode and a write mode. Reader/writer locks
      should make it possible for several, possibly many, readers to enter and
      share the critical section at the same time, provided that no writer is
      currently in the critical section. Writers are allowed to enter the
      critical section only individually, provided that no reader is in the
      critical section already. Applications in which the majority of
      operations can be done in the read-only mode can benefit from the
      increased concurrency created by reader/writer locks.</para>

      <para>During reader/writer lock construction, a decision should be made
      whether readers will be preferred over writers or whether writers will
      be preferred over readers in cases when the lock is not currently held
      and both a reader and a writer want to gain the lock. Some operating
      systems prefer one group over the other, thus creating a possibility for
      starving the unpreferred group. In the HelenOS operating system, neither
      of the two groups is preferred. The lock is granted on a first come,
      first served basis with the additional note that readers are granted the
      lock in the biggest possible batch.</para>

      <para>With this policy and the timeout modes of operation, the direct
      hand-off becomes much more complicated. For instance, a writer leaving
      the critical section must wake up all leading readers in the rwlock's
      wait queue or one leading writer or no one if no thread is waiting.
      Similarly, the last reader leaving the critical section must wake up the
      sleeping writer if there are any sleeping threads left at all. As
      another example, if a writer at the beginning of the rwlock's wait queue
      times out and the lock is held by at least one reader, the writer which
      has timed out must first wake up all readers that follow it in the
      queue prior to signalling the timeout itself and giving up.</para>

      <para>Due to the issues mentioned in the previous paragraph, the
      reader/writer lock implementation needs to walk the rwlock wait queue's
      list of sleeping threads directly in order to find out the type of
      access that the queueing threads demand. This makes the code difficult
      to understand and dependent on the internal implementation of the wait
      queue. Nevertheless, it remains unclear to the authors of HelenOS
      whether a simpler but equivalently fair solution exists.</para>

      <para>As already mentioned, the implementation of rwlocks makes use of a
      single wait queue for both readers and writers, thus avoiding any
      possibility of starvation. In fact, rwlocks use a mutex rather than a
      bare wait queue. This mutex is called <code>exclusive</code> and is
      used to synchronize writers. The writer's lock operation,
      <code>rwlock_write_lock_timeout</code>, simply tries to acquire the
      exclusive mutex. If it succeeds, the writer is granted the rwlock.
      However, if the operation fails (e.g. times out), the writer must check
      for potential readers at the head of the list of sleeping threads
      associated with the mutex's wait queue and then proceed according to the
      procedure outlined above.</para>

      <para>The exclusive mutex plays an important role in reader
      synchronization as well. However, a reader doing the reader's lock
      operation, <code>rwlock_read_lock_timeout</code>, may bypass this mutex
      when it detects that:</para>

      <orderedlist>
        <listitem>
          <para>there are other readers in the critical section and</para>
        </listitem>

        <listitem>
          <para>there are no sleeping threads waiting for the exclusive
          mutex.</para>
        </listitem>
      </orderedlist>

      <para>If both conditions are true, the reader will bypass the mutex,
      increment the number of readers in the critical section and then enter
      the critical section. Note that if there are any sleeping threads at the
      beginning of the wait queue, the first must be a writer. If the
      conditions are not fulfilled, the reader normally waits until the
      exclusive mutex is granted to it.</para>
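
      <para>The reader fast path can be sketched as follows. The field names
      (<code>lock</code>, <code>readers_in</code>, <code>exclusive</code>) and
      the helper used to inspect the mutex's wait queue are assumptions made
      for this illustration; the bypass conditions are the two listed
      above.</para>

      <example>
        <title>Sketch of the reader bypass in <code>rwlock_read_lock_timeout</code>.</title>

        <programlisting language="C">spinlock_lock(&amp;rwl->lock);
if (rwl->readers_in > 0 &amp;&amp; !threads_sleeping_on(&amp;rwl->exclusive)) {
        /* Bypass the exclusive mutex and join the readers already inside. */
        rwl->readers_in++;
        spinlock_unlock(&amp;rwl->lock);
        return ESYNCH_OK_ATOMIC;
}
spinlock_unlock(&amp;rwl->lock);
/* Otherwise wait for the exclusive mutex to be granted, as described above. */</programlisting>
      </example>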
    </section>

    <section>
      <title>Condition variables</title>

      <indexterm>
        <primary>synchronization</primary>

        <secondary>- condition variable</secondary>
      </indexterm>

      <para>Condition variables can be used for waiting until a condition
      becomes true. In this respect, they are similar to wait queues. But
      contrary to wait queues, condition variables have different semantics
      that allow events to be lost when there is no thread waiting for them.
      In order to support this, condition variables don't use direct hand-off
      and operate in a way similar to the example below. A thread waiting for
      the condition to become true does the following:</para>

      <example>
        <title>Use of <code>condvar_wait_timeout</code>.</title>

        <programlisting language="C"><function>mutex_lock</function>(<varname>mtx</varname>);
while (!<varname>condition</varname>)
        <function>condvar_wait_timeout</function>(<varname>cv</varname>, <varname>mtx</varname>);
/* <remark>the condition is true, do something</remark> */
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting>
      </example>

      <para>A thread that causes the condition to become true signals this
      event like this:</para>

      <example>
        <title>Use of <code>condvar_signal</code>.</title>

        <programlisting><function>mutex_lock</function>(<varname>mtx</varname>);
<varname>condition</varname> = <constant>true</constant>;
<function>condvar_signal</function>(<varname>cv</varname>);  /* <remark>condvar_broadcast(cv);</remark> */
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting>
      </example>

      <para>The wait operation, <code>condvar_wait_timeout</code>, always puts
      the calling thread to sleep. The thread then sleeps until another thread
      invokes <code>condvar_broadcast</code> on the same condition variable or
      until it is woken up by <code>condvar_signal</code>. The
      <code>condvar_signal</code> operation unblocks the first thread blocking
      on the condition variable while the <code>condvar_broadcast</code>
      operation unblocks all threads blocking there. If there are no blocking
      threads, these two operations have no effect.</para>

      <para>Note that the threads must synchronize over a dedicated mutex. To
      prevent a race condition between <code>condvar_wait_timeout</code> and
      <code>condvar_signal</code> or <code>condvar_broadcast</code>, the mutex
      is passed to <code>condvar_wait_timeout</code> which then atomically
      puts the calling thread to sleep and unlocks the mutex. When the thread
      eventually wakes up, <code>condvar_wait</code> regains the mutex and
      returns.</para>

      <para>Also note that there is no conditional operation for condition
      variables. Such an operation would make no sense since condition
      variables are defined to forget events for which there is no waiting
      thread and because <code>condvar_wait</code> must always go to sleep.
      The operation with a timeout is supported as usual.</para>

      <para>In HelenOS, condition variables are based on wait queues. As
      mentioned above, wait queues remember missed events while condition
      variables must not do so. This is because condition variables are
      designed for scenarios in which an event might occur many times without
      being picked up by any waiting thread. On the other hand, wait queues
      would remember any event that had not been picked up by a call to
      <code>waitq_sleep_timeout</code>. Therefore, if wait queues were used
      directly and without any changes to implement condition variables, the
      missed_wakeup counter would hurt performance of the implementation: the
      <code>while</code> loop in <code>condvar_wait_timeout</code> would
      effectively do busy waiting until all missed wakeups were
      discarded.</para>

      <para>The requirement on the wait operation to atomically put the caller
      to sleep and release the mutex poses an interesting problem for
      <code>condvar_wait_timeout</code>. More precisely, the thread should
      sleep in the condvar's wait queue prior to releasing the mutex, but it
      must not hold the mutex when it is sleeping.</para>

      <para>Problems described in the two previous paragraphs are addressed in
      HelenOS by dividing the <code>waitq_sleep_timeout</code> function into
      three pieces:</para>

      <orderedlist>
        <listitem>
          <para><code>waitq_sleep_prepare</code> prepares the thread to go to
          sleep by, among other things, locking the wait queue;</para>
        </listitem>

        <listitem>
          <para><code>waitq_sleep_timeout_unsafe</code> implements the core
          blocking logic;</para>
        </listitem>

        <listitem>
          <para><code>waitq_sleep_finish</code> performs cleanup after
          <code>waitq_sleep_timeout_unsafe</code>; after this call, the wait
          queue spinlock is guaranteed to be unlocked by the caller.</para>
        </listitem>
      </orderedlist>

      <para>The stock <code>waitq_sleep_timeout</code> is then a mere wrapper
      that calls these three functions. It is provided for convenience in
      cases where the caller doesn't require such low-level control. However,
      the implementation of <code>condvar_wait_timeout</code> does need this
      finer-grained control because it has to interleave calls to these
      functions with other actions. It carries its operations out in the
      following order (a possible shape of the resulting code is sketched
      after the list):</para>

      <orderedlist>
        <listitem>
          <para>calls <code>waitq_sleep_prepare</code> in order to lock the
          condition variable's wait queue,</para>
        </listitem>

        <listitem>
          <para>releases the mutex,</para>
        </listitem>

        <listitem>
          <para>clears the counter of missed wakeups,</para>
        </listitem>

        <listitem>
          <para>calls <code>waitq_sleep_timeout_unsafe</code>,</para>
        </listitem>

        <listitem>
          <para>retakes the mutex,</para>
        </listitem>

        <listitem>
          <para>calls <code>waitq_sleep_finish</code>.</para>
        </listitem>
      </orderedlist>
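
      <para>The sketch promised above follows. The structure member names, the
      interrupt-level value returned by <code>waitq_sleep_prepare</code> and
      the flag name are assumptions made for this illustration, while the
      numbered comments correspond to the steps just listed.</para>

      <example>
        <title>Possible shape of <code>condvar_wait_timeout</code>.</title>

        <programlisting language="C">int condvar_wait_timeout(condvar_t *cv, mutex_t *mtx, uint32_t usec)
{
        int rc;
        ipl_t ipl;

        /* 1. Lock the condition variable's wait queue. */
        ipl = waitq_sleep_prepare(&amp;cv->wq);

        /* 2. Release the mutex while the wait queue is still locked. */
        mutex_unlock(mtx);

        /* 3. Forget events that nobody was waiting for. */
        cv->wq.missed_wakeups = 0;

        /* 4. Do the actual blocking. */
        rc = waitq_sleep_timeout_unsafe(&amp;cv->wq, usec, SYNCH_FLAGS_NONE);

        /* 5. Retake the mutex before returning to the caller. */
        mutex_lock(mtx);

        /* 6. Clean up; the wait queue spinlock is unlocked afterwards. */
        waitq_sleep_finish(&amp;cv->wq, rc, ipl);

        return rc;
}</programlisting>
      </example>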
    </section>
  </section>

  <section>
    <title>Userspace synchronization</title>

    <section>
      <title>Futexes</title>

      <indexterm>
        <primary>synchronization</primary>

        <secondary>- futex</secondary>
      </indexterm>

      <para></para>
    </section>
  </section>
</chapter>