Subversion Repositories HelenOS-doc

Rev

Rev 57 | Rev 72 | Go to most recent revision | Only display areas with differences | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed

Rev 57 Rev 62
1
<?xml version="1.0" encoding="UTF-8"?>
1
<?xml version="1.0" encoding="UTF-8"?>
2
<chapter id="sync">
2
<chapter id="sync">
3
  <?dbhtml filename="sync.html"?>
3
  <?dbhtml filename="sync.html"?>
4
 
4
 
5
  <title>Mutual exclusion and synchronization</title>
5
  <title>Mutual exclusion and synchronization</title>
6
 
6
 
7
  <section>
7
  <section>
8
    <title>Introduction</title>
8
    <title>Introduction</title>
9
 
9
 
10
    <para>The HelenOS operating system is designed to make use of the
10
    <para>The HelenOS operating system is designed to make use of the
11
    parallelism offered by the hardware and to exploit concurrency of both the
11
    parallelism offered by the hardware and to exploit concurrency of both the
12
    kernel and userspace tasks. This is achieved through multiprocessor
12
    kernel and userspace tasks. This is achieved through multiprocessor
13
    support and several levels of multiprogramming such as multitasking,
13
    support and several levels of multiprogramming such as multitasking,
14
    multithreading and also through userspace pseudo threads. However, such a
14
    multithreading and also through userspace pseudo threads. However, such a
15
    highly concurrent environment needs safe and efficient ways to handle
15
    highly concurrent environment needs safe and efficient ways to handle
16
    mutual exclusion and synchronization of many execution flows.</para>
16
    mutual exclusion and synchronization of many execution flows.</para>
17
  </section>
17
  </section>
18
 
18
 
19
  <section>
19
  <section>
20
    <title>Active kernel primitives</title>
20
    <title>Active kernel primitives</title>
21
 
21
 
22
    <section>
22
    <section>
23
      <title>Spinlocks</title>
23
      <title>Spinlocks</title>
24
 
24
 
25
      <para>The basic mutual exclusion primitive is the spinlock. The spinlock
25
      <para>The basic mutual exclusion primitive is the spinlock. The spinlock
26
      implements active waiting for the availability of a memory lock (i.e.
26
      implements active waiting for the availability of a memory lock (i.e.
27
      simple variable) in a multiprocessor-safe manner. This safety is
27
      simple variable) in a multiprocessor-safe manner. This safety is
28
      achieved through the use of a specialized, architecture-dependent,
28
      achieved through the use of a specialized, architecture-dependent,
29
      atomic test-and-set operation which either locks the spinlock (i.e. sets
29
      atomic test-and-set operation which either locks the spinlock (i.e. sets
30
      the variable) or, provided that it is already locked, leaves it
30
      the variable) or, provided that it is already locked, leaves it
31
      unaltered. In any case, the test-and-set operation returns a value, thus
31
      unaltered. In any case, the test-and-set operation returns a value, thus
32
      signalling either success (i.e. zero return value) or failure (i.e.
32
      signalling either success (i.e. zero return value) or failure (i.e.
33
      non-zero value) in acquiring the lock. Note that this makes a
33
      non-zero value) in acquiring the lock. Note that this makes a
34
      fundamental difference between the naive algorithm that doesn't use the
34
      fundamental difference between the naive algorithm that doesn't use the
35
      atomic operation and the spinlock algortihm. While the naive algorithm
35
      atomic operation and the spinlock algortihm. While the naive algorithm
36
      is prone to race conditions on SMP configurations and thus is completely
36
      is prone to race conditions on SMP configurations and thus is completely
37
      SMP-unsafe, the spinlock algorithm eliminates the possibility of race
37
      SMP-unsafe, the spinlock algorithm eliminates the possibility of race
38
      conditions and is suitable for mutual exclusion use.</para>
38
      conditions and is suitable for mutual exclusion use.</para>
39
 
39
 
40
      <para>The semantics of the test-and-set operation is that the spinlock
40
      <para>The semantics of the test-and-set operation is that the spinlock
41
      remains unavailable until this operation called on the respective
41
      remains unavailable until this operation called on the respective
42
      spinlock returns zero. HelenOS builds two functions on top of the
42
      spinlock returns zero. HelenOS builds two functions on top of the
43
      test-and-set operation. The first function is the unconditional attempt
43
      test-and-set operation. The first function is the unconditional attempt
44
      to acquire the spinlock and is called <code>spinlock_lock</code>. It
44
      to acquire the spinlock and is called <code>spinlock_lock</code>. It
45
      simply loops until the test-and-set returns a zero value. The other
45
      simply loops until the test-and-set returns a zero value. The other
46
      function, <code>spinlock_trylock</code>, is the conditional lock
46
      function, <code>spinlock_trylock</code>, is the conditional lock
47
      operation and calls the test-and-set only once to find out whether it
47
      operation and calls the test-and-set only once to find out whether it
48
      managed to acquire the spinlock or not. The conditional operation is
48
      managed to acquire the spinlock or not. The conditional operation is
49
      useful in situations in which an algorithm cannot acquire more spinlocks
49
      useful in situations in which an algorithm cannot acquire more spinlocks
50
      in the proper order and a deadlock cannot be avoided. In such a case,
50
      in the proper order and a deadlock cannot be avoided. In such a case,
51
      the algorithm would detect the danger and instead of possibly
51
      the algorithm would detect the danger and instead of possibly
52
      deadlocking the system it would simply release some spinlocks it already
52
      deadlocking the system it would simply release some spinlocks it already
53
      holds and retry the whole operation with the hope that it will succeed
53
      holds and retry the whole operation with the hope that it will succeed
54
      next time. The unlock function, <code>spinlock_unlock</code>, is quite
54
      next time. The unlock function, <code>spinlock_unlock</code>, is quite
55
      easy - it merely clears the spinlock variable.</para>
55
      easy - it merely clears the spinlock variable.</para>
56
 
56
 
57
      <para>Nevertheless, there is a special issue related to hardware
57
      <para>Nevertheless, there is a special issue related to hardware
58
      optimizations that modern processors implement. Particularly problematic
58
      optimizations that modern processors implement. Particularly problematic
59
      is the out-of-order execution of instructions within the critical
59
      is the out-of-order execution of instructions within the critical
60
      section protected by a spinlock. The processors are always
60
      section protected by a spinlock. The processors are always
61
      self-consistent so that they can carry out speculatively executed
61
      self-consistent so that they can carry out speculatively executed
62
      instructions in the right order with regard to dependencies among those
62
      instructions in the right order with regard to dependencies among those
63
      instructions. However, the dependency between instructions inside the
63
      instructions. However, the dependency between instructions inside the
64
      critical section and those that implement locking and unlocking of the
64
      critical section and those that implement locking and unlocking of the
65
      respective spinlock is not implicit on some processor architectures. As
65
      respective spinlock is not implicit on some processor architectures. As
66
      a result, the processor needs to be explicitly told about each
66
      a result, the processor needs to be explicitly told about each
67
      occurrence of such a dependency. Therefore, HelenOS adds
67
      occurrence of such a dependency. Therefore, HelenOS adds
68
      architecture-specific hooks to all <code>spinlock_lock</code>,
68
      architecture-specific hooks to all <code>spinlock_lock</code>,
69
      <code>spinlock_trylock</code> and <code>spinlock_unlock</code> functions
69
      <code>spinlock_trylock</code> and <code>spinlock_unlock</code> functions
70
      to prevent the instructions inside the critical section from permeating
70
      to prevent the instructions inside the critical section from permeating
71
      out. On some architectures, these hooks can be void because the
71
      out. On some architectures, these hooks can be void because the
72
      dependencies are implicitly there because of the special properties of
72
      dependencies are implicitly there because of the special properties of
73
      locking and unlocking instructions. However, other architectures need to
73
      locking and unlocking instructions. However, other architectures need to
74
      instrument these hooks with different memory barriers, depending on what
74
      instrument these hooks with different memory barriers, depending on what
75
      operations could permeate out.</para>
75
      operations could permeate out.</para>
76
 
76
 
77
      <para>Spinlocks have one significant drawback: when held for longer time
77
      <para>Spinlocks have one significant drawback: when held for longer time
78
      periods, they harm both parallelism and concurrency. The processor
78
      periods, they harm both parallelism and concurrency. The processor
79
      executing <code>spinlock_lock</code> does not do any fruitful work and
79
      executing <code>spinlock_lock</code> does not do any fruitful work and
80
      is effectively halted until it can grab the lock and proceed.
80
      is effectively halted until it can grab the lock and proceed.
81
      Similarily, other execution flows cannot execute on the processor that
81
      Similarily, other execution flows cannot execute on the processor that
82
      holds the spinlock because the kernel disables preemption on that
82
      holds the spinlock because the kernel disables preemption on that
83
      processor when a spinlock is held. The reason behind disabling
83
      processor when a spinlock is held. The reason behind disabling
84
      preemption is priority inversion problem avoidance. For the same reason,
84
      preemption is priority inversion problem avoidance. For the same reason,
85
      threads are strongly discouraged from sleeping when they hold a
85
      threads are strongly discouraged from sleeping when they hold a
86
      spinlock.</para>
86
      spinlock.</para>
87
 
87
 
88
      <para>To summarize, spinlocks represent very simple and essential mutual
88
      <para>To summarize, spinlocks represent very simple and essential mutual
89
      exclusion primitive for SMP systems. On the other hand, spinlocks scale
89
      exclusion primitive for SMP systems. On the other hand, spinlocks scale
90
      poorly because of the active loop they are based on. Therefore,
90
      poorly because of the active loop they are based on. Therefore,
91
      spinlocks are used in HelenOS only for short-time mutual exclusion and
91
      spinlocks are used in HelenOS only for short-time mutual exclusion and
92
      in cases where the mutual exclusion is required out of thread context.
92
      in cases where the mutual exclusion is required out of thread context.
93
      Lastly, spinlocks are used in the construction of passive
93
      Lastly, spinlocks are used in the construction of passive
94
      synchronization primitives.</para>
94
      synchronization primitives.</para>
95
    </section>
95
    </section>
96
  </section>
96
  </section>
97
 
97
 
98
  <section>
98
  <section>
99
    <title>Passive kernel synchronization</title>
99
    <title>Passive kernel synchronization</title>
100
 
100
 
101
    <section>
101
    <section>
102
      <title>Wait queues</title>
102
      <title>Wait queues</title>
103
 
103
 
104
      <para>A wait queue is the basic passive synchronization primitive on
104
      <para>A wait queue is the basic passive synchronization primitive on
105
      which all other passive synchronization primitives are built. Simply
105
      which all other passive synchronization primitives are built. Simply
106
      put, it allows a thread to sleep until an event associated with the
106
      put, it allows a thread to sleep until an event associated with the
107
      particular wait queue occurs. Multiple threads are notified about
107
      particular wait queue occurs. Multiple threads are notified about
108
      incoming events in a first come, first served fashion. Moreover, should
108
      incoming events in a first come, first served fashion. Moreover, should
109
      the event come before any thread waits for it, it is recorded in the
109
      the event come before any thread waits for it, it is recorded in the
110
      wait queue as a missed wakeup and later forwarded to the first thread
110
      wait queue as a missed wakeup and later forwarded to the first thread
111
      that decides to wait in the queue. The inner structures of the wait
111
      that decides to wait in the queue. The inner structures of the wait
112
      queue are protected by a spinlock.</para>
112
      queue are protected by a spinlock.</para>
113
 
113
 
114
      <para>The thread that wants to wait for a wait queue event uses the
114
      <para>The thread that wants to wait for a wait queue event uses the
115
      <code>waitq_sleep_timeout</code> function. The algorithm then checks the
115
      <code>waitq_sleep_timeout</code> function. The algorithm then checks the
116
      wait queue's counter of missed wakeups and if there are any missed
116
      wait queue's counter of missed wakeups and if there are any missed
117
      wakeups, the call returns immediately. The call also returns immediately
117
      wakeups, the call returns immediately. The call also returns immediately
118
      if only a conditional wait was requested. Otherwise the thread is
118
      if only a conditional wait was requested. Otherwise the thread is
119
      enqueued in the wait queue's list of sleeping threads and its state is
119
      enqueued in the wait queue's list of sleeping threads and its state is
120
      changed to <constant>Sleeping</constant>. It then sleeps until one of
120
      changed to <constant>Sleeping</constant>. It then sleeps until one of
121
      the following events happens:</para>
121
      the following events happens:</para>
122
 
122
 
123
      <orderedlist>
123
      <orderedlist>
124
        <listitem>
124
        <listitem>
125
          <para>another thread calls <code>waitq_wakeup</code> and the thread
125
          <para>another thread calls <code>waitq_wakeup</code> and the thread
126
          is the first thread in the wait queue's list of sleeping
126
          is the first thread in the wait queue's list of sleeping
127
          threads;</para>
127
          threads;</para>
128
        </listitem>
128
        </listitem>
129
 
129
 
130
        <listitem>
130
        <listitem>
131
          <para>another thread calls <code>waitq_interrupt_sleep</code> on the
131
          <para>another thread calls <code>waitq_interrupt_sleep</code> on the
132
          sleeping thread;</para>
132
          sleeping thread;</para>
133
        </listitem>
133
        </listitem>
134
 
134
 
135
        <listitem>
135
        <listitem>
136
          <para>the sleep times out provided that none of the previous
136
          <para>the sleep times out provided that none of the previous
137
          occurred within a specified time limit; the limit can be
137
          occurred within a specified time limit; the limit can be
138
          infinity.</para>
138
          infinity.</para>
139
        </listitem>
139
        </listitem>
140
      </orderedlist>
140
      </orderedlist>
141
 
141
 
142
      <para>All five possibilities (immediate return on success, immediate
142
      <para>All five possibilities (immediate return on success, immediate
143
      return on failure, wakeup after sleep, interruption and timeout) are
143
      return on failure, wakeup after sleep, interruption and timeout) are
144
      distinguishable by the return value of <code>waitq_sleep_timeout</code>.
144
      distinguishable by the return value of <code>waitq_sleep_timeout</code>.
145
      Being able to interrupt a sleeping thread is essential for externally
145
      Being able to interrupt a sleeping thread is essential for externally
146
      initiated thread termination. The ability to wait only for a certain
146
      initiated thread termination. The ability to wait only for a certain
147
      amount of time is used, for instance, to passively delay thread
147
      amount of time is used, for instance, to passively delay thread
148
      execution by several microseconds or even seconds in
148
      execution by several microseconds or even seconds in
149
      <code>thread_sleep</code> function. Due to the fact that all other
149
      <code>thread_sleep</code> function. Due to the fact that all other
150
      passive kernel synchronization primitives are based on wait queues, they
150
      passive kernel synchronization primitives are based on wait queues, they
151
      also have the option of being interrutped and, more importantly, can
151
      also have the option of being interrutped and, more importantly, can
152
      timeout. All of them also implement the conditional operation.
152
      timeout. All of them also implement the conditional operation.
153
      Furthemore, this very fundamental interface reaches up to the
153
      Furthemore, this very fundamental interface reaches up to the
154
      implementation of futexes - userspace synchronization primitive, which
154
      implementation of futexes - userspace synchronization primitive, which
155
      makes it possible for a userspace thread to request a synchronization
155
      makes it possible for a userspace thread to request a synchronization
156
      operation with a timeout or a conditional operation.</para>
156
      operation with a timeout or a conditional operation.</para>
157
 
157
 
158
      <para>From the description above, it should be apparent, that when a
158
      <para>From the description above, it should be apparent, that when a
159
      sleeping thread is woken by <code>waitq_wakeup</code> or when
159
      sleeping thread is woken by <code>waitq_wakeup</code> or when
160
      <code>waitq_sleep_timeout</code> succeeds immediately, the thread can be
160
      <code>waitq_sleep_timeout</code> succeeds immediately, the thread can be
161
      sure that the event has occurred. The thread need not and should not
161
      sure that the event has occurred. The thread need not and should not
162
      verify this fact. This approach is called direct hand-off and is
162
      verify this fact. This approach is called direct hand-off and is
163
      characteristic for all passive HelenOS synchronization primitives, with
163
      characteristic for all passive HelenOS synchronization primitives, with
164
      the exception as described below.</para>
164
      the exception as described below.</para>
165
    </section>
165
    </section>
166
 
166
 
167
    <section>
167
    <section>
168
      <title>Semaphores</title>
168
      <title>Semaphores</title>
169
 
169
 
170
      <para>The interesting point about wait queues is that the number of
170
      <para>The interesting point about wait queues is that the number of
171
      missed wakeups is equal to the number of threads that will not block in
171
      missed wakeups is equal to the number of threads that will not block in
172
      <code>watiq_sleep_timeout</code> and would immediately succeed instead.
172
      <code>watiq_sleep_timeout</code> and would immediately succeed instead.
173
      On the other hand, semaphores are synchronization primitives that will
173
      On the other hand, semaphores are synchronization primitives that will
174
      let predefined amount of threads into their critical section and block
174
      let predefined amount of threads into their critical section and block
175
      any other threads above this count. However, these two cases are exactly
175
      any other threads above this count. However, these two cases are exactly
176
      the same. Semaphores in HelenOS are therefore implemented as wait queues
176
      the same. Semaphores in HelenOS are therefore implemented as wait queues
177
      with a single semantic change: their wait queue is initialized to have
177
      with a single semantic change: their wait queue is initialized to have
178
      so many missed wakeups as is the number of threads that the semphore
178
      so many missed wakeups as is the number of threads that the semphore
179
      intends to let into its critical section simultaneously.</para>
179
      intends to let into its critical section simultaneously.</para>
180
 
180
 
181
      <para>In the semaphore language, the wait queue operation
181
      <para>In the semaphore language, the wait queue operation
182
      <code>waitq_sleep_timeout</code> corresponds to semaphore
182
      <code>waitq_sleep_timeout</code> corresponds to semaphore
183
      <code>down</code> operation, represented by the function
183
      <code>down</code> operation, represented by the function
184
      <code>semaphore_down_timeout</code> and by way of similitude the wait
184
      <code>semaphore_down_timeout</code> and by way of similitude the wait
185
      queue operation waitq_wakeup corresponds to semaphore <code>up</code>
185
      queue operation waitq_wakeup corresponds to semaphore <code>up</code>
186
      operation, represented by the function <code>sempafore_up</code>. The
186
      operation, represented by the function <code>sempafore_up</code>. The
187
      conditional down operation is called
187
      conditional down operation is called
188
      <code>semaphore_trydown</code>.</para>
188
      <code>semaphore_trydown</code>.</para>
189
    </section>
189
    </section>
190
 
190
 
191
    <section>
191
    <section>
192
      <title>Mutexes</title>
192
      <title>Mutexes</title>
193
 
193
 
194
      <para>Mutexes are sometimes referred to as binary sempahores. That means
194
      <para>Mutexes are sometimes referred to as binary sempahores. That means
195
      that mutexes are like semaphores that allow only one thread in its
195
      that mutexes are like semaphores that allow only one thread in its
196
      critical section. Indeed, mutexes in HelenOS are implemented exactly in
196
      critical section. Indeed, mutexes in HelenOS are implemented exactly in
197
      this way: they are built on top of semaphores. From another point of
197
      this way: they are built on top of semaphores. From another point of
198
      view, they can be viewed as spinlocks without busy waiting. Their
198
      view, they can be viewed as spinlocks without busy waiting. Their
199
      semaphore heritage provides good basics for both conditional operation
199
      semaphore heritage provides good basics for both conditional operation
200
      and operation with timeout. The locking operation is called
200
      and operation with timeout. The locking operation is called
201
      <code>mutex_lock</code>, the conditional locking operation is called
201
      <code>mutex_lock</code>, the conditional locking operation is called
202
      <code>mutex_trylock</code> and the unlocking operation is called
202
      <code>mutex_trylock</code> and the unlocking operation is called
203
      <code>mutex_unlock</code>.</para>
203
      <code>mutex_unlock</code>.</para>
204
    </section>
204
    </section>
205
 
205
 
206
    <section>
206
    <section>
207
      <title>Reader/writer locks</title>
207
      <title>Reader/writer locks</title>
208
 
208
 
209
      <para>Reader/writer locks, or rwlocks, are by far the most complicated
209
      <para>Reader/writer locks, or rwlocks, are by far the most complicated
210
      synchronization primitive within the kernel. The goal of these locks is
210
      synchronization primitive within the kernel. The goal of these locks is
211
      to improve concurrency of applications, in which threads need to
211
      to improve concurrency of applications, in which threads need to
212
      synchronize access to a shared resource, and that access can be
212
      synchronize access to a shared resource, and that access can be
213
      partitioned into a read-only mode and a write mode. Reader/writer locks
213
      partitioned into a read-only mode and a write mode. Reader/writer locks
214
      should make it possible for several, possibly many, readers to enter the
214
      should make it possible for several, possibly many, readers to enter the
215
      critical section, provided that no writer is currently in the critical
215
      critical section, provided that no writer is currently in the critical
216
      section, or to be in the critical section contemporarily. Writers are
216
      section, or to be in the critical section contemporarily. Writers are
217
      allowed to enter the critical section only individually, provided that
217
      allowed to enter the critical section only individually, provided that
218
      no reader is in the critical section already. Applications, in which the
218
      no reader is in the critical section already. Applications, in which the
219
      majority of operations can be done in the read-only mode, can benefit
219
      majority of operations can be done in the read-only mode, can benefit
220
      from increased concurrency created by reader/writer locks.</para>
220
      from increased concurrency created by reader/writer locks.</para>
221
 
221
 
222
      <para>During reader/writer lock construction, a decision should be made
222
      <para>During reader/writer lock construction, a decision should be made
223
      whether readers will be prefered over writers or whether writers will be
223
      whether readers will be prefered over writers or whether writers will be
224
      prefered over readers in cases when the lock is not currently held and
224
      prefered over readers in cases when the lock is not currently held and
225
      both a reader and a writer want to gain the lock. Some operating systems
225
      both a reader and a writer want to gain the lock. Some operating systems
226
      prefer one group over the other, creating thus a possibility for
226
      prefer one group over the other, creating thus a possibility for
227
      starving the unprefered group. In the HelenOS operating system, none of
227
      starving the unprefered group. In the HelenOS operating system, none of
228
      the two groups is prefered. The lock is granted on a first come, first
228
      the two groups is prefered. The lock is granted on a first come, first
229
      served basis with the additional note that readers are granted the lock
229
      served basis with the additional note that readers are granted the lock
230
      in the biggest possible batch.</para>
230
      in the biggest possible batch.</para>
231
 
231
 
232
      <para>With this policy and the timeout modes of operation, the direct
232
      <para>With this policy and the timeout modes of operation, the direct
233
      hand-off becomes much more complicated. For instance, a writer leaving
233
      hand-off becomes much more complicated. For instance, a writer leaving
234
      the critical section must wake up all leading readers in the rwlock's
234
      the critical section must wake up all leading readers in the rwlock's
235
      wait queue or one leading writer or no-one if no thread is waiting.
235
      wait queue or one leading writer or no-one if no thread is waiting.
236
      Similarily, the last reader leaving the critical section must wakeup the
236
      Similarily, the last reader leaving the critical section must wakeup the
237
      sleeping writer if there are any sleeping threads left at all. As
237
      sleeping writer if there are any sleeping threads left at all. As
238
      another example, if a writer at the beginning of the rwlock's wait queue
238
      another example, if a writer at the beginning of the rwlock's wait queue
239
      times out and the lock is held by at least one reader, the writer which
239
      times out and the lock is held by at least one reader, the writer which
240
      has timed out must first wake up all readers that follow him in the
240
      has timed out must first wake up all readers that follow him in the
241
      queue prior to signalling the timeout itself and giving up.</para>
241
      queue prior to signalling the timeout itself and giving up.</para>
242
 
242
 
243
      <para>Due to the issues mentioned in the previous paragraph, the
243
      <para>Due to the issues mentioned in the previous paragraph, the
244
      reader/writer lock imlpementation needs to walk the rwlock wait queue's
244
      reader/writer lock imlpementation needs to walk the rwlock wait queue's
245
      list of sleeping threads directly, in order to find out the type of
245
      list of sleeping threads directly, in order to find out the type of
246
      access that the queueing threads demand. This makes the code difficult
246
      access that the queueing threads demand. This makes the code difficult
247
      to understand and dependent on the internal implementation of the wait
247
      to understand and dependent on the internal implementation of the wait
248
      queue. Nevertheless, it remains unclear to the authors of HelenOS
248
      queue. Nevertheless, it remains unclear to the authors of HelenOS
249
      whether a simpler but equivalently fair solution exists.</para>
249
      whether a simpler but equivalently fair solution exists.</para>
250
 
250
 
251
      <para>The implementation of rwlocks as it has been already put, makes
251
      <para>The implementation of rwlocks as it has been already put, makes
252
      use of one single wait queue for both readers and writers, thus avoiding
252
      use of one single wait queue for both readers and writers, thus avoiding
253
      any possibility of starvation. In fact, rwlocks use a mutex rather than
253
      any possibility of starvation. In fact, rwlocks use a mutex rather than
254
      a bare wait queue. This mutex is called <code>exclusive</code> and is
254
      a bare wait queue. This mutex is called <code>exclusive</code> and is
255
      used to synchronize writers. The writer's lock operation,
255
      used to synchronize writers. The writer's lock operation,
256
      <code>rwlock_write_lock_timeout</code>, simply tries to acquire the
256
      <code>rwlock_write_lock_timeout</code>, simply tries to acquire the
257
      exclusive mutex. If it succeeds, the writer is granted the rwlock.
257
      exclusive mutex. If it succeeds, the writer is granted the rwlock.
258
      However, if the operation fails (e.g. times out), the writer must check
258
      However, if the operation fails (e.g. times out), the writer must check
259
      for potential readers at the head of the list of sleeping threads
259
      for potential readers at the head of the list of sleeping threads
260
      associated with the mutex's wait queue and then proceed according to the
260
      associated with the mutex's wait queue and then proceed according to the
261
      procedure outlined above.</para>
261
      procedure outlined above.</para>
262
 
262
 
263
      <para>The exclusive mutex plays an important role in reader
263
      <para>The exclusive mutex plays an important role in reader
264
      synchronization as well. However, a reader doing the reader's lock
264
      synchronization as well. However, a reader doing the reader's lock
265
      operation, <code>rwlock_read_lock_timeout</code>, may bypass this mutex
265
      operation, <code>rwlock_read_lock_timeout</code>, may bypass this mutex
266
      when it detects that:</para>
266
      when it detects that:</para>
267
 
267
 
268
      <orderedlist>
268
      <orderedlist>
269
        <listitem>
269
        <listitem>
270
          <para>there are other readers in the critical section and</para>
270
          <para>there are other readers in the critical section and</para>
271
        </listitem>
271
        </listitem>
272
 
272
 
273
        <listitem>
273
        <listitem>
274
          <para>there are no sleeping threads waiting for the exclusive
274
          <para>there are no sleeping threads waiting for the exclusive
275
          mutex.</para>
275
          mutex.</para>
276
        </listitem>
276
        </listitem>
277
      </orderedlist>
277
      </orderedlist>
278
 
278
 
279
      <para>If both conditions are true, the reader will bypass the mutex,
279
      <para>If both conditions are true, the reader will bypass the mutex,
280
      increment the number of readers in the critical section and then enter
280
      increment the number of readers in the critical section and then enter
281
      the critical section. Note that if there are any sleeping threads at the
281
      the critical section. Note that if there are any sleeping threads at the
282
      beginning of the wait queue, the first must be a writer. If the
282
      beginning of the wait queue, the first must be a writer. If the
283
      conditions are not fulfilled, the reader normally waits until the
283
      conditions are not fulfilled, the reader normally waits until the
284
      exclusive mutex is granted to it.</para>
284
      exclusive mutex is granted to it.</para>
285
    </section>
285
    </section>
286
 
286
 
287
    <section>
287
    <section>
288
      <title>Condition variables</title>
288
      <title>Condition variables</title>
289
 
289
 
290
      <para>Condition variables can be used for waiting until a condition
290
      <para>Condition variables can be used for waiting until a condition
291
      becomes true. In this respect, they are similar to wait queues. But
291
      becomes true. In this respect, they are similar to wait queues. But
292
      contrary to wait queues, condition variables have different semantics
292
      contrary to wait queues, condition variables have different semantics
293
      that allows events to be lost when there is no thread waiting for them.
293
      that allows events to be lost when there is no thread waiting for them.
294
      In order to support this, condition variables don't use direct hand-off
294
      In order to support this, condition variables don't use direct hand-off
295
      and operate in a way similar to the example below. A thread waiting for
295
      and operate in a way similar to the example below. A thread waiting for
296
      the condition becoming true does the following:</para>
296
      the condition becoming true does the following:</para>
297
 
297
 
-
 
298
      <example>
-
 
299
      <title>Use of <code>condvar_wait_timeout</code>.</title>
298
      <para><programlisting language="C"><function>mutex_lock</function>(<varname>mtx</varname>);
300
      <programlisting language="C"><function>mutex_lock</function>(<varname>mtx</varname>);
299
while (!<varname>condition</varname>)
301
while (!<varname>condition</varname>)
300
        <function>condvar_wait_timeout</function>(<varname>cv</varname>, <varname>mtx</varname>);
302
        <function>condvar_wait_timeout</function>(<varname>cv</varname>, <varname>mtx</varname>);
301
/* <remark>the condition is true, do something</remark> */
303
/* <remark>the condition is true, do something</remark> */
302
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting></para>
304
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting>
-
 
305
    </example>
303
 
306
 
304
      <para>A thread that causes the condition become true signals this event
307
      <para>A thread that causes the condition become true signals this event
305
      like this:</para>
308
      like this:</para>
306
 
309
 
-
 
310
      <example>
-
 
311
      <title>Use of <code>condvar_signal</code>.</title>
307
      <para><programlisting><function>mutex_lock</function>(<varname>mtx</varname>);
312
      <programlisting><function>mutex_lock</function>(<varname>mtx</varname>);
308
<varname>condition</varname> = <constant>true</constant>;
313
<varname>condition</varname> = <constant>true</constant>;
309
<function>condvar_signal</function>(<varname>cv</varname>);  /* <remark>condvar_broadcast(cv);</remark> */
314
<function>condvar_signal</function>(<varname>cv</varname>);  /* <remark>condvar_broadcast(cv);</remark> */
310
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting></para>
315
<function>mutex_unlock</function>(<varname>mtx</varname>);</programlisting></example>
311
 
316
 
312
      <para>The wait operation, <code>condvar_wait_timeout</code>, always puts
317
      <para>The wait operation, <code>condvar_wait_timeout</code>, always puts
313
      the calling thread to sleep. The thread then sleeps until another thread
318
      the calling thread to sleep. The thread then sleeps until another thread
314
      invokes <code>condvar_broadcast</code> on the same condition variable or
319
      invokes <code>condvar_broadcast</code> on the same condition variable or
315
      until it is woken up by <code>condvar_signal</code>. The
320
      until it is woken up by <code>condvar_signal</code>. The
316
      <code>condvar_signal</code> operation unblocks the first thread blocking
321
      <code>condvar_signal</code> operation unblocks the first thread blocking
317
      on the condition variable while the <code>condvar_broadcast</code>
322
      on the condition variable while the <code>condvar_broadcast</code>
318
      operation unblocks all threads blocking there. If there are no blocking
323
      operation unblocks all threads blocking there. If there are no blocking
319
      threads, these two operations have no efect.</para>
324
      threads, these two operations have no efect.</para>
320
 
325
 
321
      <para>Note that the threads must synchronize over a dedicated mutex. To
326
      <para>Note that the threads must synchronize over a dedicated mutex. To
322
      prevent race condition between <code>condvar_wait_timeout</code> and
327
      prevent race condition between <code>condvar_wait_timeout</code> and
323
      <code>condvar_signal</code> or <code>condvar_broadcast</code>, the mutex
328
      <code>condvar_signal</code> or <code>condvar_broadcast</code>, the mutex
324
      is passed to <code>condvar_wait_timeout</code> which then atomically
329
      is passed to <code>condvar_wait_timeout</code> which then atomically
325
      puts the calling thread asleep and unlocks the mutex. When the thread
330
      puts the calling thread asleep and unlocks the mutex. When the thread
326
      eventually wakes up, <code>condvar_wait</code> regains the mutex and
331
      eventually wakes up, <code>condvar_wait</code> regains the mutex and
327
      returns.</para>
332
      returns.</para>
328
 
333
 
329
      <para>Also note, that there is no conditional operation for condition
334
      <para>Also note, that there is no conditional operation for condition
330
      variables. Such an operation would make no sence since condition
335
      variables. Such an operation would make no sence since condition
331
      variables are defined to forget events for which there is no waiting
336
      variables are defined to forget events for which there is no waiting
332
      thread and because <code>condvar_wait</code> must always go to sleep.
337
      thread and because <code>condvar_wait</code> must always go to sleep.
333
      The operation with timeout is supported as usually.</para>
338
      The operation with timeout is supported as usually.</para>
334
 
339
 
335
      <para>In HelenOS, condition variables are based on wait queues. As it is
340
      <para>In HelenOS, condition variables are based on wait queues. As it is
336
      already mentioned above, wait queues remember missed events while
341
      already mentioned above, wait queues remember missed events while
337
      condition variables must not do so. This is reasoned by the fact that
342
      condition variables must not do so. This is reasoned by the fact that
338
      condition variables are designed for scenarios in which an event might
343
      condition variables are designed for scenarios in which an event might
339
      occur very many times without being picked up by any waiting thread. On
344
      occur very many times without being picked up by any waiting thread. On
340
      the other hand, wait queues would remember any event that had not been
345
      the other hand, wait queues would remember any event that had not been
341
      picked up by a call to <code>waitq_sleep_timeout</code>. Therefore, if
346
      picked up by a call to <code>waitq_sleep_timeout</code>. Therefore, if
342
      wait queues were used directly and without any changes to implement
347
      wait queues were used directly and without any changes to implement
343
      condition variables, the missed_wakeup counter would hurt performance of
348
      condition variables, the missed_wakeup counter would hurt performance of
344
      the implementation: the <code>while</code> loop in
349
      the implementation: the <code>while</code> loop in
345
      <code>condvar_wait_timeout</code> would effectively do busy waiting
350
      <code>condvar_wait_timeout</code> would effectively do busy waiting
346
      until all missed wakeups were discarded.</para>
351
      until all missed wakeups were discarded.</para>
347
 
352
 
348
      <para>The requirement on the wait operation to atomically put the caller
353
      <para>The requirement on the wait operation to atomically put the caller
349
      to sleep and release the mutex poses an interesting problem on
354
      to sleep and release the mutex poses an interesting problem on
350
      <code>condvar_wait_timeout</code>. More precisely, the thread should
355
      <code>condvar_wait_timeout</code>. More precisely, the thread should
351
      sleep in the condvar's wait queue prior to releasing the mutex, but it
356
      sleep in the condvar's wait queue prior to releasing the mutex, but it
352
      must not hold the mutex when it is sleeping.</para>
357
      must not hold the mutex when it is sleeping.</para>
353
 
358
 
354
      <para>Problems described in the two previous paragraphs are addressed in
359
      <para>Problems described in the two previous paragraphs are addressed in
355
      HelenOS by dividing the <code>waitq_sleep_timeout</code> function into
360
      HelenOS by dividing the <code>waitq_sleep_timeout</code> function into
356
      three pieces:</para>
361
      three pieces:</para>
357
 
362
 
358
      <orderedlist>
363
      <orderedlist>
359
        <listitem>
364
        <listitem>
360
          <para><code>waitq_sleep_prepare</code> prepares the thread to go to
365
          <para><code>waitq_sleep_prepare</code> prepares the thread to go to
361
          sleep by, among other things, locking the wait queue;</para>
366
          sleep by, among other things, locking the wait queue;</para>
362
        </listitem>
367
        </listitem>
363
 
368
 
364
        <listitem>
369
        <listitem>
365
          <para><code>waitq_sleep_timeout_unsafe</code> implements the core
370
          <para><code>waitq_sleep_timeout_unsafe</code> implements the core
366
          blocking logic;</para>
371
          blocking logic;</para>
367
        </listitem>
372
        </listitem>
368
 
373
 
369
        <listitem>
374
        <listitem>
370
          <para><code>waitq_sleep_finish</code> performs cleanup after
375
          <para><code>waitq_sleep_finish</code> performs cleanup after
371
          <code>waitq_sleep_timeout_unsafe</code>; after this call, the wait
376
          <code>waitq_sleep_timeout_unsafe</code>; after this call, the wait
372
          queue spinlock is guaranteed to be unlocked by the caller</para>
377
          queue spinlock is guaranteed to be unlocked by the caller</para>
373
        </listitem>
378
        </listitem>
374
      </orderedlist>
379
      </orderedlist>
375
 
380
 
376
      <para>The stock <code>waitq_sleep_timeout</code> is then a mere wrapper
381
      <para>The stock <code>waitq_sleep_timeout</code> is then a mere wrapper
377
      that calls these three functions. It is provided for convenience in
382
      that calls these three functions. It is provided for convenience in
378
      cases where the caller doesn't require such a low level control.
383
      cases where the caller doesn't require such a low level control.
379
      However, the implementation of <code>condvar_wait_timeout</code> does
384
      However, the implementation of <code>condvar_wait_timeout</code> does
380
      need this finer-grained control because it has to interleave calls to
385
      need this finer-grained control because it has to interleave calls to
381
      these functions by other actions. It carries its operations out in the
386
      these functions by other actions. It carries its operations out in the
382
      following order:</para>
387
      following order:</para>
383
 
388
 
384
      <orderedlist>
389
      <orderedlist>
385
        <listitem>
390
        <listitem>
386
          <para>calls <code>waitq_sleep_prepare</code> in order to lock the
391
          <para>calls <code>waitq_sleep_prepare</code> in order to lock the
387
          condition variable's wait queue,</para>
392
          condition variable's wait queue,</para>
388
        </listitem>
393
        </listitem>
389
 
394
 
390
        <listitem>
395
        <listitem>
391
          <para>releases the mutex,</para>
396
          <para>releases the mutex,</para>
392
        </listitem>
397
        </listitem>
393
 
398
 
394
        <listitem>
399
        <listitem>
395
          <para>clears the counter of missed wakeups,</para>
400
          <para>clears the counter of missed wakeups,</para>
396
        </listitem>
401
        </listitem>
397
 
402
 
398
        <listitem>
403
        <listitem>
399
          <para>calls <code>waitq_sleep_timeout_unsafe</code>,</para>
404
          <para>calls <code>waitq_sleep_timeout_unsafe</code>,</para>
400
        </listitem>
405
        </listitem>
401
 
406
 
402
        <listitem>
407
        <listitem>
403
          <para>retakes the mutex,</para>
408
          <para>retakes the mutex,</para>
404
        </listitem>
409
        </listitem>
405
 
410
 
406
        <listitem>
411
        <listitem>
407
          <para>calls <code>waitq_sleep_finish</code>.</para>
412
          <para>calls <code>waitq_sleep_finish</code>.</para>
408
        </listitem>
413
        </listitem>
409
      </orderedlist>
414
      </orderedlist>
410
    </section>
415
    </section>
411
  </section>
416
  </section>
412
 
417
 
413
  <section>
418
  <section>
414
    <title>Userspace synchronization</title>
419
    <title>Userspace synchronization</title>
415
 
420
 
416
    <section>
421
    <section>
417
      <title>Futexes</title>
422
      <title>Futexes</title>
418
 
423
 
419
      <para></para>
424
      <para></para>
420
    </section>
425
    </section>
421
  </section>
426
  </section>
422
</chapter>
427
</chapter>