Subversion Repositories HelenOS-doc

      atomic operation and the spinlock algorithm. While the naive algorithm
      is prone to race conditions on SMP configurations and thus is completely
      SMP-unsafe, the spinlock algorithm eliminates the possibility of race
      conditions and is suitable for mutual exclusion use.</para>

      <para>The semantics of the test-and-set operation is that the spinlock
      remains unavailable until this operation, called on the respective
      spinlock, returns zero. HelenOS builds two functions on top of the
      test-and-set operation. The first is the unconditional attempt to
      acquire the spinlock and is called <emphasis>spinlock_lock</emphasis>.
      the system it would simply release some spinlocks it already holds and
      retry the whole operation with the hope that it will succeed next time.
      The unlock operation, <emphasis>spinlock_unlock</emphasis>, is quite
      easy - it merely clears the spinlock variable.</para>
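
      <para>A minimal sketch of such a test-and-set spinlock in portable C11
      follows; it mirrors the interface described above but is not the actual
      HelenOS code, and the name of the conditional variant,
      <emphasis>spinlock_trylock</emphasis>, is an assumption here:</para>

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative test-and-set spinlock modeled on the description above;
 * not the actual HelenOS implementation. */
typedef struct {
    atomic_flag flag;   /* clear = unlocked, set = locked */
} spinlock_t;

static inline void spinlock_initialize(spinlock_t *sl)
{
    atomic_flag_clear(&sl->flag);
}

/* Unconditional attempt: spin until test-and-set returns zero. */
static inline void spinlock_lock(spinlock_t *sl)
{
    while (atomic_flag_test_and_set(&sl->flag))
        ;  /* active loop */
}

/* Conditional attempt (name assumed): true iff the lock was grabbed. */
static inline bool spinlock_trylock(spinlock_t *sl)
{
    return !atomic_flag_test_and_set(&sl->flag);
}

/* The unlock operation merely clears the spinlock variable. */
static inline void spinlock_unlock(spinlock_t *sl)
{
    atomic_flag_clear(&sl->flag);
}
```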
      <para>Nevertheless, there is a special issue related to hardware
      optimizations that modern processors implement. Particularly
      problematic is the out-of-order execution of instructions within the
      critical section protected by a spinlock. The processors are always
      self-consistent so that they can carry out speculatively executed
      because of the special properties of locking and unlocking instructions.
      However, other architectures need to instrument these hooks with
      different memory barriers, depending on what operations can bleed
      out.</para>
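
      <para>On a weakly ordered architecture, such barriers might be placed
      as in the following sketch, which uses C11 memory orderings to keep
      critical-section operations from bleeding out; it is an illustrative
      model under stated assumptions, not the per-architecture HelenOS
      hooks:</para>

```c
#include <stdatomic.h>

/* Sketch: a spinlock whose acquisition has acquire semantics and whose
 * release has release semantics, as a weakly ordered processor needs.
 * Illustrative only; HelenOS hides this behind architecture hooks. */
typedef struct {
    atomic_int val;  /* 0 = free, 1 = held */
} wo_spinlock_t;

static inline void wo_spinlock_lock(wo_spinlock_t *sl)
{
    /* acquire: loads/stores of the critical section cannot be
     * reordered before the successful test-and-set */
    while (atomic_exchange_explicit(&sl->val, 1,
                                    memory_order_acquire) != 0)
        ;  /* spin */
}

static inline void wo_spinlock_unlock(wo_spinlock_t *sl)
{
    /* release: all critical-section writes become visible before the
     * lock is observed as free again */
    atomic_store_explicit(&sl->val, 0, memory_order_release);
}
```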
      <para>Spinlocks have one significant drawback: when held for longer time
      periods, they harm both parallelism and concurrency. A processor
      executing <emphasis>spinlock_lock</emphasis> does not do any fruitful
      work and is effectively halted until it can grab the lock and proceed.
      Similarly, other threads cannot execute on the processor that holds the
      spinlock because the kernel disables preemption on that processor when a
      spinlock is held. The reason behind disabling preemption is avoidance of
      the priority inversion problem. For the same reason, threads are
      strongly discouraged from sleeping when they hold a spinlock.</para>
      <para>To summarize, spinlocks represent a very simple and essential
      mutual exclusion primitive for SMP systems. On the other hand, spinlocks
      scale poorly because of the active loop they are based on. Therefore,
      spinlocks are used in HelenOS only for short-time mutual exclusion and
      in cases where the mutual exclusion is required out of thread context.

  <section>
    <title>Passive kernel synchronization</title>

    <section>
      <title>Wait queues</title>

      <para>A wait queue is the basic passive synchronization primitive on
      which all other passive synchronization primitives build. Simply put, it
      allows a thread to sleep until an event associated with the particular
      wait queue occurs. Multiple threads are notified about incoming events
      in a first come, first served fashion. Moreover, should the event come
      before any thread waits for it, it is recorded in the wait queue as a
      missed wakeup and later forwarded to the first thread that decides to
      wait in the queue. The inner structures of the wait queue are protected
      by a spinlock.</para>

      <para>The thread that wants to wait for a wait queue event uses the
      <emphasis>waitq_sleep_timeout</emphasis> function. The algorithm then
      checks the wait queue's counter of missed wakeups and if there are any
      missed wakeups, the call returns immediately. The call also returns
      immediately if only a conditional wait was requested. Otherwise the
      thread is enqueued in the wait queue's list of sleeping threads and its
      state is changed to <emphasis>Sleeping</emphasis>. It then sleeps until
      one of the following events happens:</para>

      <orderedlist>
        <listitem>
          <para>another thread calls <emphasis>waitq_wakeup</emphasis> and the
          thread is the first thread in the wait queue's list of sleeping
          threads</para>
        </listitem>

        <listitem>
          <para>another thread calls
          <emphasis>waitq_interrupt_sleep</emphasis> on the sleeping
          thread</para>
        </listitem>

        <listitem>
          <para>the sleep times out, provided that none of the previous events
          occurred within a specified time limit; the limit can be
          infinity</para>
        </listitem>
      </orderedlist>
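
      <para>The decision logic of <emphasis>waitq_sleep_timeout</emphasis>
      and <emphasis>waitq_wakeup</emphasis> can be modeled by the following
      single-threaded C sketch; the type, field names and return codes are
      illustrative assumptions, not the HelenOS declarations:</para>

```c
#include <stdbool.h>

/* Single-threaded model of the wait queue behavior described above. */
typedef enum {
    SLEEP_SUCCESS_IMMEDIATE,  /* a missed wakeup was pending */
    SLEEP_FAILURE_IMMEDIATE,  /* conditional wait and no missed wakeup */
    SLEEP_ENQUEUED            /* the thread would go to Sleeping state */
} sleep_result_t;

typedef struct {
    int missed_wakeups;   /* events recorded before anyone waited */
    int sleeping_threads; /* length of the list of sleeping threads */
} waitq_t;

static sleep_result_t waitq_sleep_model(waitq_t *wq, bool conditional)
{
    if (wq->missed_wakeups > 0) {
        wq->missed_wakeups--;            /* consume one recorded event */
        return SLEEP_SUCCESS_IMMEDIATE;
    }
    if (conditional)
        return SLEEP_FAILURE_IMMEDIATE;  /* do not block */
    wq->sleeping_threads++;              /* enqueue and go to sleep */
    return SLEEP_ENQUEUED;
}

/* Wakeup side: wake the first sleeper, or record a missed wakeup. */
static void waitq_wakeup_model(waitq_t *wq)
{
    if (wq->sleeping_threads > 0)
        wq->sleeping_threads--;          /* first come, first served */
    else
        wq->missed_wakeups++;
}
```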

      <para>All five possibilities (immediate return on success, immediate
      return on failure, wakeup after sleep, interruption and timeout) are
      distinguishable by the return value of
      <emphasis>waitq_sleep_timeout</emphasis>. The ability to interrupt a
      sleeping thread is essential for externally initiated thread termination
      and the ability to wait only for a certain amount of time is used, for
      instance, to passively delay thread execution by several microseconds or
      even seconds in the <emphasis>thread_sleep</emphasis> function. Because
      all other passive kernel synchronization primitives are based on wait
      queues, they also have the option of being interrupted and, more
      importantly, can time out. All of them also implement the conditional
      operation. Furthermore, this very fundamental interface reaches up to
      the implementation of futexes - a userspace synchronization primitive,
      which makes it possible for a userspace thread to request a
      synchronization operation with a timeout or a conditional
      operation.</para>

      <para>From the description above, it should be apparent that when a
      sleeping thread is woken by <emphasis>waitq_wakeup</emphasis> or when
      <emphasis>waitq_sleep_timeout</emphasis> succeeds immediately, the
      thread can be sure the event has come and the thread need not and should
      not verify this fact. This approach is called direct hand-off and is
      characteristic of all passive HelenOS synchronization primitives, with
      one exception described below.</para>
    </section>

    <section>
      <title>Semaphores</title>

      <para>The interesting point about wait queues is that the number of
      missed wakeups is equal to the number of threads that will not block in
      <emphasis>waitq_sleep_timeout</emphasis> and would immediately succeed
      instead. On the other hand, semaphores are synchronization primitives
      that will let a predefined number of threads into their critical section
      and block any other threads above this count. However, these two cases
      are exactly the same. Semaphores in HelenOS are therefore implemented as
      wait queues with a single semantic change: their wait queue is
      initialized to have as many missed wakeups as is the number of threads
      that the semaphore intends to let into its critical section
      simultaneously.</para>
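
      <para>The semaphore-over-wait-queue idea can be sketched as follows;
      the counter of missed wakeups doubles as the semaphore count. The type
      and names below are an illustrative single-threaded model, not the
      HelenOS code:</para>

```c
#include <stdbool.h>

/* Hedged sketch: a semaphore built on a wait queue whose missed-wakeup
 * counter is pre-loaded with the admission count. */
typedef struct {
    int missed_wakeups;  /* == number of threads still allowed in */
    int sleeping;        /* threads blocked on the wait queue */
} semaphore_model_t;

static void semaphore_initialize(semaphore_model_t *s, int count)
{
    /* "as many missed wakeups as threads the semaphore lets in" */
    s->missed_wakeups = count;
    s->sleeping = 0;
}

/* Conditional down: waitq_sleep_timeout's conditional wait in disguise. */
static bool semaphore_trydown(semaphore_model_t *s)
{
    if (s->missed_wakeups > 0) {
        s->missed_wakeups--;
        return true;     /* immediate success */
    }
    return false;        /* the unconditional variant would block here */
}

/* Up: waitq_wakeup in disguise. */
static void semaphore_up(semaphore_model_t *s)
{
    if (s->sleeping > 0)
        s->sleeping--;   /* wake the first sleeper */
    else
        s->missed_wakeups++;
}
```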

      <para>In the semaphore language, the wait queue operation
      <emphasis>waitq_sleep_timeout</emphasis> corresponds to the semaphore
      <emphasis>down</emphasis> operation, represented by the function
      <emphasis>semaphore_down_timeout</emphasis>, and similarly the wait
      queue operation <emphasis>waitq_wakeup</emphasis> corresponds to the
      semaphore <emphasis>up</emphasis> operation, represented by the function
      <emphasis>semaphore_up</emphasis>. The conditional down operation is
      called <emphasis>semaphore_trydown</emphasis>.</para>
    </section>

    <section>
      <title>Mutexes</title>

      <para>Mutexes are sometimes referred to as binary semaphores. That
      means that mutexes are like semaphores that allow only one thread into
      their critical section. Indeed, mutexes in HelenOS are implemented
      exactly in this way: they are built atop semaphores. From another point
      of view, they can be viewed as spinlocks without busy waiting. Their
      semaphore heritage provides a good basis for both the conditional
      operation and the operation with a timeout. The locking operation is
      called <emphasis>mutex_lock</emphasis>, the conditional locking
      operation is called <emphasis>mutex_trylock</emphasis> and the unlocking
      operation is called <emphasis>mutex_unlock</emphasis>.</para>
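
      <para>The "binary semaphore" view can be sketched as a semaphore with
      count one; this single-threaded model uses assumed field names and
      mirrors the <emphasis>mutex_trylock</emphasis> /
      <emphasis>mutex_unlock</emphasis> pair rather than reproducing the
      HelenOS implementation:</para>

```c
#include <stdbool.h>

/* Hedged sketch: a mutex as a semaphore initialized to let exactly one
 * thread into the critical section. */
typedef struct {
    int count;  /* 1 = unlocked, 0 = locked */
} mutex_model_t;

static void mutex_initialize(mutex_model_t *m)
{
    m->count = 1;  /* semaphore count of one */
}

/* Conditional lock, i.e. semaphore_trydown on the underlying semaphore. */
static bool mutex_trylock(mutex_model_t *m)
{
    if (m->count > 0) {
        m->count--;
        return true;
    }
    return false;  /* mutex_lock would block here instead */
}

/* Unlock, i.e. semaphore_up on the underlying semaphore. */
static void mutex_unlock(mutex_model_t *m)
{
    m->count++;
}
```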
    </section>

    <section>
      <title>Reader/writer locks</title>

      <para>Reader/writer locks, or rwlocks, are by far the most complicated
      synchronization primitive within the kernel. The goal of these locks is
      to improve the concurrency of applications in which threads need to
      synchronize access to a shared resource and that access can be
      partitioned into a read-only mode and a write mode. Reader/writer locks
      make it possible for several, possibly many, readers to be in the
      critical section at the same time, provided that no writer is currently
      in the critical section. Writers are allowed to enter the critical
      section only individually, provided that no reader is in the critical
      section already. Applications in which the majority of operations can be
      done in the read-only mode can benefit from the increased concurrency
      created by reader/writer locks.</para>

      <para>During reader/writer lock construction, a decision should be made
      whether readers will be preferred over writers or whether writers will
      be preferred over readers in cases when the lock is not currently held
      and both a reader and a writer want to gain the lock. Some operating
      systems prefer one group over the other, thus creating a possibility of
      starving the unpreferred group. In the HelenOS operating system, neither
      of the two groups is preferred. The lock is granted on a first come,
      first served basis with the additional note that readers are granted the
      lock in the biggest possible batches.</para>

      <para>With this policy and the timeout modes of operation, the direct
      hand-off becomes much more complicated. For instance, a writer leaving
      the critical section must wake up all leading readers in the rwlock's
      wait queue, or one leading writer, or no one if no thread is waiting.
      Similarly, the last reader leaving the critical section must wake up the
      sleeping writer, if there are any sleeping threads at all. As another
      example, if a writer at the beginning of the rwlock's wait queue times
      out and the lock is held by at least one reader, the timing-out writer
      must first wake up all readers that follow it in the queue prior to
      signalling the timeout itself and giving up.</para>

      <para>Because of the issues mentioned in the previous paragraph, the
      reader/writer lock implementation needs to walk the rwlock wait queue's
      list of sleeping threads directly in order to find out the type of
      access that the queueing threads demand. This makes the code difficult
      to understand and dependent on the internal implementation of the wait
      queue. Nevertheless, it remains unclear to the authors of HelenOS
      whether a simpler but equivalently fair solution exists.</para>

      <para>The implementation of rwlocks, as has already been outlined, makes
      use of a single wait queue for both readers and writers, thus avoiding
      any possibility of starvation. In fact, rwlocks use a mutex rather than
      a bare wait queue. This mutex is called <emphasis>exclusive</emphasis>
      and is used to synchronize writers. The writer's lock operation,
      <emphasis>rwlock_write_lock_timeout</emphasis>, simply tries to acquire
      the exclusive mutex. If it succeeds, the writer is granted the rwlock.
      However, if the operation fails, the writer must check for potential
      readers at the head of the list of sleeping threads associated with the
      mutex's wait queue and proceed according to the procedure outlined
      above.</para>

      <para>The exclusive mutex plays an important role in reader
      synchronization as well. However, a reader doing the reader's lock
      operation, <emphasis>rwlock_read_lock_timeout</emphasis>, may bypass
      this mutex when it detects that:</para>

      <orderedlist>
        <listitem>
          <para>there are other readers in the critical section</para>
        </listitem>

        <listitem>
          <para>there are no sleeping threads waiting for the exclusive
          mutex</para>
        </listitem>
      </orderedlist>

      <para>If both conditions are true, the reader will bypass the mutex,
      increment the number of readers in the critical section and enter the
      critical section. Note that if there are any sleeping threads at the
      beginning of the wait queue, the first of them must be a writer. If the
      conditions are not fulfilled, the reader normally waits until the
      exclusive mutex is granted to it.</para>
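
      <para>The reader bypass check can be sketched as follows; the structure
      and field names are illustrative assumptions and the model is
      single-threaded, unlike the real
      <emphasis>rwlock_read_lock_timeout</emphasis>:</para>

```c
#include <stdbool.h>

/* Hedged model of the reader bypass decision described above. */
typedef struct {
    int readers;            /* readers currently in the critical section */
    int sleeping_on_mutex;  /* threads queued on the exclusive mutex */
} rwlock_model_t;

/* A reader may skip the exclusive mutex only when other readers are
 * already inside and nobody sleeps on the mutex. */
static bool rwlock_reader_bypass(rwlock_model_t *rw)
{
    if (rw->readers > 0 && rw->sleeping_on_mutex == 0) {
        rw->readers++;   /* enter the critical section directly */
        return true;
    }
    return false;        /* must go through the exclusive mutex */
}
```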
    </section>

    <section>
      <title>Condition variables</title>
