atomic operation and the spinlock algorithm. While the naive algorithm
is prone to race conditions on SMP configurations and thus is completely
SMP-unsafe, the spinlock algorithm eliminates the possibility of race
conditions and is suitable for mutual exclusion use.</para>

<para>The semantics of the test-and-set operation is that the spinlock
remains unavailable until this operation, called on the respective
spinlock, returns zero. HelenOS builds two functions on top of the
test-and-set operation. The first is the unconditional attempt to
acquire the spinlock and is called <emphasis>spinlock_lock</emphasis>.
<!-- ... -->
the system it would simply release some spinlocks it already holds and
retry the whole operation with the hope that it will succeed next time.
The unlock operation, <emphasis>spinlock_unlock</emphasis>, is quite
easy: it merely clears the spinlock variable.</para>
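<para>The operations described above can be sketched with C11 atomics
standing in for the architecture-specific test-and-set instruction. This
is an illustrative model, not the HelenOS source: the conditional
operation is named <emphasis>spinlock_trylock</emphasis> here by
analogy, and the real routines additionally handle preemption control
and deadlock detection.</para>

```c
/* Minimal sketch of the spinlock scheme, using C11 atomics in place of
 * the architecture-specific test-and-set instruction. */
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag flag;   /* clear = unlocked, set = locked */
} spinlock_t;

static void spinlock_initialize(spinlock_t *sl)
{
    atomic_flag_clear(&sl->flag);
}

/* Unconditional acquire: spin until test-and-set returns zero,
 * i.e. until the flag was previously clear. */
static void spinlock_lock(spinlock_t *sl)
{
    while (atomic_flag_test_and_set_explicit(&sl->flag, memory_order_acquire))
        ;   /* active loop: busy-wait until the lock becomes available */
}

/* Conditional acquire: a single test-and-set attempt, no spinning. */
static bool spinlock_trylock(spinlock_t *sl)
{
    return !atomic_flag_test_and_set_explicit(&sl->flag, memory_order_acquire);
}

/* Release: merely clear the spinlock variable. */
static void spinlock_unlock(spinlock_t *sl)
{
    atomic_flag_clear_explicit(&sl->flag, memory_order_release);
}
```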

<para>Nevertheless, there is a special issue related to hardware
optimizations that modern processors implement. Particularly
problematic is the out-of-order execution of instructions within the
critical section protected by a spinlock. The processors are always
self-consistent so that they can carry out speculatively executed
<!-- ... -->
because of the special properties of locking and unlocking instructions.
However, other architectures need to instrument these hooks with
different memory barriers, depending on what operations can bleed
out.</para>
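<para>As a sketch of what such hooks can look like, expressed with C11
fences rather than the actual HelenOS macros (an assumption for
illustration): the hook on the locking side needs acquire semantics and
the hook on the unlocking side needs release semantics, so that memory
accesses cannot bleed out of the critical section in either
direction. On strongly ordered architectures both may expand to
nothing.</para>

```c
#include <stdatomic.h>

/* Keeps loads/stores of the critical section from moving up past the
 * lock acquisition. */
static inline void cs_enter_barrier(void)
{
    atomic_thread_fence(memory_order_acquire);
}

/* Keeps loads/stores of the critical section from moving down past the
 * lock release. */
static inline void cs_leave_barrier(void)
{
    atomic_thread_fence(memory_order_release);
}
```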

<para>Spinlocks have one significant drawback: when held for longer time
periods, they harm both parallelism and concurrency. A processor executing
<emphasis>spinlock_lock</emphasis> does not do any fruitful work and is
effectively halted until it can grab the lock and proceed. Similarly,
other threads cannot execute on the processor that holds the spinlock
because the kernel disables preemption on that processor when a spinlock
is held. The reason for disabling preemption is avoidance of the priority
inversion problem. For the same reason, threads are strongly discouraged
from sleeping while they hold a spinlock.</para>

<para>To summarize, spinlocks represent a very simple and essential mutual
exclusion primitive for SMP systems. On the other hand, spinlocks scale
poorly because of the active loop they are based on. Therefore,
spinlocks are used in HelenOS only for short-term mutual exclusion and
in cases where mutual exclusion is required out of thread context.
<!-- ... -->

<section>
<title>Passive kernel synchronization</title>

<section>
<title>Wait queues</title>

<para>A wait queue is the basic passive synchronization primitive on
which all other passive synchronization primitives build. Simply put, it
allows a thread to sleep until an event associated with the particular
wait queue occurs. Multiple threads are notified about incoming events
in a first come, first served fashion. Moreover, should the event come
before any thread waits for it, it is recorded in the wait queue as a
missed wakeup and later forwarded to the first thread that decides to
wait in the queue. The inner structures of the wait queue are protected
by a spinlock.</para>

<para>The thread that wants to wait for a wait queue event uses the
<emphasis>waitq_sleep_timeout</emphasis> function. The algorithm first
checks the wait queue's counter of missed wakeups and, if there are any,
the call returns immediately. The call also returns immediately if only
a conditional wait was requested. Otherwise the thread is enqueued in
the wait queue's list of sleeping threads and its state is changed to
<emphasis>Sleeping</emphasis>. It then sleeps until one of the following
events happens:</para>

<orderedlist>
<listitem>
<para>another thread calls <emphasis>waitq_wakeup</emphasis> and the
thread is the first thread in the wait queue's list of sleeping
threads</para>
</listitem>

<listitem>
<para>another thread calls
<emphasis>waitq_interrupt_sleep</emphasis> on the sleeping
thread</para>
</listitem>

<listitem>
<para>the sleep times out, provided that none of the previous events
occurred within the specified time limit; the limit can be
infinity</para>
</listitem>
</orderedlist>

<para>All five possibilities (immediate return on success, immediate
return on failure, wakeup after sleep, interruption and timeout) are
distinguishable by the return value of
<emphasis>waitq_sleep_timeout</emphasis>. The ability to interrupt a
sleeping thread is essential for externally initiated thread termination
and the ability to wait only for a certain amount of time is used, for
instance, to passively delay thread execution by several microseconds or
even seconds in the <emphasis>thread_sleep</emphasis> function. Because
all other passive kernel synchronization primitives are based on wait
queues, they also have the option of being interrupted and, more
importantly, can time out. All of them also implement the conditional
operation. Furthermore, this very fundamental interface reaches up to the
implementation of futexes, a userspace synchronization primitive, which
makes it possible for a userspace thread to request a synchronization
operation with a timeout or a conditional operation.</para>
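<para>As an illustration, the five outcomes can be captured by a single
return type. The constant names below are hypothetical and chosen for
readability; the actual HelenOS identifiers differ.</para>

```c
/* Hypothetical outcome codes for waitq_sleep_timeout; this only
 * illustrates that all five cases are distinguishable from a single
 * return value. */
typedef enum {
    SLEEP_OK_ATOMIC,    /* immediate success: a missed wakeup was pending   */
    SLEEP_WOULD_BLOCK,  /* immediate failure: conditional wait, no wakeup   */
    SLEEP_OK_BLOCKED,   /* woken up by waitq_wakeup after actually sleeping */
    SLEEP_INTERRUPTED,  /* woken up by waitq_interrupt_sleep                */
    SLEEP_TIMED_OUT     /* the time limit expired before any wakeup         */
} sleep_result_t;
```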

<para>From the description above, it should be apparent that when a
sleeping thread is woken by <emphasis>waitq_wakeup</emphasis> or when
<emphasis>waitq_sleep_timeout</emphasis> succeeds immediately, the
thread can be sure the event has come and the thread need not and should
not verify this fact. This approach is called direct hand-off and is
characteristic of all passive HelenOS synchronization primitives, with
one exception described below.</para>
</section>

<section>
<title>Semaphores</title>

<para>The interesting point about wait queues is that the number of
missed wakeups is equal to the number of threads that will not block in
<emphasis>waitq_sleep_timeout</emphasis> and will immediately succeed
instead. On the other hand, semaphores are synchronization primitives
that admit a predefined number of threads into their critical section
and block any other threads above this count. However, these two cases
are exactly the same. Semaphores in HelenOS are therefore implemented as
wait queues with a single semantic change: their wait queue is
initialized to have as many missed wakeups as the number of threads that
the semaphore intends to let into its critical section
simultaneously.</para>

<para>In the semaphore language, the wait queue operation
<emphasis>waitq_sleep_timeout</emphasis> corresponds to the semaphore
<emphasis>down</emphasis> operation, represented by the function
<emphasis>semaphore_down_timeout</emphasis>, and similarly the wait
queue operation <emphasis>waitq_wakeup</emphasis> corresponds to the
semaphore <emphasis>up</emphasis> operation, represented by the function
<emphasis>semaphore_up</emphasis>. The conditional down operation is
called <emphasis>semaphore_trydown</emphasis>.</para>
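<para>The wait-queue/semaphore correspondence can be sketched with a
deliberately simplified, single-threaded model of a wait queue that
tracks only the missed-wakeup counter (an illustrative assumption; the
real wait queue also maintains the list of sleeping threads under its
spinlock, and the blocking variants actually sleep):</para>

```c
#include <stdbool.h>

/* Simplified wait-queue model: only the missed-wakeup counter. */
typedef struct {
    int missed_wakeups;   /* wakeups recorded before anyone waited */
} waitq_t;

static void waitq_initialize(waitq_t *wq) { wq->missed_wakeups = 0; }
static void waitq_wakeup(waitq_t *wq)     { wq->missed_wakeups++; }

/* Conditional sleep: succeed iff a missed wakeup is pending. */
static bool waitq_trysleep(waitq_t *wq)
{
    if (wq->missed_wakeups > 0) {
        wq->missed_wakeups--;
        return true;
    }
    return false;   /* a real blocking call would enqueue and sleep here */
}

/* A semaphore is a wait queue pre-loaded with `value` missed wakeups. */
typedef struct { waitq_t wq; } semaphore_t;

static void semaphore_initialize(semaphore_t *s, int value)
{
    waitq_initialize(&s->wq);
    while (value-- > 0)
        waitq_wakeup(&s->wq);   /* record the initial missed wakeups */
}

static bool semaphore_trydown(semaphore_t *s) { return waitq_trysleep(&s->wq); }
static void semaphore_up(semaphore_t *s)      { waitq_wakeup(&s->wq); }
```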
</section>

<section>
<title>Mutexes</title>

<para>Mutexes are sometimes referred to as binary semaphores. That means
that mutexes are like semaphores that allow only one thread into the
critical section. Indeed, mutexes in HelenOS are implemented exactly in
this way: they are built atop semaphores. From another point of view,
they can be viewed as spinlocks without busy waiting. Their semaphore
heritage provides a good basis for both the conditional operation and
the operation with a timeout. The locking operation is called
<emphasis>mutex_lock</emphasis>, the conditional locking operation is
called <emphasis>mutex_trylock</emphasis> and the unlocking operation is
called <emphasis>mutex_unlock</emphasis>.</para>
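<para>As a sketch of the described construction, with a trivially
simplified, non-blocking counting semaphore standing in for the real
one, a mutex is then nothing more than a semaphore initialized to
one:</para>

```c
#include <stdbool.h>

/* Simplified semaphore: just a counter, no sleeping. */
typedef struct { int count; } semaphore_t;

static void semaphore_initialize(semaphore_t *s, int value) { s->count = value; }
static bool semaphore_trydown(semaphore_t *s)
{
    if (s->count > 0) { s->count--; return true; }
    return false;   /* the real down operation would block here */
}
static void semaphore_up(semaphore_t *s) { s->count++; }

/* A mutex is a binary semaphore: a semaphore initialized to one. */
typedef struct { semaphore_t sem; } mutex_t;

static void mutex_initialize(mutex_t *m) { semaphore_initialize(&m->sem, 1); }
static bool mutex_trylock(mutex_t *m)    { return semaphore_trydown(&m->sem); }
static void mutex_unlock(mutex_t *m)     { semaphore_up(&m->sem); }
```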
</section>

<section>
<title>Reader/writer locks</title>

<para>Reader/writer locks, or rwlocks, are by far the most complicated
synchronization primitive within the kernel. The goal of these locks is
to improve the concurrency of applications in which threads need to
synchronize access to a shared resource and that access can be
partitioned into a read-only mode and a write mode. Reader/writer locks
make it possible for several, possibly many, readers to be in the
critical section simultaneously, provided that no writer is currently in
the critical section. Writers are allowed to enter the critical section
only individually, provided that no reader is in the critical section
already. Applications in which the majority of operations can be done in
the read-only mode can benefit from the increased concurrency created by
reader/writer locks.</para>

<para>During reader/writer lock construction, a decision should be made
whether readers will be preferred over writers or writers over readers
in cases when the lock is not currently held and both a reader and a
writer want to gain the lock. Some operating systems prefer one group
over the other, thus creating a possibility of starving the unpreferred
group. In the HelenOS operating system, neither of the two groups is
preferred. The lock is granted on a first come, first served basis, with
the additional note that readers are granted the lock in the biggest
possible batches.</para>

<para>With this policy and the timeout modes of operation, the direct
hand-off becomes much more complicated. For instance, a writer leaving
the critical section must wake up all leading readers in the rwlock's
wait queue, or one leading writer, or no one if no thread is waiting.
Similarly, the last reader leaving the critical section must wake up the
sleeping writer, if there are any sleeping threads at all. As another
example, if a writer at the beginning of the rwlock's wait queue times
out and the lock is held by at least one reader, the timed-out writer
must first wake up all readers that follow it in the queue prior to
signalling the timeout itself and giving up.</para>

- | 243 | ||
- | 244 | <para>Because of the issues mentioned in the previous paragraph, the |
|
- | 245 | reader/writer locks imlpementation needs to walk the rwlock wait queue's |
|
- | 246 | list of sleeping threads directly in order to find out the type of |
|
- | 247 | access that the queueing threads demand. This makes the code difficult |
|
- | 248 | to understand and dependent on the internal implementation of the wait |
|
- | 249 | queue. Nevertheless, it remains unclear to the authors of HelenOS |
|
- | 250 | whether a simpler but equivalently fair solution exists.</para> |
|
- | 251 | ||
- | 252 | <para>The implementation of rwlocks as it has been already put, makes |
|
- | 253 | use of one single wait queue for both readers and writers, thus avoiding |
|
- | 254 | any possibility of starvation. In fact, rwlocks use a mutex rather than |
|
- | 255 | a bare wait queue. This mutex is called <emphasis>exclusive</emphasis> |
|
- | 256 | and is used to synchronize writers. The writer's lock operation, |
|
- | 257 | <emphasis>rwlock_write_lock_timeout</emphasis>, simply tries to acquire |
|
- | 258 | the exclusive mutex. If it succeeds, the writer is granted the rwlock. |
|
- | 259 | However, if the operation fails, the writer must check for potential |
|
- | 260 | readers at the head of the list of sleeping threads associated with the |
|
- | 261 | mutex's wait queue and proceed according to the procedure outlined |
|
- | 262 | above.</para> |
|
- | 263 | ||
<para>The exclusive mutex plays an important role in reader
synchronization as well. However, a reader doing the reader's lock
operation, <emphasis>rwlock_read_lock_timeout</emphasis>, may bypass
this mutex when it detects that:</para>

<orderedlist>
<listitem>
<para>there are other readers in the critical section</para>
</listitem>

<listitem>
<para>there are no sleeping threads waiting for the exclusive
mutex</para>
</listitem>
</orderedlist>

<para>If both conditions are true, the reader will bypass the mutex,
increment the number of readers in the critical section and enter the
critical section. Note that if there are any sleeping threads at the
beginning of the wait queue, the first of them must be a writer. If the
conditions are not fulfilled, the reader normally waits until the
exclusive mutex is granted to it.</para>
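<para>The bypass test can be modelled as follows. The structure and
field names are hypothetical, kept to the two counters the test needs;
the real implementation holds more state and evaluates the conditions
under the protection of the rwlock's spinlock.</para>

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model of the reader's fast path described above. */
typedef struct {
    size_t readers_in;   /* readers currently in the critical section */
    size_t sleepers;     /* threads sleeping on the exclusive mutex */
} rwlock_t;

/* A reader may bypass the exclusive mutex only when other readers are
 * already inside and nobody is queued; a queued head would necessarily
 * be a writer. */
static bool reader_can_bypass(const rwlock_t *rw)
{
    return rw->readers_in > 0 && rw->sleepers == 0;
}
```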
</section>

<section>
<title>Condition variables</title>
