Subversion Repositories HelenOS-doc

Rev

Rev 131 | Rev 140 | Go to most recent revision | Details | Compare with Previous | Last modification | View Log | RSS feed

Rev Author Line No. Line
9 bondari 1
<?xml version="1.0" encoding="UTF-8"?>
57 jermar 2
<chapter id="scheduling">
3
  <?dbhtml filename="scheduling.html"?>
9 bondari 4
 
57 jermar 5
  <title>Scheduling</title>
9 bondari 6
 
57 jermar 7
  <para>One of the key aims of the operating system is to create and support
8
  the impression that several activities are executing contemporarily. This is
9
  true for both uniprocessor as well as multiprocessor systems. In the case of
10
  multiprocessor systems, the activities are trully happening in parallel. The
11
  scheduler helps to materialize this impression by planning threads on as
12
  many processors as possible and, where this means reaches its limits, by
13
  quickly switching among threads executing on a single processor.</para>
14
 
15
  <section>
16
    <title>Contexts</title>
17
 
18
    <para>The term context refers to the set of processor resources that
19
    define the current state of the computation or the environment and the
20
    kernel understands it in several more or less narrow sences:</para>
21
 
22
    <itemizedlist>
23
      <listitem>
24
        <para>synchronous register context,</para>
25
      </listitem>
26
 
27
      <listitem>
28
        <para>asynchronous register context,</para>
29
      </listitem>
30
 
31
      <listitem>
32
        <para>FPU context and</para>
33
      </listitem>
34
 
35
      <listitem>
36
        <para>memory management context.</para>
37
      </listitem>
38
    </itemizedlist>
39
 
76 palkovsky 40
    <para>The most narrow sense refers to the the synchronous register
57 jermar 41
    context. It includes all the preserved registers as defined by the
42
    architecture. To highlight some, the program counter and stack pointer
76 palkovsky 43
    take part in the synchronous register context. These registers must be
44
    preserved across a procedure call and during synchronous context
45
    switches.</para>
57 jermar 46
 
47
    <para>The next type of the context understood by the kernel is the
48
    asynchronous register context. On an interrupt, the interrupted execution
49
    flow's state must be guaranteed to be eventually completely restored.
50
    Therefore the interrupt context includes, among other things, the scratch
51
    registers as defined by the architecture. As a special optimization and if
52
    certain conditions are met, it need not include the architecture's
53
    preserved registers. The condition mentioned in the previous sentence is
54
    that the low-level assembly language interrupt routines don't modify the
55
    preserved registers. The handlers usually call a higher-level C routine.
56
    The preserved registers are then saved on the stack by the compiler
57
    generated code of the higher-level function. In HelenOS, several
58
    architectures can be compiled with this optimization.</para>
59
 
60
    <para>Although the kernel does not do any floating point
61
    arithmetics<footnote>
62
        <para>Some architectures (e.g. ia64) inevitably use a fixed set of
64 jermar 63
        floating point registers to carry out their normal operations.</para>
57 jermar 64
      </footnote>, it must protect FPU context of userspace threads against
65
    destruction by other threads. Moreover, only a fraction of userspace
66
    programs use the floating point unit. HelenOS contains a generic framework
64 jermar 67
    for switching FPU context only when the switch is forced (i.e. a thread
68
    uses a floating point instruction and its FPU context is not loaded in the
69
    processor).</para>
57 jermar 70
 
71
    <para>The last member of the context family is the memory management
72
    context. It includes memory management registers that identify address
73
    spaces on hardware level (i.e. ASIDs and page tables pointers).</para>
74
 
9 bondari 75
    <section>
131 jermar 76
      <title>Synchronous Context Switches</title>
9 bondari 77
 
57 jermar 78
      <para>The scheduler, but also other pieces of the kernel, make heavy use
79
      of synchronous context switches, because it is a natural vehicle not
80
      only for changes in control flow, but also for switching between two
81
      kernel stacks. Two functions figure in a synchronous context switch
133 jermar 82
      implementation: <code>context_save()</code> and
83
      <code>context_restore()</code>. Note that these two functions break the
57 jermar 84
      natural perception of the linear C code execution flow starting at
85
      function's entry point and ending on one of the function's exit
86
      points.</para>
87
 
133 jermar 88
      <para>When the <code>context_save()</code> function is called, the
57 jermar 89
      synchronous context is saved in a memory structure passed to it. After
133 jermar 90
      executing <code>context_save()</code>, the caller is returned 1 as a
57 jermar 91
      return value. The execution of instructions continues as normally until
133 jermar 92
      <code>context_restore()</code> is called. For the caller, it seems like
57 jermar 93
      the call never returns<footnote>
94
          <para>Which might be a source of problems with variable liveliness
133 jermar 95
          after <code>context_restore()</code>.</para>
57 jermar 96
        </footnote>. Nevertheless, a synchronous register context, which is
133 jermar 97
      saved in a memory structure passed to <code>context_restore()</code>, is
57 jermar 98
      restored, thus transfering the control flow to the place of occurrence
133 jermar 99
      of the corresponding call to <code>context_save()</code>. From the
57 jermar 100
      perspective of the caller of the corresponding
133 jermar 101
      <code>context_save()</code>, it looks like a return from
102
      <code>context_save()</code>. However, this time a return value of 0 is
57 jermar 103
      returned.</para>
9 bondari 104
    </section>
57 jermar 105
  </section>
9 bondari 106
 
57 jermar 107
  <section>
58 jermar 108
    <title>Threads</title>
9 bondari 109
 
58 jermar 110
    <para>A thread is the basic executable entity with some code and stack.
111
    While the code, implemented by a C language function, can be shared by
112
    several threads, the stack is always private to each instance of the
113
    thread. Each thread belongs to exactly one task through which it shares
114
    address space with its sibling threads. Threads that execute purely in the
115
    kernel don't have any userspace memory allocated. However, when a thread
116
    has ambitions to run in userspace, it must be allocated a userspace stack.
117
    The distinction between the purely kernel threads and threads running also
118
    in userspace is made by refering to the former group as to kernel threads
119
    and to the latter group as to userspace threads. Both kernel and userspace
120
    threads are visible to the scheduler and can become a subject of kernel
121
    preemption and thread migration during times when preemption is
122
    possible.</para>
123
 
64 jermar 124
    <formalpara>
131 jermar 125
      <title>Thread States</title>
62 jermar 126
 
75 jermar 127
      <para>In each moment, a thread exists in one of six possible thread
64 jermar 128
      states. When the thread is created and first readied into the
129
      scheduler's run queues or when a thread is migrated to a new processor,
75 jermar 130
      it is put into the <constant>Entering</constant> state. After some time,
131
      the scheduler picks up the thread and starts executing it. A thread
132
      being currently executed on a processor is in the
133
      <constant>Running</constant> state. From there, the thread has three
134
      possibilities. It either runs until it is preemtped, in which case the
135
      state changes to <constant>Ready</constant>, goes to the
136
      <constant>Sleeping</constant> state by going to sleep or enters the
137
      <constant>Exiting</constant> state when it reaches termination. When the
138
      thread exits, its kernel structure usually stays in memory, until the
133 jermar 139
      thread is detached by another thread using <code>thread_detach()</code>
75 jermar 140
      function. Terminated but undetached threads are in the
141
      <constant>Undead</constant> state. When the thread is detached or
142
      detaches itself during its life, it is destroyed in the
143
      <constant>Exiting</constant> state and the <constant>Undead</constant>
101 bondari 144
      state is not reached.<figure float="1">
64 jermar 145
          <title>Transitions among thread states.</title>
61 jermar 146
 
64 jermar 147
          <mediaobject id="thread_states" xreflabel="">
87 bondari 148
            <imageobject role="pdf">
131 jermar 149
              <imagedata fileref="images/thread_states.pdf" format="PDF" />
77 bondari 150
            </imageobject>
151
 
64 jermar 152
            <imageobject role="html">
153
              <imagedata fileref="images/thread_states.png" format="PNG" />
154
            </imageobject>
62 jermar 155
 
64 jermar 156
            <imageobject role="fop">
131 jermar 157
              <imagedata fileref="images/thread_states.svg" format="SVG" />
64 jermar 158
            </imageobject>
159
          </mediaobject>
160
        </figure></para>
161
    </formalpara>
58 jermar 162
 
64 jermar 163
    <formalpara>
131 jermar 164
      <title>Pseudo Threads</title>
58 jermar 165
 
64 jermar 166
      <para>HelenOS userspace layer knows even smaller units of execution.
167
      Each userspace thread can make use of an arbitrary number of pseudo
168
      threads. These pseudo threads have their own synchronous register
169
      context, userspace code and stack. They live their own life within the
170
      userspace thread and the scheduler does not have any idea about them
171
      because they are completely implemented by the userspace library. This
172
      implies several things:<itemizedlist>
173
          <listitem>
174
            <para>pseudothreads schedule themselves cooperatively within the
175
            time slice given to their userspace thread,</para>
176
          </listitem>
58 jermar 177
 
64 jermar 178
          <listitem>
179
            <para>pseudothreads share FPU context of their containing thread
180
            and</para>
181
          </listitem>
182
 
183
          <listitem>
184
            <para>all pseudothreads of one userspace thread block when one of
185
            them goes to sleep.</para>
186
          </listitem>
187
        </itemizedlist></para>
188
    </formalpara>
59 jermar 189
  </section>
58 jermar 190
 
59 jermar 191
  <section>
192
    <title>Scheduler</title>
193
 
194
    <section>
131 jermar 195
      <title>Run Queues</title>
59 jermar 196
 
197
      <para>There is an array of several run queues on each processor. The
198
      current version of HelenOS uses 16 run queues implemented by 16 doubly
199
      linked lists. Each of the run queues is associated with thread priority.
200
      The lower the run queue index in the array is, the higher is the
201
      priority of threads linked in that run queue and the shorter is the time
202
      in which those threads will execute. When kernel code wants to access
203
      the run queue, it must first acquire its lock.</para>
204
    </section>
205
 
206
    <section>
131 jermar 207
      <title>Scheduler Operation</title>
59 jermar 208
 
209
      <para>The scheduler is invoked either explicitly when a thread calls the
133 jermar 210
      <code>scheduler()</code> function (e.g. goes to sleep or merely wants to
59 jermar 211
      relinquish the processor for a while) or implicitly on a periodic basis
212
      when the generic clock interrupt preempts the current thread. After its
213
      invocation, the scheduler saves the synchronous register context of the
214
      current thread and switches to its private stack. Afterwards, a new
215
      thread is selected according to the scheduling policy. If there is no
216
      suitable thread, the processor is idle and no thread executes on it.
217
      Note that the act of switching to the private scheduler stack is
218
      essential. If the processor kept running using the stack of the
219
      preempted thread it could damage it because the old thread can be
220
      migrated to another processor and scheduled there. In the worst case
62 jermar 221
      scenario, two execution flows would be using the same stack.</para>
59 jermar 222
 
223
      <para>The scheduling policy is implemented in function
224
      <code>find_best_thread</code>. This function walks the processor run
225
      queues from lower towards higher indices and looks for a thread. If the
226
      visited run queue is empty, it simply searches the next run queue. If it
227
      is known in advance that there are no ready threads waiting for
133 jermar 228
      execution, <code>find_best_thread()</code> interruptibly halts the
59 jermar 229
      processor or busy waits until some threads arrive. This process repeats
133 jermar 230
      until <code>find_best_thread()</code> succeeds.</para>
59 jermar 231
 
232
      <para>After the best thread is chosen, the scheduler switches to the
233
      thread's task and memory management context. Finally, the saved
234
      synchronous register context is restored and the thread runs. Each
235
      scheduled thread is given a time slice depending on its priority (i.e.
236
      run queue). The higher priority, the shorter timeslice. To summarize,
237
      this policy schedules threads with high priorities more frequently but
238
      gives them smaller time slices. On the other hand, lower priority
239
      threads are scheduled less frequently, but run for longer periods of
240
      time.</para>
241
 
242
      <para>When a thread uses its entire time slice, it is preempted and put
243
      back into the run queue that immediately follows the previous run queue
244
      from which the thread ran. Threads that are woken up from a sleep are
245
      put into the biggest priority run queue. Low priority threads are
246
      therefore those that don't go to sleep so often and just occupy the
247
      processor.</para>
248
 
249
      <para>In order to avoid complete starvation of the low priority threads,
250
      from time to time, the scheduler will provide them with a bonus of one
251
      point priority increase. In other words, the scheduler will now and then
252
      move the entire run queues one level up.</para>
253
    </section>
254
 
255
    <section>
131 jermar 256
      <title>Processor Load Balancing</title>
59 jermar 257
 
258
      <para>Normally, for the sake of cache locality, threads are scheduled on
259
      one of the processors and don't leave it. Nevertheless, a situation in
260
      which one processor is heavily overloaded while others sit idle can
76 palkovsky 261
      occur. HelenOS deploys special kernel threads to help mitigate this
59 jermar 262
      problem. Each processor is associated with one load balancing thread
76 palkovsky 263
      called <code>kcpulb</code> that wakes up regularly to see whether its
264
      processor is underbalanced or not. If it is, the thread attempts to
59 jermar 265
      migrate threads from other overloaded processors to its own processor's
266
      run queues. When the job is done or there is no need for load balancing,
267
      the thread goes to sleep.</para>
268
 
269
      <para>The balancing threads operate very gently and try to migrate low
270
      priority threads first; one <code>kcpulb</code> never takes from one
271
      processor twice in a row. The load balancing threads as well as threads
272
      that were just stolen cannot be migrated. The <code>kcpulb</code>
273
      threads are wired to their processors and cannot be migrated whatsoever.
274
      The ordinary threads are protected only until they are
275
      rescheduled.</para>
276
    </section>
57 jermar 277
  </section>
278
</chapter>