Subversion Repositories HelenOS-doc

Rev

Rev 64 | Rev 76 | Go to most recent revision | Only display areas with differences | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed

Rev 64 Rev 75
1
<?xml version="1.0" encoding="UTF-8"?>
1
<?xml version="1.0" encoding="UTF-8"?>
2
<chapter id="scheduling">
2
<chapter id="scheduling">
3
  <?dbhtml filename="scheduling.html"?>
3
  <?dbhtml filename="scheduling.html"?>
4
 
4
 
5
  <title>Scheduling</title>
5
  <title>Scheduling</title>
6
 
6
 
7
  <para>One of the key aims of the operating system is to create and support
7
  <para>One of the key aims of the operating system is to create and support
8
  the impression that several activities are executing contemporarily. This is
8
  the impression that several activities are executing contemporarily. This is
9
  true for both uniprocessor as well as multiprocessor systems. In the case of
9
  true for both uniprocessor as well as multiprocessor systems. In the case of
10
  multiprocessor systems, the activities are trully happening in parallel. The
10
  multiprocessor systems, the activities are trully happening in parallel. The
11
  scheduler helps to materialize this impression by planning threads on as
11
  scheduler helps to materialize this impression by planning threads on as
12
  many processors as possible and, where this means reaches its limits, by
12
  many processors as possible and, where this means reaches its limits, by
13
  quickly switching among threads executing on a single processor.</para>
13
  quickly switching among threads executing on a single processor.</para>
14
 
14
 
15
  <section>
15
  <section>
16
    <title>Contexts</title>
16
    <title>Contexts</title>
17
 
17
 
18
    <para>The term context refers to the set of processor resources that
18
    <para>The term context refers to the set of processor resources that
19
    define the current state of the computation or the environment and the
19
    define the current state of the computation or the environment and the
20
    kernel understands it in several more or less narrow sences:</para>
20
    kernel understands it in several more or less narrow sences:</para>
21
 
21
 
22
    <itemizedlist>
22
    <itemizedlist>
23
      <listitem>
23
      <listitem>
24
        <para>synchronous register context,</para>
24
        <para>synchronous register context,</para>
25
      </listitem>
25
      </listitem>
26
 
26
 
27
      <listitem>
27
      <listitem>
28
        <para>asynchronous register context,</para>
28
        <para>asynchronous register context,</para>
29
      </listitem>
29
      </listitem>
30
 
30
 
31
      <listitem>
31
      <listitem>
32
        <para>FPU context and</para>
32
        <para>FPU context and</para>
33
      </listitem>
33
      </listitem>
34
 
34
 
35
      <listitem>
35
      <listitem>
36
        <para>memory management context.</para>
36
        <para>memory management context.</para>
37
      </listitem>
37
      </listitem>
38
    </itemizedlist>
38
    </itemizedlist>
39
 
39
 
40
    <para>The most narrow sence refers to the the synchronous register
40
    <para>The most narrow sence refers to the the synchronous register
41
    context. It includes all the preserved registers as defined by the
41
    context. It includes all the preserved registers as defined by the
42
    architecture. To highlight some, the program counter and stack pointer
42
    architecture. To highlight some, the program counter and stack pointer
43
    take part in the synchronous register context. These are the registers
43
    take part in the synchronous register context. These are the registers
44
    that must be preserved across a procedure call and during synchronous
44
    that must be preserved across a procedure call and during synchronous
45
    context switches.</para>
45
    context switches.</para>
46
 
46
 
47
    <para>The next type of the context understood by the kernel is the
47
    <para>The next type of the context understood by the kernel is the
48
    asynchronous register context. On an interrupt, the interrupted execution
48
    asynchronous register context. On an interrupt, the interrupted execution
49
    flow's state must be guaranteed to be eventually completely restored.
49
    flow's state must be guaranteed to be eventually completely restored.
50
    Therefore the interrupt context includes, among other things, the scratch
50
    Therefore the interrupt context includes, among other things, the scratch
51
    registers as defined by the architecture. As a special optimization and if
51
    registers as defined by the architecture. As a special optimization and if
52
    certain conditions are met, it need not include the architecture's
52
    certain conditions are met, it need not include the architecture's
53
    preserved registers. The condition mentioned in the previous sentence is
53
    preserved registers. The condition mentioned in the previous sentence is
54
    that the low-level assembly language interrupt routines don't modify the
54
    that the low-level assembly language interrupt routines don't modify the
55
    preserved registers. The handlers usually call a higher-level C routine.
55
    preserved registers. The handlers usually call a higher-level C routine.
56
    The preserved registers are then saved on the stack by the compiler
56
    The preserved registers are then saved on the stack by the compiler
57
    generated code of the higher-level function. In HelenOS, several
57
    generated code of the higher-level function. In HelenOS, several
58
    architectures can be compiled with this optimization.</para>
58
    architectures can be compiled with this optimization.</para>
59
 
59
 
60
    <para>Although the kernel does not do any floating point
60
    <para>Although the kernel does not do any floating point
61
    arithmetics<footnote>
61
    arithmetics<footnote>
62
        <para>Some architectures (e.g. ia64) inevitably use a fixed set of
62
        <para>Some architectures (e.g. ia64) inevitably use a fixed set of
63
        floating point registers to carry out their normal operations.</para>
63
        floating point registers to carry out their normal operations.</para>
64
      </footnote>, it must protect FPU context of userspace threads against
64
      </footnote>, it must protect FPU context of userspace threads against
65
    destruction by other threads. Moreover, only a fraction of userspace
65
    destruction by other threads. Moreover, only a fraction of userspace
66
    programs use the floating point unit. HelenOS contains a generic framework
66
    programs use the floating point unit. HelenOS contains a generic framework
67
    for switching FPU context only when the switch is forced (i.e. a thread
67
    for switching FPU context only when the switch is forced (i.e. a thread
68
    uses a floating point instruction and its FPU context is not loaded in the
68
    uses a floating point instruction and its FPU context is not loaded in the
69
    processor).</para>
69
    processor).</para>
70
 
70
 
71
    <para>The last member of the context family is the memory management
71
    <para>The last member of the context family is the memory management
72
    context. It includes memory management registers that identify address
72
    context. It includes memory management registers that identify address
73
    spaces on hardware level (i.e. ASIDs and page tables pointers).</para>
73
    spaces on hardware level (i.e. ASIDs and page tables pointers).</para>
74
 
74
 
75
    <section>
75
    <section>
76
      <title>Synchronous context switches</title>
76
      <title>Synchronous context switches</title>
77
 
77
 
78
      <para>The scheduler, but also other pieces of the kernel, make heavy use
78
      <para>The scheduler, but also other pieces of the kernel, make heavy use
79
      of synchronous context switches, because it is a natural vehicle not
79
      of synchronous context switches, because it is a natural vehicle not
80
      only for changes in control flow, but also for switching between two
80
      only for changes in control flow, but also for switching between two
81
      kernel stacks. Two functions figure in a synchronous context switch
81
      kernel stacks. Two functions figure in a synchronous context switch
82
      implementation: <code>context_save</code> and
82
      implementation: <code>context_save</code> and
83
      <code>context_restore</code>. Note that these two functions break the
83
      <code>context_restore</code>. Note that these two functions break the
84
      natural perception of the linear C code execution flow starting at
84
      natural perception of the linear C code execution flow starting at
85
      function's entry point and ending on one of the function's exit
85
      function's entry point and ending on one of the function's exit
86
      points.</para>
86
      points.</para>
87
 
87
 
88
      <para>When the <code>context_save</code> function is called, the
88
      <para>When the <code>context_save</code> function is called, the
89
      synchronous context is saved in a memory structure passed to it. After
89
      synchronous context is saved in a memory structure passed to it. After
90
      executing <code>context_save</code>, the caller is returned 1 as a
90
      executing <code>context_save</code>, the caller is returned 1 as a
91
      return value. The execution of instructions continues as normally until
91
      return value. The execution of instructions continues as normally until
92
      <code>context_restore</code> is called. For the caller, it seems like
92
      <code>context_restore</code> is called. For the caller, it seems like
93
      the call never returns<footnote>
93
      the call never returns<footnote>
94
          <para>Which might be a source of problems with variable liveliness
94
          <para>Which might be a source of problems with variable liveliness
95
          after <code>context_restore</code>.</para>
95
          after <code>context_restore</code>.</para>
96
        </footnote>. Nevertheless, a synchronous register context, which is
96
        </footnote>. Nevertheless, a synchronous register context, which is
97
      saved in a memory structure passed to <code>context_restore,</code> is
97
      saved in a memory structure passed to <code>context_restore,</code> is
98
      restored, thus transfering the control flow to the place of occurrence
98
      restored, thus transfering the control flow to the place of occurrence
99
      of the corresponding call to <code>context_save</code>. From the
99
      of the corresponding call to <code>context_save</code>. From the
100
      perspective of the caller of the corresponding
100
      perspective of the caller of the corresponding
101
      <code>context_save</code>, it looks as though a return from
101
      <code>context_save</code>, it looks as though a return from
102
      <code>context_save</code>. However, this time a return value of 0 is
102
      <code>context_save</code>. However, this time a return value of 0 is
103
      returned.</para>
103
      returned.</para>
104
    </section>
104
    </section>
105
  </section>
105
  </section>
106
 
106
 
107
  <section>
107
  <section>
108
    <title>Threads</title>
108
    <title>Threads</title>
109
 
109
 
110
    <para>A thread is the basic executable entity with some code and stack.
110
    <para>A thread is the basic executable entity with some code and stack.
111
    While the code, implemented by a C language function, can be shared by
111
    While the code, implemented by a C language function, can be shared by
112
    several threads, the stack is always private to each instance of the
112
    several threads, the stack is always private to each instance of the
113
    thread. Each thread belongs to exactly one task through which it shares
113
    thread. Each thread belongs to exactly one task through which it shares
114
    address space with its sibling threads. Threads that execute purely in the
114
    address space with its sibling threads. Threads that execute purely in the
115
    kernel don't have any userspace memory allocated. However, when a thread
115
    kernel don't have any userspace memory allocated. However, when a thread
116
    has ambitions to run in userspace, it must be allocated a userspace stack.
116
    has ambitions to run in userspace, it must be allocated a userspace stack.
117
    The distinction between the purely kernel threads and threads running also
117
    The distinction between the purely kernel threads and threads running also
118
    in userspace is made by refering to the former group as to kernel threads
118
    in userspace is made by refering to the former group as to kernel threads
119
    and to the latter group as to userspace threads. Both kernel and userspace
119
    and to the latter group as to userspace threads. Both kernel and userspace
120
    threads are visible to the scheduler and can become a subject of kernel
120
    threads are visible to the scheduler and can become a subject of kernel
121
    preemption and thread migration during times when preemption is
121
    preemption and thread migration during times when preemption is
122
    possible.</para>
122
    possible.</para>
123
 
123
 
124
    <formalpara>
124
    <formalpara>
125
      <title>Thread states</title>
125
      <title>Thread states</title>
126
 
126
 
127
      <para>In each moment, a thread exists in one of five possible thread
127
      <para>In each moment, a thread exists in one of six possible thread
128
      states. When the thread is created and first readied into the
128
      states. When the thread is created and first readied into the
129
      scheduler's run queues or when a thread is migrated to a new processor,
129
      scheduler's run queues or when a thread is migrated to a new processor,
130
      it is put into the <code>Entering</code> state. After some time, the
130
      it is put into the <constant>Entering</constant> state. After some time,
131
      scheduler picks up the thread and starts executing it. A thread being
131
      the scheduler picks up the thread and starts executing it. A thread
132
      currently executed on a processor is in the <code>Running</code> state.
132
      being currently executed on a processor is in the
133
      From there, the thread has three possibilities. It either runs until it
133
      <constant>Running</constant> state. From there, the thread has three
134
      is preemtped, in which case the state changes to <code>Ready</code>,
134
      possibilities. It either runs until it is preemtped, in which case the
-
 
135
      state changes to <constant>Ready</constant>, goes to the
135
      goes to the <code>Sleeping</code> state by going to sleep or enters the
136
      <constant>Sleeping</constant> state by going to sleep or enters the
136
      <code>Exiting</code> state when it reaches termination.<figure>
137
      <constant>Exiting</constant> state when it reaches termination. When the
-
 
138
      thread exits, its kernel structure usually stays in memory, until the
-
 
139
      thread is detached by another thread using <code>thread_detach</code>
-
 
140
      function. Terminated but undetached threads are in the
-
 
141
      <constant>Undead</constant> state. When the thread is detached or
-
 
142
      detaches itself during its life, it is destroyed in the
-
 
143
      <constant>Exiting</constant> state and the <constant>Undead</constant>
-
 
144
      state is not reached.<figure>
137
          <title>Transitions among thread states.</title>
145
          <title>Transitions among thread states.</title>
138
 
146
 
139
          <mediaobject id="thread_states" xreflabel="">
147
          <mediaobject id="thread_states" xreflabel="">
140
            <imageobject role="html">
148
            <imageobject role="html">
141
              <imagedata fileref="images/thread_states.png" format="PNG" />
149
              <imagedata fileref="images/thread_states.png" format="PNG" />
142
            </imageobject>
150
            </imageobject>
143
 
151
 
144
            <imageobject role="fop">
152
            <imageobject role="fop">
145
              <imagedata fileref="images.vector/thread_states.svg"
153
              <imagedata fileref="images.vector/thread_states.svg"
146
                         format="SVG" />
154
                         format="SVG" />
147
            </imageobject>
155
            </imageobject>
148
          </mediaobject>
156
          </mediaobject>
149
        </figure></para>
157
        </figure></para>
150
    </formalpara>
158
    </formalpara>
151
 
159
 
152
    <formalpara>
160
    <formalpara>
153
      <title>Pseudo threads</title>
161
      <title>Pseudo threads</title>
154
 
162
 
155
      <para>HelenOS userspace layer knows even smaller units of execution.
163
      <para>HelenOS userspace layer knows even smaller units of execution.
156
      Each userspace thread can make use of an arbitrary number of pseudo
164
      Each userspace thread can make use of an arbitrary number of pseudo
157
      threads. These pseudo threads have their own synchronous register
165
      threads. These pseudo threads have their own synchronous register
158
      context, userspace code and stack. They live their own life within the
166
      context, userspace code and stack. They live their own life within the
159
      userspace thread and the scheduler does not have any idea about them
167
      userspace thread and the scheduler does not have any idea about them
160
      because they are completely implemented by the userspace library. This
168
      because they are completely implemented by the userspace library. This
161
      implies several things:<itemizedlist>
169
      implies several things:<itemizedlist>
162
          <listitem>
170
          <listitem>
163
            <para>pseudothreads schedule themselves cooperatively within the
171
            <para>pseudothreads schedule themselves cooperatively within the
164
            time slice given to their userspace thread,</para>
172
            time slice given to their userspace thread,</para>
165
          </listitem>
173
          </listitem>
166
 
174
 
167
          <listitem>
175
          <listitem>
168
            <para>pseudothreads share FPU context of their containing thread
176
            <para>pseudothreads share FPU context of their containing thread
169
            and</para>
177
            and</para>
170
          </listitem>
178
          </listitem>
171
 
179
 
172
          <listitem>
180
          <listitem>
173
            <para>all pseudothreads of one userspace thread block when one of
181
            <para>all pseudothreads of one userspace thread block when one of
174
            them goes to sleep.</para>
182
            them goes to sleep.</para>
175
          </listitem>
183
          </listitem>
176
        </itemizedlist></para>
184
        </itemizedlist></para>
177
    </formalpara>
185
    </formalpara>
178
  </section>
186
  </section>
179
 
187
 
180
  <section>
188
  <section>
181
    <title>Scheduler</title>
189
    <title>Scheduler</title>
182
 
190
 
183
    <section>
191
    <section>
184
      <title>Run queues</title>
192
      <title>Run queues</title>
185
 
193
 
186
      <para>There is an array of several run queues on each processor. The
194
      <para>There is an array of several run queues on each processor. The
187
      current version of HelenOS uses 16 run queues implemented by 16 doubly
195
      current version of HelenOS uses 16 run queues implemented by 16 doubly
188
      linked lists. Each of the run queues is associated with thread priority.
196
      linked lists. Each of the run queues is associated with thread priority.
189
      The lower the run queue index in the array is, the higher is the
197
      The lower the run queue index in the array is, the higher is the
190
      priority of threads linked in that run queue and the shorter is the time
198
      priority of threads linked in that run queue and the shorter is the time
191
      in which those threads will execute. When kernel code wants to access
199
      in which those threads will execute. When kernel code wants to access
192
      the run queue, it must first acquire its lock.</para>
200
      the run queue, it must first acquire its lock.</para>
193
    </section>
201
    </section>
194
 
202
 
195
    <section>
203
    <section>
196
      <title>Scheduler operation</title>
204
      <title>Scheduler operation</title>
197
 
205
 
198
      <para>The scheduler is invoked either explicitly when a thread calls the
206
      <para>The scheduler is invoked either explicitly when a thread calls the
199
      <code>scheduler</code> function (e.g. goes to sleep or merely wants to
207
      <code>scheduler</code> function (e.g. goes to sleep or merely wants to
200
      relinquish the processor for a while) or implicitly on a periodic basis
208
      relinquish the processor for a while) or implicitly on a periodic basis
201
      when the generic clock interrupt preempts the current thread. After its
209
      when the generic clock interrupt preempts the current thread. After its
202
      invocation, the scheduler saves the synchronous register context of the
210
      invocation, the scheduler saves the synchronous register context of the
203
      current thread and switches to its private stack. Afterwards, a new
211
      current thread and switches to its private stack. Afterwards, a new
204
      thread is selected according to the scheduling policy. If there is no
212
      thread is selected according to the scheduling policy. If there is no
205
      suitable thread, the processor is idle and no thread executes on it.
213
      suitable thread, the processor is idle and no thread executes on it.
206
      Note that the act of switching to the private scheduler stack is
214
      Note that the act of switching to the private scheduler stack is
207
      essential. If the processor kept running using the stack of the
215
      essential. If the processor kept running using the stack of the
208
      preempted thread it could damage it because the old thread can be
216
      preempted thread it could damage it because the old thread can be
209
      migrated to another processor and scheduled there. In the worst case
217
      migrated to another processor and scheduled there. In the worst case
210
      scenario, two execution flows would be using the same stack.</para>
218
      scenario, two execution flows would be using the same stack.</para>
211
 
219
 
212
      <para>The scheduling policy is implemented in function
220
      <para>The scheduling policy is implemented in function
213
      <code>find_best_thread</code>. This function walks the processor run
221
      <code>find_best_thread</code>. This function walks the processor run
214
      queues from lower towards higher indices and looks for a thread. If the
222
      queues from lower towards higher indices and looks for a thread. If the
215
      visited run queue is empty, it simply searches the next run queue. If it
223
      visited run queue is empty, it simply searches the next run queue. If it
216
      is known in advance that there are no ready threads waiting for
224
      is known in advance that there are no ready threads waiting for
217
      execution, <code>find_best_thread</code> interruptibly halts the
225
      execution, <code>find_best_thread</code> interruptibly halts the
218
      processor or busy waits until some threads arrive. This process repeats
226
      processor or busy waits until some threads arrive. This process repeats
219
      until <code>find_best_thread</code> succeeds.</para>
227
      until <code>find_best_thread</code> succeeds.</para>
220
 
228
 
221
      <para>After the best thread is chosen, the scheduler switches to the
229
      <para>After the best thread is chosen, the scheduler switches to the
222
      thread's task and memory management context. Finally, the saved
230
      thread's task and memory management context. Finally, the saved
223
      synchronous register context is restored and the thread runs. Each
231
      synchronous register context is restored and the thread runs. Each
224
      scheduled thread is given a time slice depending on its priority (i.e.
232
      scheduled thread is given a time slice depending on its priority (i.e.
225
      run queue). The higher priority, the shorter timeslice. To summarize,
233
      run queue). The higher priority, the shorter timeslice. To summarize,
226
      this policy schedules threads with high priorities more frequently but
234
      this policy schedules threads with high priorities more frequently but
227
      gives them smaller time slices. On the other hand, lower priority
235
      gives them smaller time slices. On the other hand, lower priority
228
      threads are scheduled less frequently, but run for longer periods of
236
      threads are scheduled less frequently, but run for longer periods of
229
      time.</para>
237
      time.</para>
230
 
238
 
231
      <para>When a thread uses its entire time slice, it is preempted and put
239
      <para>When a thread uses its entire time slice, it is preempted and put
232
      back into the run queue that immediately follows the previous run queue
240
      back into the run queue that immediately follows the previous run queue
233
      from which the thread ran. Threads that are woken up from a sleep are
241
      from which the thread ran. Threads that are woken up from a sleep are
234
      put into the biggest priority run queue. Low priority threads are
242
      put into the biggest priority run queue. Low priority threads are
235
      therefore those that don't go to sleep so often and just occupy the
243
      therefore those that don't go to sleep so often and just occupy the
236
      processor.</para>
244
      processor.</para>
237
 
245
 
238
      <para>In order to avoid complete starvation of the low priority threads,
246
      <para>In order to avoid complete starvation of the low priority threads,
239
      from time to time, the scheduler will provide them with a bonus of one
247
      from time to time, the scheduler will provide them with a bonus of one
240
      point priority increase. In other words, the scheduler will now and then
248
      point priority increase. In other words, the scheduler will now and then
241
      move the entire run queues one level up.</para>
249
      move the entire run queues one level up.</para>
242
    </section>
250
    </section>
243
 
251
 
244
    <section>
252
    <section>
245
      <title>Processor load balancing</title>
253
      <title>Processor load balancing</title>
246
 
254
 
247
      <para>Normally, for the sake of cache locality, threads are scheduled on
255
      <para>Normally, for the sake of cache locality, threads are scheduled on
248
      one of the processors and don't leave it. Nevertheless, a situation in
256
      one of the processors and don't leave it. Nevertheless, a situation in
249
      which one processor is heavily overloaded while others sit idle can
257
      which one processor is heavily overloaded while others sit idle can
250
      occur. HelenOS deploys special kernel threads to help to mitigate this
258
      occur. HelenOS deploys special kernel threads to help to mitigate this
251
      problem. Each processor is associated with one load balancing thread
259
      problem. Each processor is associated with one load balancing thread
252
      called <code>kcpulb</code> that wakes up regularily to see whether its
260
      called <code>kcpulb</code> that wakes up regularily to see whether its
253
      processor is underbalanced or not. If yes, the thread attempts to
261
      processor is underbalanced or not. If yes, the thread attempts to
254
      migrate threads from other overloaded processors to its own processor's
262
      migrate threads from other overloaded processors to its own processor's
255
      run queues. When the job is done or there is no need for load balancing,
263
      run queues. When the job is done or there is no need for load balancing,
256
      the thread goes to sleep.</para>
264
      the thread goes to sleep.</para>
257
 
265
 
258
      <para>The balancing threads operate very gently and try to migrate low
266
      <para>The balancing threads operate very gently and try to migrate low
259
      priority threads first; one <code>kcpulb</code> never takes from one
267
      priority threads first; one <code>kcpulb</code> never takes from one
260
      processor twice in a row. The load balancing threads as well as threads
268
      processor twice in a row. The load balancing threads as well as threads
261
      that were just stolen cannot be migrated. The <code>kcpulb</code>
269
      that were just stolen cannot be migrated. The <code>kcpulb</code>
262
      threads are wired to their processors and cannot be migrated whatsoever.
270
      threads are wired to their processors and cannot be migrated whatsoever.
263
      The ordinary threads are protected only until they are
271
      The ordinary threads are protected only until they are
264
      rescheduled.</para>
272
      rescheduled.</para>
265
    </section>
273
    </section>
266
  </section>
274
  </section>
267
</chapter>
275
</chapter>