<?xml version="1.0" encoding="UTF-8"?>
<chapter id="scheduling">
  <?dbhtml filename="scheduling.html"?>

  <title>Scheduling</title>

  <para>One of the key aims of the operating system is to create and support
  the impression that several activities are executing concurrently. This is
  true of both uniprocessor and multiprocessor systems. In the case of
  multiprocessor systems, the activities are truly happening in parallel. The
  scheduler helps to materialize this impression by scheduling threads on as
  many processors as possible and, when this strategy reaches its limits, by
  quickly switching among the threads executing on a single processor.</para>

  <section>
    <title>Contexts</title>

    <para>The term <emphasis>context</emphasis> refers to the set of processor
    resources that define the current state of the computation or the
    environment. The kernel understands it in several more or less narrow
    senses:</para>

    <itemizedlist>
      <listitem>
        <para>synchronous register context,</para>
      </listitem>

      <listitem>
        <para>asynchronous register context,</para>
      </listitem>

      <listitem>
        <para>FPU context and</para>
      </listitem>

      <listitem>
        <para>memory management context.</para>
      </listitem>
    </itemizedlist>

    <para>The most narrow sense refers to the synchronous register
    context. It includes all the preserved registers as defined by the
    architecture. To highlight some, the program counter and the stack pointer
    take part in the synchronous register context. These registers must be
    preserved across a procedure call and during synchronous context
    switches.</para>
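
    <para>As an illustration, a synchronous register context structure might
    look as follows on a hypothetical architecture. The names and the number
    of preserved registers below are made up for this sketch; the real layout
    is architecture-specific:</para>

    <programlisting language="c"><![CDATA[
/* Illustrative only: the preserved registers of a hypothetical
 * architecture, saved and restored on a synchronous context switch. */
#include <stdint.h>

typedef struct {
    uintptr_t pc;      /* program counter (return address) */
    uintptr_t sp;      /* stack pointer */
    uintptr_t s[12];   /* callee-saved (preserved) registers */
} context_t;
]]></programlisting>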

    <para>The next type of context understood by the kernel is the
    asynchronous register context. On an interrupt, the interrupted execution
    flow's state must be guaranteed to be eventually completely restored.
    Therefore the interrupt context includes, among other things, the scratch
    registers as defined by the architecture. As a special optimization, and
    if certain conditions are met, it need not include the architecture's
    preserved registers. The condition is that the low-level assembly language
    interrupt routines do not modify the preserved registers themselves. The
    handlers usually call a higher-level C routine, and the preserved
    registers are then saved on the stack by the compiler-generated code of
    the higher-level function. In HelenOS, several architectures can be
    compiled with this optimization.</para>

    <para>Although the kernel does not do any floating point
    arithmetic<footnote>
        <para>Some architectures (e.g. ia64) inevitably use a fixed set of
        floating point registers to carry out their normal operations.</para>
      </footnote>, it must protect the FPU context of userspace threads
    against destruction by other threads. Moreover, only a fraction of
    userspace programs use the floating point unit. HelenOS therefore contains
    a generic framework for switching the FPU context only when the switch is
    forced (i.e. a thread uses a floating point instruction and its FPU
    context is not loaded in the processor).</para>
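
    <para>The essence of such lazy switching can be sketched as follows. The
    names here (<code>fpu_lazy_switch()</code>, <code>fpu_owner</code>) are
    hypothetical stand-ins rather than the actual HelenOS interface, and the
    variable <code>fpu_hw</code> stands in for the real FPU register
    file:</para>

    <programlisting language="c"><![CDATA[
#include <stddef.h>

typedef struct { double regs[8]; } fpu_context_t;

typedef struct thread {
    fpu_context_t fpu_ctx;       /* saved FPU state of the thread */
} thread_t;

static thread_t *fpu_owner;      /* thread whose state is in the FPU */
static fpu_context_t fpu_hw;     /* stand-in for the real FPU */

/* Called from the trap raised by the first floating point instruction
 * of a thread whose FPU context is not loaded in the processor. */
void fpu_lazy_switch(thread_t *current)
{
    if (fpu_owner == current)
        return;                        /* context already loaded */
    if (fpu_owner != NULL)
        fpu_owner->fpu_ctx = fpu_hw;   /* save the old owner's state */
    fpu_hw = current->fpu_ctx;         /* restore the current thread's state */
    fpu_owner = current;
}
]]></programlisting>

    <para>Until a thread actually touches the floating point unit, its FPU
    context is never saved or restored, which is what makes the scheme
    cheap for the majority of programs.</para>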

    <para>The last member of the context family is the memory management
    context. It includes the memory management registers that identify address
    spaces at the hardware level (i.e. ASIDs and page table pointers).</para>

    <section>
      <title>Synchronous Context Switches</title>

      <para>The scheduler, as well as other parts of the kernel, makes heavy
      use of synchronous context switches, because they are a natural vehicle
      not only for changes in the control flow, but also for switching between
      two kernel stacks. Two functions figure in a synchronous context switch
      implementation: <code>context_save()</code> and
      <code>context_restore()</code>. Note that these two functions break the
      natural perception of a linear C code execution flow starting at the
      function's entry point and ending at one of the function's exit
      points.</para>

      <para>When the <code>context_save()</code> function is called, the
      synchronous context is saved in a memory structure passed to it. After
      executing <code>context_save()</code>, the caller receives 1 as the
      return value. The execution of instructions then continues normally
      until <code>context_restore()</code> is called. To its caller,
      <code>context_restore()</code> appears never to return<footnote>
          <para>Which might be a source of problems with variable liveness
          after <code>context_restore()</code>.</para>
        </footnote>. Instead, the synchronous register context saved in the
      memory structure passed to <code>context_restore()</code> is restored,
      thus transferring the control flow to the place of occurrence of the
      corresponding call to <code>context_save()</code>. From the perspective
      of the caller of the corresponding <code>context_save()</code>, it looks
      like a return from <code>context_save()</code>. However, this time a
      return value of 0 is returned.</para>
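
      <para>The control transfer performed by this pair of functions can be
      demonstrated in portable C with the standard <code>setjmp()</code> and
      <code>longjmp()</code> functions. Note that this is only an analogy, not
      the kernel implementation, and that the return value convention is
      inverted: <code>setjmp()</code> returns 0 on the first return and a
      non-zero value after <code>longjmp()</code>:</para>

      <programlisting language="c"><![CDATA[
#include <setjmp.h>

static jmp_buf ctx;

int demo(void)
{
    /* volatile, because of the liveness problem mentioned in the
     * footnote: ordinary locals are indeterminate after longjmp(). */
    volatile int steps = 0;

    if (setjmp(ctx) == 0) {   /* analogous to context_save() */
        steps = steps + 1;    /* execution continues normally */
        longjmp(ctx, 1);      /* analogous to context_restore(); never returns */
    }

    /* control resumes here, as if setjmp() had returned a second time */
    steps = steps + 1;
    return steps;             /* 2: both branches were executed */
}
]]></programlisting>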
    </section>
  </section>

  <section>
    <title>Threads</title>

    <para>A thread is the basic executable entity with some code and a stack.
    While the code, implemented by a C language function, can be shared by
    several threads, the stack is always private to each instance of the
    thread. Each thread belongs to exactly one task, through which it shares
    an address space with its sibling threads. Threads that execute purely in
    the kernel don't have any userspace memory allocated. However, when a
    thread has ambitions to run in userspace, it must be allocated a userspace
    stack. The distinction between the two groups is made by referring to the
    former as kernel threads and to the latter as userspace threads. Both
    kernel and userspace threads are visible to the scheduler and can become
    subject to kernel preemption and thread migration anytime preemption is
    not disabled.</para>

    <formalpara>
      <title>Thread States</title>

      <para>At any given moment, a thread exists in one of six possible
      states. When the thread is created and first inserted into the
      scheduler's run queues, or when a thread is migrated to a new processor,
      it is put into the <constant>Entering</constant> state. After some time
      elapses, the scheduler picks the thread up and starts executing it. A
      thread currently being executed on a processor is in the
      <constant>Running</constant> state. From there, the thread has three
      possibilities. It either runs until it is preempted, in which case its
      state changes to <constant>Ready</constant>, goes to the
      <constant>Sleeping</constant> state by going to sleep, or enters the
      <constant>Exiting</constant> state when it reaches termination. When the
      thread exits, its kernel structure usually stays in memory until the
      thread is detached by another thread using the
      <code>thread_detach()</code> function. Terminated but undetached threads
      are in the <constant>Lingering</constant> state. When the thread is
      detached, or detaches itself during its life, it is destroyed in the
      <constant>Exiting</constant> state and the
      <constant>Lingering</constant> state is never reached.<figure float="1">
          <title>Transitions among thread states.</title>

          <mediaobject id="thread_states" xreflabel="">
            <imageobject role="pdf">
              <imagedata fileref="images/thread_states.pdf" format="PDF" />
            </imageobject>

            <imageobject role="html">
              <imagedata fileref="images/thread_states.png" format="PNG" />
            </imageobject>

            <imageobject role="fop">
              <imagedata fileref="images/thread_states.svg" format="SVG" />
            </imageobject>
          </mediaobject>
        </figure></para>
    </formalpara>
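
    <para>The states and the transitions described above can be modeled by a
    simple validity check. The state names follow the text; the transition
    table itself is an illustration distilled from this section, not kernel
    code:</para>

    <programlisting language="c"><![CDATA[
#include <stdbool.h>

typedef enum {
    Entering, Ready, Running, Sleeping, Exiting, Lingering
} thread_state_t;

bool transition_valid(thread_state_t from, thread_state_t to)
{
    switch (from) {
    case Entering:
        return to == Running;                   /* picked up by the scheduler */
    case Ready:
        return to == Running || to == Entering; /* run, or migrate to a new CPU */
    case Running:
        return to == Ready || to == Sleeping || to == Exiting;
    case Sleeping:
        return to == Ready;                     /* woken up */
    case Exiting:
        return to == Lingering;                 /* terminated, not yet detached */
    default:
        return false;                           /* Lingering is terminal */
    }
}
]]></programlisting>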

    <formalpara>
      <title>Fibrils</title>

      <para>The HelenOS userspace layer knows even smaller units of execution.
      Each userspace thread can make use of an arbitrary number of fibrils.
      These fibrils have their own synchronous register context, userspace
      code and stack. They live their own life within the userspace thread and
      the scheduler has no knowledge of them, because they are completely
      implemented by the userspace library. This implies several
      things:<itemizedlist>
          <listitem>
            <para>fibrils schedule themselves cooperatively within the time
            slice given to their userspace thread,</para>
          </listitem>

          <listitem>
            <para>fibrils share the FPU context of their containing thread
            and</para>
          </listitem>

          <listitem>
            <para>all fibrils of one userspace thread block when one of them
            goes to sleep.</para>
          </listitem>
        </itemizedlist></para>
    </formalpara>
  </section>

  <section>
    <title>Scheduler</title>

    <section>
      <title>Run Queues</title>

      <para>There is an array of several run queues on each processor. The
      current version of HelenOS uses 16 run queues implemented by 16 doubly
      linked lists. Each of the run queues is associated with a thread
      priority. The lower the run queue index in the array, the higher the
      priority of the threads linked in that run queue and the shorter the
      time in which those threads will execute. When kernel code wants to
      access a run queue, it must first acquire its lock.</para>
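
      <para>The structure described above might be modeled as follows. The
      <code>thread_t</code> and locking details are simplified stand-ins: the
      real kernel uses its own doubly linked list and spinlock
      primitives:</para>

      <programlisting language="c"><![CDATA[
#include <stddef.h>

#define RQ_COUNT 16              /* index 0 = highest priority */

typedef struct thread {
    struct thread *next;         /* doubly linked in the kernel; singly here */
    int priority;                /* index of the thread's run queue */
} thread_t;

typedef struct {
    thread_t *head;              /* guarded by a per-queue lock in the kernel */
    size_t n;                    /* number of ready threads in this queue */
} run_queue_t;

static run_queue_t rq[RQ_COUNT]; /* one such array exists per processor */

/* Insert a ready thread into the run queue given by its priority. */
void rq_insert(thread_t *t)
{
    run_queue_t *q = &rq[t->priority];
    t->next = q->head;
    q->head = t;
    q->n++;
}
]]></programlisting>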
    </section>

    <section>
      <title>Scheduler Operation</title>

      <para>The scheduler is invoked either explicitly, when a thread calls
      the <code>scheduler()</code> function (e.g. goes to sleep or merely
      wants to relinquish the processor for a while), or implicitly on a
      periodic basis, when the generic clock interrupt preempts the current
      thread. After its invocation, the scheduler saves the synchronous
      register context of the current thread and switches to its private
      stack. Afterwards, a new thread is selected according to the scheduling
      policy. If there is no suitable thread, the processor remains idle and
      no thread executes on it. Note that the act of switching to the private
      scheduler stack is essential. If the processor kept running on the stack
      of the preempted thread, it could corrupt it, because the old thread can
      be migrated to another processor and scheduled there. In the worst case
      scenario, two execution flows would be using the same stack.</para>

      <para>The scheduling policy is implemented in the function
      <code>find_best_thread()</code>. This function walks the processor's run
      queues from lower towards higher indices, looking for a thread. If the
      visited run queue is empty, it simply searches the next run queue. If it
      is known in advance that there are no ready threads waiting for
      execution, <code>find_best_thread()</code> interruptibly halts the
      processor or busy waits until some threads arrive. This process repeats
      until <code>find_best_thread()</code> succeeds.</para>
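
      <para>The walk itself can be sketched over a plain array of queue
      lengths; <code>best_queue()</code> is a hypothetical name for this
      sketch, and in the kernel each visited queue is locked before it is
      examined:</para>

      <programlisting language="c"><![CDATA[
#define RQ_COUNT 16

/* Return the index of the first (highest-priority) non-empty run queue,
 * or -1 when all queues are empty and the processor would halt
 * interruptibly or busy wait. */
int best_queue(const int rq_len[RQ_COUNT])
{
    for (int i = 0; i < RQ_COUNT; i++)
        if (rq_len[i] > 0)
            return i;
    return -1;
}
]]></programlisting>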

      <para>After the best thread is chosen, the scheduler switches to the
      thread's task and memory management context. Finally, the saved
      synchronous register context is restored and the thread runs. Each
      scheduled thread is given a time slice depending on its priority (i.e.
      its run queue). The higher the priority, the shorter the time slice. To
      summarize, this policy schedules threads with high priorities more
      frequently, but gives them smaller time slices. On the other hand, lower
      priority threads are scheduled less frequently, but run for longer
      periods of time.</para>

      <para>When a thread uses up its entire time slice, it is preempted and
      put back into the run queue that immediately follows the run queue from
      which it ran. Threads that are woken up from a sleep are put into the
      highest priority run queue. Low priority threads are therefore those
      that don't go to sleep very often and just occupy the
      processor.</para>
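
      <para>Taken together, the time slice and requeue rules can be captured
      in three small helper functions; the names and the time slice formula
      below are illustrative, not taken from the kernel sources:</para>

      <programlisting language="c"><![CDATA[
#define RQ_COUNT 16              /* index 0 = highest priority */

/* Higher priority (lower index) means a shorter time slice; the
 * linear formula is made up for illustration. */
int timeslice_us(int priority)
{
    return 10000 * (priority + 1);
}

/* A preempted thread drops into the immediately following
 * (lower-priority) run queue, bottoming out at the last one. */
int requeue_after_preemption(int priority)
{
    return (priority < RQ_COUNT - 1) ? priority + 1 : priority;
}

/* A thread woken up from a sleep goes to the highest-priority queue. */
int requeue_after_wakeup(void)
{
    return 0;
}
]]></programlisting>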

      <para>In order to avoid complete starvation of the low priority threads,
      the scheduler will, from time to time, provide them with a bonus of a
      one-point priority increase. In other words, the scheduler will now and
      then move entire run queues one level up.</para>
    </section>

    <section>
      <title>Processor Load Balancing</title>

      <para>Normally, for the sake of cache locality, threads are scheduled on
      one of the processors and don't leave it. Nevertheless, a situation in
      which one processor is heavily overloaded while others sit idle can
      occur. HelenOS deploys special kernel threads to help mitigate this
      problem. Each processor is associated with one load balancing thread
      called <code>kcpulb</code>, which wakes up regularly to see whether its
      processor is underbalanced or not. If it is, the thread attempts to
      migrate threads from other, overloaded processors to its own processor's
      run queues. When the job is done, or there is no need for load
      balancing, the thread goes back to sleep.</para>

      <para>The balancing threads operate very gently and try to migrate low
      priority threads first; one <code>kcpulb</code> never takes threads from
      the same processor twice in a row. Neither the load balancing threads
      nor freshly stolen threads can be migrated: the <code>kcpulb</code>
      threads are wired to their processors and cannot be migrated whatsoever,
      while ordinary threads are protected only until they are
      rescheduled.</para>
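
      <para>The decision whether a processor needs threads migrated to it can
      be illustrated by comparing its number of ready threads with the system
      average. The function below is a sketch of this idea under assumed
      names, not the actual <code>kcpulb</code> criterion:</para>

      <programlisting language="c"><![CDATA[
#include <stdbool.h>

/* Is this processor underbalanced, i.e. does it have fewer ready
 * threads than the average processor in the system? */
bool needs_balancing(int my_ready, int total_ready, int cpus)
{
    int average = total_ready / cpus;
    return my_ready < average;
}
]]></programlisting>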
    </section>
  </section>
</chapter>