Subversion Repositories HelenOS-doc

Rev

Rev 131 | Details | Compare with Previous | Last modification | View Log | RSS feed

Rev Author Line No. Line
52 jermar 1
<?xml version="1.0" encoding="UTF-8"?>
55 jermar 2
<chapter id="time">
3
  <?dbhtml filename="time.html"?>
52 jermar 4
 
131 jermar 5
  <title>Time Management</title>
52 jermar 6
 
139 palkovsky 7
  <para>Time is one of the dimensions in which kernel as well as the whole
8
  system operates. It is of special importance to many kernel subsytems.
55 jermar 9
  Knowledge of time makes it possible for the scheduler to preemptively plan
10
  threads for execution. Different parts of the kernel can request execution
139 palkovsky 11
  of their callback function with a specified delay. A good example of such
55 jermar 12
  kernel code is the synchronization subsystem which uses this functionality
13
  to implement timeouting versions of synchronization primitives.</para>
14
 
15
  <section>
131 jermar 16
    <title>System Clock</title>
53 jermar 17
 
55 jermar 18
    <para>Every hardware architecture supported by HelenOS must support some
19
    kind of a device that can be programmed to yield periodic time signals
20
    (i.e. clock interrupts). Some architectures have external clock that is
21
    merely programmed by the kernel to interrupt the processor multiple times
22
    in a second. This is the case of ia32 and amd64 architectures<footnote>
23
        <para>When running in uniprocessor mode.</para>
24
      </footnote>, which use i8254 or a compatible chip to achieve the
25
    goal.</para>
26
 
27
    <para>Other architectures' processors typically contain two registers. The
28
    first register is usually called a compare or a match register and can be
29
    set to an arbitrary value by the operating system. The contents of the
30
    compare register then stays unaltered until it is written by the kernel
31
    again. The second register, often called a counter register, can be also
32
    written by the kernel, but the processor automatically increments it after
33
    every executed instruction or in some fixed relation to processor speed.
34
    The point is that a clock interrupt is generated whenever the values of
35
    the counter and the compare registers match. Sometimes, the scheme of two
36
    registers is modified so that only one register is needed. Such a
37
    register, called a decrementer, then counts towards zero and an interrupt
38
    is generated when zero is reached.</para>
39
 
40
    <para>In any case, the initial value of the decrementer or the initial
41
    difference between the counter and the compare registers, respectively,
42
    must be set accordingly to a known relation between the real time and the
43
    speed of the decrementer or the counter register, respectively.</para>
44
 
45
    <para>The rest of this section will, for the sake of clarity, focus on the
46
    two-register scheme. The decrementer scheme is very similar.</para>
47
 
58 jermar 48
    <para>The kernel must reinitialize one of the two registers after each
49
    clock interrupt in order to schedule next interrupt. However this step is
50
    tricky and must be done with caution. Imagine that the clock interrupt is
51
    masked either because the kernel is servicing another interrupt or because
52
    the processor locally disabled interrupts for a while. If the clock
139 palkovsky 53
    interrupt occurs during this period, it will be pending until the
54
    interrupts are enabled again. Theoretically, it could happen an arbitrary
55
    counter register ticks later. Which is worse, the ideal time period
56
    between two non-delayed clock interrupts can also elapse arbitrary number
57
    of times before the delayed interrupt gets serviced. The
58
    architecture-specific part of the clock interrupt driver must avoid time
59
    drifts caused by such behaviour by taking proactive
60
    counter-measures.</para>
55 jermar 61
 
62
    <para>Let us assume that the kernel wants each clock interrupt be
63
    generated every <constant>TICKCONST</constant> ticks. This value
64
    represents the ideal number of ticks between two non-delayed clock
65
    interrupts and has some known relation to real time. On each clock
66
    interrupt, the kernel computes and writes down the expected value of the
67
    counter register as it hopes to read it on the next clock interrupt. When
68
    that interrupt comes, the kernel reads the counter register again and
69
    compares it with the written down value. If the difference is smaller than
70
    or equal to <constant>TICKCONST</constant>, then the time drift is none or
71
    small and the next interrupt is scheduled earlier with a penalty of so
72
    many ticks as is the value of the difference. However, if the difference
73
    is bigger, then at least one clock signal was missed. In that case, the
74
    missed clock signal is remembered in the special counter. If there are
75
    more missed signals, each of them is recorded there. The next interrupt is
76
    scheduled with respect to the difference similarily to the former case.
77
    This time, the penalty is taken modulo <constant>TICKCONST</constant>. The
78
    effect of missed clock signals is remedied in the generic clock interrupt
79
    handler.</para>
80
  </section>
81
 
82
  <section>
83
    <title>Timeouts</title>
84
 
85
    <para>Kernel subsystems can register a callback function to be executed
86
    with a specified delay. Such a registration is represented by a kernel
87
    structure called <classname>timeout</classname>. Timeouts are registered
131 jermar 88
    via <code>timeout_register</code> function. This function takes a pointer
55 jermar 89
    to a timeout structure, a callback function, a parameter of the callback
90
    function and a delay in microseconds as parameters. After the structure is
91
    initialized with all these values, it is sorted into the processor's list
56 jermar 92
    of active timeouts, according to the number of clock interrupts remaining
93
    to their expiration and relatively to already listed timeouts.</para>
55 jermar 94
 
131 jermar 95
    <para>Timeouts can be unregistered via <code>timeout_unregister</code>.
96
    This function can, as opposed to <code>timeout_register</code>, fail when
55 jermar 97
    it is too late to remove the timeout from the list of active
98
    timeouts.</para>
99
 
100
    <para>Timeouts are nearing their expiration in the list of active timeouts
101
    which exists on every processor in the system. The expiration counters are
102
    decremented on each clock interrupt by the generic clock interrupt
103
    handler. Due to the relative ordering of timeouts in the list, it is
104
    sufficient to decrement expiration counter only of the first timeout in
105
    the list. Timeouts with expiration counter equal to zero are removed from
106
    the list and their callback function is called with respective
107
    parameter.</para>
108
  </section>
109
 
110
  <section>
131 jermar 111
    <title>Generic Clock Interrupt Handler</title>
55 jermar 112
 
113
    <para>On each clock interrupt, the architecture specific part of the clock
114
    interrupt handler makes a call to the generic clock interrupt handler
131 jermar 115
    implemented by the <code>clock</code> function. The generic handler takes
55 jermar 116
    care of several mission critical goals:</para>
117
 
118
    <itemizedlist>
119
      <listitem>
120
        <para>expiration of timeouts,</para>
121
      </listitem>
122
 
123
      <listitem>
124
        <para>updating time of the day counters for userspace and</para>
125
      </listitem>
126
 
127
      <listitem>
128
        <para>preemption of threads.</para>
129
      </listitem>
130
    </itemizedlist>
131
 
131 jermar 132
    <para>The <code>clock</code> function checks for expired timeouts and
56 jermar 133
    decrements unexpired timeout expiration counters exactly one more times
134
    than is the number of missed clock signals (i.e. at least once and
135
    possibly more times, depending on the missed clock signals counter). The
136
    time of the day counters are also updated one more times than is the
137
    number of missed clock signals. And finally, the remaining timeslice of
138
    the running thread is decremented with respect to this counter as well. By
139
    considering its value, the kernel performs actions that would otherwise be
140
    lost due to an occasional excessive time drift described in previous
141
    paragraphs.</para>
55 jermar 142
  </section>
56 jermar 143
 
144
  <section>
131 jermar 145
    <title>Time Source for Userspace</title>
56 jermar 146
 
147
    <para>In HelenOS, userspace tasks don't communicate with the kernel in
85 palkovsky 148
    order to read the system time. Instead, a mechanism that shares kernel
149
    time of the the day counters with userspace address spaces is deployed. On
150
    the kernel side, during system initialization, HelenOS allocates a frame
151
    of physical memory and stores the time of the day counters there. The
152
    counters have the following structure:</para>
56 jermar 153
 
154
    <itemizedlist>
155
      <listitem>
156
        <para>first 32-bit counter for seconds,</para>
157
      </listitem>
158
 
159
      <listitem>
160
        <para>32-bit counter for microseconds and</para>
161
      </listitem>
162
 
163
      <listitem>
164
        <para>second 32-bit counter for seconds.</para>
165
      </listitem>
166
    </itemizedlist>
167
 
168
    <para>One of the userspace tasks with capabilities of memory manager (e.g.
85 palkovsky 169
    ns) asks the kernel to map this frame into its address space. Other
170
    non-privileged tasks then use IPC to receive read-only share of this
171
    memory. Reading time in a userspace task is therefore just a matter of
172
    reading memory.</para>
56 jermar 173
 
174
    <para>There are two interesting points about this. First, the counters are
175
    32-bit even on 64-bit machines. The goal is to provide subsecond precision
176
    with the possibility to span roughly 136 years. Note that a single 64-bit
177
    microsecond counter could not be usually read atomically on 32-bit
85 palkovsky 178
    platforms. Unfortunately, on 32-bit platforms it is usually impossible to
179
    read atomically two 32-bit counters either. However, a generic protocol is
180
    used to guarantee that sequentially read times will create a
181
    non-decreasing sequence.</para>
56 jermar 182
 
85 palkovsky 183
    <para>The problematic part is incrementing seconds counter and clearing
184
    microseconds counter together once every second. Seconds must be
185
    incremented and microseconds must be reset. However, without any
186
    synchronization, the two kernel stores and the two userspace reads can
187
    arbitrarily interleave. Furthemore, the reader has no chance to detect
188
    that the counters were updated only paritally. Therefore three counters
189
    are used in HelenOS.</para>
56 jermar 190
 
191
    <para>If seconds need to be updated, the kernel increments the first
192
    second counter, issues a write memory barrier operation, updates the
193
    microsecond counter, issues another write memory barrier operation and
85 palkovsky 194
    increments the second second counter. When only microseconds needs to be
56 jermar 195
    updated, no special action is taken by the kernel. On the other hand, the
85 palkovsky 196
    userspace task must always read all three counters in reversed order. A
197
    read memory barrier operation must be issued between each two reads. A
56 jermar 198
    non-atomic read is detected when the two second counters differ. The
85 palkovsky 199
    userspace library solves this situation by returning higher of them with
200
    microseconds set to zero.</para>
56 jermar 201
  </section>
52 jermar 202
</chapter>