Subversion Repositories HelenOS-doc

Rev

Rev 55 | Rev 85 | Go to most recent revision | Only display areas with differences | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed

Rev 55 Rev 56
1
<?xml version="1.0" encoding="UTF-8"?>
1
<?xml version="1.0" encoding="UTF-8"?>
2
<chapter id="time">
2
<chapter id="time">
3
  <?dbhtml filename="time.html"?>
3
  <?dbhtml filename="time.html"?>
4
 
4
 
5
  <title>Time management</title>
5
  <title>Time management</title>
6
 
6
 
7
  <para>Time is one of the dimensions in which kernel, as well as the whole
7
  <para>Time is one of the dimensions in which kernel, as well as the whole
8
  system, operates. It is of special importance to many kernel subsytems.
8
  system, operates. It is of special importance to many kernel subsytems.
9
  Knowledge of time makes it possible for the scheduler to preemptively plan
9
  Knowledge of time makes it possible for the scheduler to preemptively plan
10
  threads for execution. Different parts of the kernel can request execution
10
  threads for execution. Different parts of the kernel can request execution
11
  of their callback function with some specified delay. A good example of such
11
  of their callback function with some specified delay. A good example of such
12
  kernel code is the synchronization subsystem which uses this functionality
12
  kernel code is the synchronization subsystem which uses this functionality
13
  to implement timeouting versions of synchronization primitives.</para>
13
  to implement timeouting versions of synchronization primitives.</para>
14
 
14
 
15
  <section>
15
  <section>
16
    <title>System clock</title>
16
    <title>System clock</title>
17
 
17
 
18
    <para>Every hardware architecture supported by HelenOS must support some
18
    <para>Every hardware architecture supported by HelenOS must support some
19
    kind of a device that can be programmed to yield periodic time signals
19
    kind of a device that can be programmed to yield periodic time signals
20
    (i.e. clock interrupts). Some architectures have external clock that is
20
    (i.e. clock interrupts). Some architectures have external clock that is
21
    merely programmed by the kernel to interrupt the processor multiple times
21
    merely programmed by the kernel to interrupt the processor multiple times
22
    in a second. This is the case of ia32 and amd64 architectures<footnote>
22
    in a second. This is the case of ia32 and amd64 architectures<footnote>
23
        <para>When running in uniprocessor mode.</para>
23
        <para>When running in uniprocessor mode.</para>
24
      </footnote>, which use i8254 or a compatible chip to achieve the
24
      </footnote>, which use i8254 or a compatible chip to achieve the
25
    goal.</para>
25
    goal.</para>
26
 
26
 
27
    <para>Other architectures' processors typically contain two registers. The
27
    <para>Other architectures' processors typically contain two registers. The
28
    first register is usually called a compare or a match register and can be
28
    first register is usually called a compare or a match register and can be
29
    set to an arbitrary value by the operating system. The contents of the
29
    set to an arbitrary value by the operating system. The contents of the
30
    compare register then stays unaltered until it is written by the kernel
30
    compare register then stays unaltered until it is written by the kernel
31
    again. The second register, often called a counter register, can be also
31
    again. The second register, often called a counter register, can be also
32
    written by the kernel, but the processor automatically increments it after
32
    written by the kernel, but the processor automatically increments it after
33
    every executed instruction or in some fixed relation to processor speed.
33
    every executed instruction or in some fixed relation to processor speed.
34
    The point is that a clock interrupt is generated whenever the values of
34
    The point is that a clock interrupt is generated whenever the values of
35
    the counter and the compare registers match. Sometimes, the scheme of two
35
    the counter and the compare registers match. Sometimes, the scheme of two
36
    registers is modified so that only one register is needed. Such a
36
    registers is modified so that only one register is needed. Such a
37
    register, called a decrementer, then counts towards zero and an interrupt
37
    register, called a decrementer, then counts towards zero and an interrupt
38
    is generated when zero is reached.</para>
38
    is generated when zero is reached.</para>
39
 
39
 
40
    <para>In any case, the initial value of the decrementer or the initial
40
    <para>In any case, the initial value of the decrementer or the initial
41
    difference between the counter and the compare registers, respectively,
41
    difference between the counter and the compare registers, respectively,
42
    must be set accordingly to a known relation between the real time and the
42
    must be set accordingly to a known relation between the real time and the
43
    speed of the decrementer or the counter register, respectively.</para>
43
    speed of the decrementer or the counter register, respectively.</para>
44
 
44
 
45
    <para>The rest of this section will, for the sake of clarity, focus on the
45
    <para>The rest of this section will, for the sake of clarity, focus on the
46
    two-register scheme. The decrementer scheme is very similar.</para>
46
    two-register scheme. The decrementer scheme is very similar.</para>
47
 
47
 
48
    <para>The kernel must reinitialize the counter registers after each clock
48
    <para>The kernel must reinitialize the counter registers after each clock
49
    interrupt in order to schedule next interrupt. However this step is tricky
49
    interrupt in order to schedule next interrupt. However this step is tricky
50
    and must be done with caution. Imagine that the clock interrupt is masked
50
    and must be done with caution. Imagine that the clock interrupt is masked
51
    either because the kernel is servicing another interrupt or because the
51
    either because the kernel is servicing another interrupt or because the
52
    processor locally disabled interrupts for a while. If the clock interrupt
52
    processor locally disabled interrupts for a while. If the clock interrupt
53
    occurs during this period, it will be pending until interrupts are enabled
53
    occurs during this period, it will be pending until interrupts are enabled
54
    again. In theory, that could happen arbitrary counter register ticks
54
    again. In theory, that could happen arbitrary counter register ticks
55
    later. Which is worse, the ideal time period between two non-delayed clock
55
    later. Which is worse, the ideal time period between two non-delayed clock
56
    interrupts can also elapse arbitrary number of times before the delayed
56
    interrupts can also elapse arbitrary number of times before the delayed
57
    interrupt gets serviced. The architecture-specific part of the clock
57
    interrupt gets serviced. The architecture-specific part of the clock
58
    interrupt driver must avoid time drifts caused by this by taking proactive
58
    interrupt driver must avoid time drifts caused by this by taking proactive
59
    counter-measures.</para>
59
    counter-measures.</para>
60
 
60
 
61
    <para>Let us assume that the kernel wants each clock interrupt be
61
    <para>Let us assume that the kernel wants each clock interrupt be
62
    generated every <constant>TICKCONST</constant> ticks. This value
62
    generated every <constant>TICKCONST</constant> ticks. This value
63
    represents the ideal number of ticks between two non-delayed clock
63
    represents the ideal number of ticks between two non-delayed clock
64
    interrupts and has some known relation to real time. On each clock
64
    interrupts and has some known relation to real time. On each clock
65
    interrupt, the kernel computes and writes down the expected value of the
65
    interrupt, the kernel computes and writes down the expected value of the
66
    counter register as it hopes to read it on the next clock interrupt. When
66
    counter register as it hopes to read it on the next clock interrupt. When
67
    that interrupt comes, the kernel reads the counter register again and
67
    that interrupt comes, the kernel reads the counter register again and
68
    compares it with the written down value. If the difference is smaller than
68
    compares it with the written down value. If the difference is smaller than
69
    or equal to <constant>TICKCONST</constant>, then the time drift is none or
69
    or equal to <constant>TICKCONST</constant>, then the time drift is none or
70
    small and the next interrupt is scheduled earlier with a penalty of so
70
    small and the next interrupt is scheduled earlier with a penalty of so
71
    many ticks as is the value of the difference. However, if the difference
71
    many ticks as is the value of the difference. However, if the difference
72
    is bigger, then at least one clock signal was missed. In that case, the
72
    is bigger, then at least one clock signal was missed. In that case, the
73
    missed clock signal is remembered in the special counter. If there are
73
    missed clock signal is remembered in the special counter. If there are
74
    more missed signals, each of them is recorded there. The next interrupt is
74
    more missed signals, each of them is recorded there. The next interrupt is
75
    scheduled with respect to the difference similarily to the former case.
75
    scheduled with respect to the difference similarily to the former case.
76
    This time, the penalty is taken modulo <constant>TICKCONST</constant>. The
76
    This time, the penalty is taken modulo <constant>TICKCONST</constant>. The
77
    effect of missed clock signals is remedied in the generic clock interrupt
77
    effect of missed clock signals is remedied in the generic clock interrupt
78
    handler.</para>
78
    handler.</para>
79
  </section>
79
  </section>
80
 
80
 
81
  <section>
81
  <section>
82
    <title>Timeouts</title>
82
    <title>Timeouts</title>
83
 
83
 
84
    <para>Kernel subsystems can register a callback function to be executed
84
    <para>Kernel subsystems can register a callback function to be executed
85
    with a specified delay. Such a registration is represented by a kernel
85
    with a specified delay. Such a registration is represented by a kernel
86
    structure called <classname>timeout</classname>. Timeouts are registered
86
    structure called <classname>timeout</classname>. Timeouts are registered
87
    via <code>timeout_register</code> function. This function takes a pointer
87
    via <code>timeout_register</code> function. This function takes a pointer
88
    to a timeout structure, a callback function, a parameter of the callback
88
    to a timeout structure, a callback function, a parameter of the callback
89
    function and a delay in microseconds as parameters. After the structure is
89
    function and a delay in microseconds as parameters. After the structure is
90
    initialized with all these values, it is sorted into the processor's list
90
    initialized with all these values, it is sorted into the processor's list
91
    of active timeouts.Timeouts are sorted in this list according to the
91
    of active timeouts, according to the number of clock interrupts remaining
92
    number of clock interrupts remaining to their expiration and relative to
92
    to their expiration and relatively to already listed timeouts.</para>
93
    each other. </para>
-
 
94
 
93
 
95
    <para>Timeouts can be unregistered via <code>timeout_unregister</code>.
94
    <para>Timeouts can be unregistered via <code>timeout_unregister</code>.
96
    This function can, as opposed to <code>timeout_register</code>, fail when
95
    This function can, as opposed to <code>timeout_register</code>, fail when
97
    it is too late to remove the timeout from the list of active
96
    it is too late to remove the timeout from the list of active
98
    timeouts.</para>
97
    timeouts.</para>
99
 
98
 
100
    <para>Timeouts are nearing their expiration in the list of active timeouts
99
    <para>Timeouts are nearing their expiration in the list of active timeouts
101
    which exists on every processor in the system. The expiration counters are
100
    which exists on every processor in the system. The expiration counters are
102
    decremented on each clock interrupt by the generic clock interrupt
101
    decremented on each clock interrupt by the generic clock interrupt
103
    handler. Due to the relative ordering of timeouts in the list, it is
102
    handler. Due to the relative ordering of timeouts in the list, it is
104
    sufficient to decrement expiration counter only of the first timeout in
103
    sufficient to decrement expiration counter only of the first timeout in
105
    the list. Timeouts with expiration counter equal to zero are removed from
104
    the list. Timeouts with expiration counter equal to zero are removed from
106
    the list and their callback function is called with respective
105
    the list and their callback function is called with respective
107
    parameter.</para>
106
    parameter.</para>
108
  </section>
107
  </section>
109
 
108
 
110
  <section>
109
  <section>
111
    <title>Generic clock interrupt handler</title>
110
    <title>Generic clock interrupt handler</title>
112
 
111
 
113
    <para>On each clock interrupt, the architecture specific part of the clock
112
    <para>On each clock interrupt, the architecture specific part of the clock
114
    interrupt handler makes a call to the generic clock interrupt handler
113
    interrupt handler makes a call to the generic clock interrupt handler
115
    implemented by the <code>clock</code> function. The generic handler takes
114
    implemented by the <code>clock</code> function. The generic handler takes
116
    care of several mission critical goals:</para>
115
    care of several mission critical goals:</para>
117
 
116
 
118
    <itemizedlist>
117
    <itemizedlist>
119
      <listitem>
118
      <listitem>
120
        <para>expiration of timeouts,</para>
119
        <para>expiration of timeouts,</para>
121
      </listitem>
120
      </listitem>
122
 
121
 
123
      <listitem>
122
      <listitem>
124
        <para>updating time of the day counters for userspace and</para>
123
        <para>updating time of the day counters for userspace and</para>
125
      </listitem>
124
      </listitem>
126
 
125
 
127
      <listitem>
126
      <listitem>
128
        <para>preemption of threads.</para>
127
        <para>preemption of threads.</para>
129
      </listitem>
128
      </listitem>
130
    </itemizedlist>
129
    </itemizedlist>
131
 
130
 
132
    <para>The first two goals are performed exactly one more times than is the
131
    <para>The <code>clock</code> function checks for expired timeouts and
-
 
132
    decrements unexpired timeout expiration counters exactly one more times
133
    number of missed clock signals (i.e. at least once and possibly more
133
    than is the number of missed clock signals (i.e. at least once and
134
    times, depending on the missed clock signals counter). The remaining
134
    possibly more times, depending on the missed clock signals counter). The
-
 
135
    time of the day counters are also updated one more times than is the
-
 
136
    number of missed clock signals. And finally, the remaining timeslice of
135
    timeslice of the running thread is decremented also with respect to this
137
    the running thread is decremented with respect to this counter as well. By
136
    counter. By considering its value, the kernel performs actions that would
138
    considering its value, the kernel performs actions that would otherwise be
137
    otherwise be lost due to an occasional excessive time drift described in
139
    lost due to an occasional excessive time drift described in previous
138
    previous paragraphs.</para>
140
    paragraphs.</para>
-
 
141
  </section>
-
 
142
 
-
 
143
  <section>
-
 
144
    <title>Time source for userspace</title>
-
 
145
 
-
 
146
    <para>In HelenOS, userspace tasks don't communicate with the kernel in
-
 
147
    order to read system time. Instead, a mechanism that shares kernel time of
-
 
148
    the day counters with userspace address spaces is deployed. On the kernel
-
 
149
    side, during system initialization, HelenOS allocates a frame of physical
-
 
150
    memory and stores the time of the day counters there. The counters have
-
 
151
    the following structure:</para>
-
 
152
 
-
 
153
    <itemizedlist>
-
 
154
      <listitem>
-
 
155
        <para>first 32-bit counter for seconds,</para>
-
 
156
      </listitem>
-
 
157
 
-
 
158
      <listitem>
-
 
159
        <para>32-bit counter for microseconds and</para>
-
 
160
      </listitem>
-
 
161
 
-
 
162
      <listitem>
-
 
163
        <para>second 32-bit counter for seconds.</para>
-
 
164
      </listitem>
-
 
165
    </itemizedlist>
-
 
166
 
-
 
167
    <para>One of the userspace tasks with capabilities of memory manager (e.g.
-
 
168
    ns) asks the kernel to map this frame into its address space. Other,
-
 
169
    non-privileged, tasks then use IPC to communicate read-only sharing of
-
 
170
    this memory. Reading time in a userspace task is therefore just a matter
-
 
171
    of reading memory.</para>
-
 
172
 
-
 
173
    <para>There are two interesting points about this. First, the counters are
-
 
174
    32-bit even on 64-bit machines. The goal is to provide subsecond precision
-
 
175
    with the possibility to span roughly 136 years. Note that a single 64-bit
-
 
176
    microsecond counter could not be usually read atomically on 32-bit
-
 
177
    platforms. Now the second point is that 32-bit platforms cannot atomically
-
 
178
    read two 32-bit counters either. However, a generic protocol is used to
-
 
179
    guarantee that sequentially read times will create a non-decreasing
-
 
180
    sequence.</para>
-
 
181
 
-
 
182
    <para>The problematic part is updating both seconds and microseconds once
-
 
183
    in a second. Seconds must be incremented and microseconds must be reset.
-
 
184
    However, without any synchronization, the two kernel stores and the two
-
 
185
    userspace reads can arbitrarily interleave. Furthemore, the reader has no
-
 
186
    chance to detect that the counters were updated only from half. Therefore
-
 
187
    three counters are used in HelenOS.</para>
-
 
188
 
-
 
189
    <para>If seconds need to be updated, the kernel increments the first
-
 
190
    second counter, issues a write memory barrier operation, updates the
-
 
191
    microsecond counter, issues another write memory barrier operation and
-
 
192
    increments the second second counter. When only microseconds need to be
-
 
193
    updated, no special action is taken by the kernel. On the other hand, the
-
 
194
    userspace task must always read all three counters and in reversed order.
-
 
195
    A read memory barrier operation must be issued between each two reads. A
-
 
196
    non-atomic read is detected when the two second counters differ. The
-
 
197
    userspace library solves this situation by returning zero instead of the
-
 
198
    value read from the microsecond counter.</para>
139
  </section>
199
  </section>
140
</chapter>
200
</chapter>