<?xml version="1.0" encoding="UTF-8"?>
<chapter id="ipc">
  <?dbhtml filename="ipc.html"?>

  <title>IPC</title>

  <para>Due to the high intertask communication traffic, IPC becomes a
  critical subsystem for microkernels, putting high demands on the speed,
  latency and reliability of the IPC model and implementation. Although the
  use of an asynchronous messaging system looks promising in theory, it is
  not often implemented because it makes end user applications difficult to
  write. HelenOS implements a fully asynchronous messaging system with a
  special layer that gives the application developer a reasonably
  synchronous multithreaded environment sufficient to develop complex
  protocols.</para>

  <section>
    <title>Services provided by the kernel</title>

    <para>Every message consists of 4 numeric arguments (32-bit and 64-bit on
    the corresponding platforms), of which the first one is considered a
    method number on message receipt and a return value on answer receipt. The
    received message contains identification of the incoming connection, so
    that the receiving application can distinguish messages from different
    senders. Internally the message contains a pointer to the originating task
    and to the source of the communication channel. If the message is
    forwarded, the originating task identifies the recipient of the answer,
    while the source channel identifies the connection in case of a hangup
    response.</para>
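
    <para>The message layout described above can be modelled with a short
    Python sketch. The class and field names (<literal>Message</literal>,
    <literal>in_phone</literal>, <literal>method</literal>,
    <literal>retval</literal>) are illustrative assumptions and not HelenOS
    identifiers; only the four numeric arguments and the double role of the
    first argument are taken from the text.</para>

    <programlisting language="python"><![CDATA[from dataclasses import dataclass
from typing import List

ARG_COUNT = 4  # every message carries four numeric arguments


@dataclass
class Message:
    """Toy model of an IPC message; all names are hypothetical."""
    args: List[int]
    in_phone: int  # identifies the incoming connection for the receiver

    def __post_init__(self) -> None:
        if len(self.args) != ARG_COUNT:
            raise ValueError("a message carries exactly four arguments")

    @property
    def method(self) -> int:
        # On message receipt the first argument is the method number.
        return self.args[0]

    @property
    def retval(self) -> int:
        # On answer receipt the same slot holds the return value.
        return self.args[0]


def make_answer(call: Message, retval: int) -> Message:
    """Build an answer that keeps the call's other arguments."""
    return Message([retval] + call.args[1:], call.in_phone)


call = Message([7, 10, 20, 30], in_phone=3)
answer = make_answer(call, 0)
]]></programlisting>

    <para>In this sketch the receiver reads <literal>call.method</literal>,
    while the original sender reads <literal>answer.retval</literal>; both
    refer to the first argument slot.</para>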

    <para>Every message must eventually be answered. The system keeps track of
    all messages, so that it can answer them with an appropriate error code
    should one of the connection parties fail unexpectedly. To limit buffering
    of messages in the kernel, every process has a limited account of
    asynchronous messages it can send simultaneously. If the limit is reached,
    the kernel refuses to send any further message until some active message
    is answered.</para>
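
    <para>The accounting rule can be illustrated by a minimal Python model.
    The class name, the exception and the limit of 2 are assumptions chosen
    for the example; the text does not state the actual kernel limit.</para>

    <programlisting language="python"><![CDATA[class SendLimitReached(Exception):
    """Raised when the sender already has the maximum number of active calls."""


class AsyncAccount:
    """Toy account of unanswered asynchronous messages of one sender."""

    def __init__(self, limit: int) -> None:
        self.limit = limit   # hypothetical value; the real limit is set by the kernel
        self.active = 0      # messages sent but not yet answered

    def send(self) -> None:
        if self.active >= self.limit:
            raise SendLimitReached("too many unanswered messages")
        self.active += 1

    def answer_received(self) -> None:
        if self.active == 0:
            raise RuntimeError("no active message to answer")
        self.active -= 1


account = AsyncAccount(limit=2)
account.send()
account.send()
try:
    account.send()           # third send exceeds the limit
    refused = False
except SendLimitReached:
    refused = True
account.answer_received()    # an answer frees one slot
account.send()               # sending works again
]]></programlisting>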

    <para>To facilitate kernel-to-user communication, the IPC subsystem
    provides notification messages. Applications can subscribe to a
    notification channel and receive messages directed to this channel. Such
    messages can be freely sent even from interrupt context, as they are
    primarily intended to deliver IRQ events to userspace device drivers.
    These messages need not be answered, because there is no party that could
    receive such a response.</para>

    <section>
      <title>Low level IPC</title>

      <para>The whole IPC subsystem consists of one-way communication
      channels. Each task has one associated message queue (answerbox). A
      task can call other tasks and connect its phones to their answerboxes,
      send and forward messages through these connections and answer received
      messages. Every sent message is identified by a unique number, so that
      the response can later be matched against it. The message is sent over
      the phone to the target answerbox. The server application periodically
      checks the answerbox and pulls messages from the several queues
      associated with it. After completing the requested action, the server
      sends a reply back to the answerbox of the originating task. If the need
      arises, it is possible to <emphasis>forward</emphasis> a received
      message through any of the open phones to another task. This mechanism
      is used e.g. for opening new connections.</para>

      <para>The answerbox contains four different message queues:</para>

      <itemizedlist>
        <listitem>
          <para>Incoming call queue</para>
        </listitem>

        <listitem>
          <para>Dispatched call queue</para>
        </listitem>

        <listitem>
          <para>Answer queue</para>
        </listitem>

        <listitem>
          <para>Notification queue</para>
        </listitem>
      </itemizedlist>

      <figure float="1">
        <mediaobject id="ipc1">
          <imageobject role="pdf">
            <imagedata fileref="images/ipc1.pdf" format="PDF" />
          </imageobject>

          <imageobject role="html">
            <imagedata fileref="images/ipc1.png" format="PNG" />
          </imageobject>

          <imageobject role="fop">
            <imagedata fileref="images/ipc1.svg" format="SVG" />
          </imageobject>
        </mediaobject>

        <title>Low level IPC</title>
      </figure>

      <para>The communication between task A and task B, to which task A is
      connected, looks as follows: task A sends a message over its phone to
      the target answerbox. The message is saved in the incoming call queue
      of task B. When task B fetches the message for processing, it is
      automatically moved into the dispatched call queue. After the server
      decides to answer the message, it is removed from the dispatched call
      queue and the result is moved into the answer queue of task A.</para>
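
      <para>The path of a single call through the queues can be traced with
      a small Python model. The function names (<literal>send</literal>,
      <literal>fetch</literal>, <literal>answer</literal>) and the use of
      plain integers as call identifiers are assumptions made for
      illustration; the notification queue is omitted.</para>

      <programlisting language="python"><![CDATA[from collections import deque


class Answerbox:
    """Toy answerbox; the notification queue is left out for brevity."""

    def __init__(self):
        self.incoming = deque()    # calls not yet fetched by the owner
        self.dispatched = []       # calls fetched but not yet answered
        self.answers = deque()     # answers to calls the owner has sent


def send(call_id, target):
    """Deliver a call into the incoming call queue of the target."""
    target.incoming.append(call_id)


def fetch(box):
    """Fetch the oldest call; it moves to the dispatched call queue."""
    call_id = box.incoming.popleft()
    box.dispatched.append(call_id)
    return call_id


def answer(call_id, server_box, client_box, result):
    """Answer a dispatched call; the result lands in the caller's answer queue."""
    server_box.dispatched.remove(call_id)
    client_box.answers.append((call_id, result))


box_a, box_b = Answerbox(), Answerbox()
send(1, box_b)                        # task A calls task B
fetched = fetch(box_b)                # task B starts processing the call
in_progress = list(box_b.dispatched)  # snapshot while the call is being handled
answer(fetched, box_b, box_a, 0)      # task B answers with result 0
]]></programlisting>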

      <para>The arguments contained in the message are completely arbitrary
      and decided by the user. The low level part of kernel IPC fills in
      appropriate error codes if there is an error during communication. It is
      assured that applications are correctly notified about the communication
      state. If a program closes an outgoing connection, the target answerbox
      receives a hangup message. The connection identification is not reused
      until the hangup message is acknowledged and all other pending messages
      are answered.</para>

      <para>Closing an incoming connection is done by responding to any
      incoming message with an <constant>EHANGUP</constant> error code. The
      connection is then immediately closed. The client connection
      identification (phone id) is not reused until the client closes its own
      side of the connection ("hangs its phone up").</para>

      <para>When a task dies (whether voluntarily or by being killed), a
      cleanup process is started. It performs the following steps:</para>

      <orderedlist>
        <listitem>
          <para>Hangs up all outgoing connections and sends hangup messages to
          all target answerboxes.</para>
        </listitem>

        <listitem>
          <para>Disconnects all incoming connections.</para>
        </listitem>

        <listitem>
          <para>Disconnects from all notification channels.</para>
        </listitem>

        <listitem>
          <para>Answers all unanswered messages from the answerbox queues with
          an appropriate error code.</para>
        </listitem>

        <listitem>
          <para>Waits until all outgoing messages are answered and all
          remaining answerbox queues are empty.</para>
        </listitem>
      </orderedlist>
    </section>

    <section>
      <title>System call IPC layer</title>

      <para>On top of this simple protocol the kernel provides special
      services closely related to inter-process communication. A range of
      method numbers is allocated and a protocol is defined for these
      functions. The messages are interpreted by the kernel layer and
      appropriate actions are taken depending on the parameters of the message
      and the answer.</para>

      <para>The kernel provides the following services:</para>

      <itemizedlist>
        <listitem>
          <para>Creating a new outgoing connection</para>
        </listitem>

        <listitem>
          <para>Creating a callback connection</para>
        </listitem>

        <listitem>
          <para>Sending an address space area</para>
        </listitem>

        <listitem>
          <para>Asking for an address space area</para>
        </listitem>
      </itemizedlist>

      <para>On startup every task is automatically connected to a
      <emphasis>name service task</emphasis>, which provides switchboard
      functionality. To open a new outgoing connection, the client sends a
      <constant>CONNECT_ME_TO</constant> message using any of its phones. If
      the recipient of this message answers with an accepting answer, a new
      connection is created. By itself, this mechanism would only allow
      duplicating an existing connection. However, if the message is
      forwarded, the new connection is made to the final recipient.</para>
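
      <para>A rough Python model may help to see why forwarding matters for
      <constant>CONNECT_ME_TO</constant>. The <literal>Task</literal> class,
      its <literal>forward_to</literal> and <literal>accepts</literal>
      attributes, and the service name <literal>vfs</literal> are hypothetical
      and only stand in for the behaviour described above.</para>

      <programlisting language="python"><![CDATA[class Task:
    """Toy task; phones is the list of tasks reachable by outgoing connections."""

    def __init__(self, name, forward_to=None, accepts=True):
        self.name = name
        self.forward_to = forward_to  # where connection requests are forwarded, if anywhere
        self.accepts = accepts        # whether this task accepts new connections
        self.phones = []


def connect_me_to(client, phone):
    """Send a connection request over a phone; return the new phone id or None."""
    recipient = client.phones[phone]
    # Follow forwarding until some task handles the request itself.
    while recipient.forward_to is not None:
        recipient = recipient.forward_to
    if not recipient.accepts:
        return None
    # The new connection leads to the final recipient of the request.
    client.phones.append(recipient)
    return len(client.phones) - 1


vfs = Task("vfs")                          # hypothetical service
name_service = Task("ns", forward_to=vfs)  # switchboard forwarding to the service
client = Task("client")
client.phones.append(name_service)         # phone 0: the initial connection

new_phone = connect_me_to(client, 0)

# Without forwarding the request merely duplicates the existing connection.
plain = Task("plain")
other = Task("other")
other.phones.append(plain)
duplicate = connect_me_to(other, 0)
]]></programlisting>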

      <para>The name service task acts as a switchboard and forwards
      connection requests to specific services. To be able to forward a
      message, it must have a phone connected to the service task. The service
      task creates this connection using a <constant>CONNECT_TO_ME</constant>
      message, which creates a callback connection. Every service that wants
      to receive connections asks the name service task to create a callback
      connection.</para>

      <para>Tasks can share their address space areas using IPC messages. Two
      message types, <constant>AS_AREA_SEND</constant> and
      <constant>AS_AREA_RECV</constant>, are used for sending and receiving an
      address space area respectively. The shared area can be accessed as soon
      as the message is acknowledged.</para>
    </section>
  </section>

  <section>
    <title>Userspace view</title>

    <para>The conventional design of an asynchronous API tends to produce
    applications with one event loop and several big switch statements.
    However, by intensive use of user-space threads, it was possible to
    create an environment that is not necessarily restricted to this type of
    event-driven programming and allows for a more fluent expression of
    application programs.</para>

    <section>
      <title>Single point of entry</title>

      <para>Each task is associated with only one answerbox. If a
      multithreaded application needs to communicate, it must be able not only
      to send a message, but also to retrieve the answer. If several threads
      pull messages from the task answerbox, it is a matter of chance which
      thread receives which message. Therefore, if a particular thread needs
      to wait for an answer, an idle <emphasis>manager</emphasis> thread is
      found or a new one is created and control is transferred to this manager
      thread. The manager thread pops messages from the answerbox and puts
      them into the appropriate queues of the running threads. If a thread
      waiting for a message is not running, control is transferred to it.</para>
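
      <para>The routing job of the manager can be sketched in a few lines of
      Python. The function name, the thread labels and the representation of
      answers as <literal>(call id, result)</literal> pairs are assumptions
      made for this example.</para>

      <programlisting language="python"><![CDATA[from collections import deque


def manager_route(answerbox, waiting):
    """Pop answers from the shared answerbox and queue each one for its waiter.

    answerbox: deque of (call_id, result) pairs in arrival order
    waiting:   dict mapping call_id to the thread waiting for that answer
    """
    per_thread = {}
    while answerbox:
        call_id, result = answerbox.popleft()
        thread = waiting.pop(call_id)  # the thread that sent this call
        per_thread.setdefault(thread, []).append(result)
    return per_thread


# Answers arrive in an order unrelated to the order of the waiting threads.
answerbox = deque([(2, "b"), (1, "a"), (3, "c")])
waiting = {1: "t1", 2: "t2", 3: "t1"}
routed = manager_route(answerbox, waiting)
]]></programlisting>

      <para>Because a single routine drains the answerbox, each answer reaches
      the thread that waits for it regardless of arrival order.</para>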

      <figure float="1">
        <mediaobject id="ipc2">
          <imageobject role="pdf">
            <imagedata fileref="images/ipc2.pdf" format="PDF" />
          </imageobject>

          <imageobject role="html">
            <imagedata fileref="images/ipc2.png" format="PNG" />
          </imageobject>

          <imageobject role="fop">
            <imagedata fileref="images/ipc2.svg" format="SVG" />
          </imageobject>
        </mediaobject>

        <title>Single point of entry</title>
      </figure>

      <para>A very similar situation arises when a task decides to send a lot
      of messages and reaches the kernel limit of asynchronous messages. In
      such a situation two remedies are available: the userspace library can
      either cache the message locally and resend it when some answers arrive,
      or it can block the thread and let it continue only after the message is
      finally sent to the kernel layer. With one exception, HelenOS uses the
      second approach: when the kernel responds that the maximum limit of
      asynchronous messages has been reached, control is transferred to the
      manager thread. The manager thread then handles incoming replies and,
      when space is available, sends the message to the kernel and resumes
      execution of the application thread.</para>
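
      <para>The second approach can be modelled as a loop that keeps handling
      replies until the send succeeds. This is a simplified sketch: the
      <literal>Kernel</literal> class, its limit of 2 and the assumption that
      the peer answers every message at once are all invented for the
      example.</para>

      <programlisting language="python"><![CDATA[from collections import deque


class Kernel:
    """Toy kernel side: counts active messages and holds ready replies."""

    def __init__(self, limit):
        self.limit = limit      # hypothetical limit of unanswered messages
        self.active = 0
        self.replies = deque()

    def try_send(self, msg):
        if self.active >= self.limit:
            return False        # limit reached, message refused
        self.active += 1
        self.replies.append("answer to " + msg)  # toy peer answers immediately
        return True

    def wait_for_reply(self):
        self.active -= 1
        return self.replies.popleft()


def send_blocking(kernel, msg, handled):
    """Block the sender; the manager handles replies until the message fits."""
    while not kernel.try_send(msg):
        handled.append(kernel.wait_for_reply())


kernel = Kernel(limit=2)
handled = []
for name in ["a", "b", "c"]:
    send_blocking(kernel, name, handled)
]]></programlisting>

      <para>Here the third send is refused at first, one reply is handled to
      free a slot, and the send is then retried successfully.</para>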

      <para>If a kernel notification is received, the servicing procedure is
      run in the context of the manager thread. Although it would not be
      impossible to allow recursive calling, it could potentially lead to an
      explosion of manager threads. Thus, kernel notification procedures are
      not allowed to wait for a message result; they can only answer messages
      and send new ones without waiting for their results. If the kernel limit
      for outgoing messages is reached, the data is automatically cached
      within the application. This behaviour is enforced automatically and the
      decision making is hidden from the developer's view.</para>
    </section>

    <section>
      <title>Ordering problem</title>

      <para>Unfortunately, the real world is never so simple. For example, if
      a server handles incoming requests and sends asynchronous messages as a
      part of its response, it can easily be preempted and another thread may
      start intervening. This can happen even if the application uses only one
      kernel thread. Classical synchronization using semaphores is not
      possible, as locking on them would block the thread completely, so that
      the answer could never be processed. The IPC framework allows a
      developer to specify that a part of the code should not be preempted by
      any other thread (except notification handlers), while still being able
      to queue messages belonging to other threads and regain control when the
      answer arrives.</para>

      <para>This mechanism works transparently in a multithreaded environment,
      where an additional locking mechanism (futexes) should be used. The IPC
      framework ensures that there will always be enough free kernel threads
      to handle incoming answers, and allows the application to run more
      user-space threads inside the kernel threads without the danger of
      locking all kernel threads in futexes.</para>
    </section>

    <section>
      <title>The interface</title>

      <para>The interface was developed to be as simple to use as possible.
      Classical applications simply send messages and occasionally wait for an
      answer and check the results. If the number of sent messages is higher
      than the kernel limit, the flow of the application is stopped until some
      answers arrive. Server applications, on the other hand, are expected to
      work in a multithreaded environment.</para>

      <para>The server interface requires the developer to specify a
      <function>connection_thread</function> function. When a new connection
      is detected, a new userspace thread is automatically created and control
      is transferred to this function. The code then decides whether to accept
      the connection and creates a normal event loop. The userspace IPC
      library ensures correct switching between several userspace threads
      within the kernel environment.</para>
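
      <para>The control flow of such a server can be imitated in Python with
      generators standing in for userspace threads. This is only an analogy
      under stated assumptions: the real interface is a C function, and the
      message names (<literal>CONNECT</literal>, <literal>READ</literal>,
      <literal>WRITE</literal>, <literal>HANGUP</literal>) as well as the
      <literal>run_server</literal> dispatcher are invented for the
      example.</para>

      <programlisting language="python"><![CDATA[def connection_thread(phone, log):
    """Per-connection handler; each instance runs its own event loop."""
    log.append((phone, "accepted"))
    while True:
        method = yield                 # wait for the next message on this connection
        if method == "HANGUP":
            log.append((phone, "closed"))
            return
        log.append((phone, "handled " + method))


def run_server(events):
    """Toy library side: create a thread per connection and switch between them."""
    threads = {}
    log = []
    for phone, method in events:
        if phone not in threads:
            # New connection detected: create a thread and run it to its first wait.
            threads[phone] = connection_thread(phone, log)
            next(threads[phone])
            continue
        try:
            threads[phone].send(method)
        except StopIteration:
            del threads[phone]         # the handler finished, the connection is gone
    return log


log = run_server([
    (1, "CONNECT"),
    (2, "CONNECT"),
    (1, "READ"),
    (2, "WRITE"),
    (1, "HANGUP"),
])
]]></programlisting>

      <para>Each connection keeps its own straightforward loop, while the
      dispatcher takes care of interleaving messages from different
      connections.</para>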
    </section>
  </section>
</chapter>