Rev 137 | Rev 157 | ||
---|---|---|---|
Line 29... | Line 29... | ||
29 | response.</para> |
29 | response.</para> |
30 | 30 | ||
31 | <para>Every message must be eventually answered. The system keeps track of |
31 | <para>Every message must be eventually answered. The system keeps track of |
32 | all messages, so that it can answer them with appropriate error code |
32 | all messages, so that it can answer them with appropriate error code |
33 | should one of the connection parties fail unexpectedly. To limit buffering |
33 | should one of the connection parties fail unexpectedly. To limit buffering |
34 | of the messages in the kernel, every process is has a limited account of |
34 | of the messages in the kernel, every task has a limit on the amount of |
35 | asynchronous messages it can send simultanously. If the limit is reached, |
35 | asynchronous messages it can send simultaneously. If the limit is reached, |
36 | the kernel refuses to send any other message, until some active message is |
36 | the kernel refuses to send any other message, until some active message is |
37 | answered.</para> |
37 | answered.</para> |
38 | 38 | ||
39 | <para>To facilitate kernel-to-user communication, the IPC subsystem |
39 | <para>To facilitate kernel-to-user communication, the IPC subsystem |
40 | provides notification messages. The applications can subscribe to a |
40 | provides notification messages. The applications can subscribe to a |
Line 47... | Line 47... | ||
47 | <section> |
47 | <section> |
48 | <title>Low Level IPC</title> |
48 | <title>Low Level IPC</title> |
49 | 49 | ||
50 | <para>The whole IPC subsystem consists of one-way communication |
50 | <para>The whole IPC subsystem consists of one-way communication |
51 | channels. Each task has one associated message queue (answerbox). The |
51 | channels. Each task has one associated message queue (answerbox). The |
52 | task can call other tasks and connect it's phones to their answerboxes., |
52 | task can call other tasks and connect its phones to their answerboxes, |
53 | send and forward messages through these connections and answer received |
53 | send and forward messages through these connections and answer received |
54 | messages. Every sent message is identified by a unique number, so that |
54 | messages. Every sent message is identified by a unique number, so that |
55 | the response can be later matched against it. The message is sent over |
55 | the response can be later matched against it. The message is sent over |
56 | the phone to the target answerbox. Server application periodically |
56 | the phone to the target answerbox. The server application periodically |
57 | checks the answerbox and pulls messages from several queues associated |
57 | checks the answerbox and pulls messages from several queues associated |
58 | with it. After completing the requested action, server sends a reply |
58 | with it. After completing the requested action, the server sends a reply |
59 | back to the answerbox of the originating task. If a need arises, it is |
59 | back to the answerbox of the originating task. If a need arises, it is |
60 | possible to <emphasis>forward</emphasis> a recevied message throught any |
60 | possible to <emphasis>forward</emphasis> a received message through any |
61 | of the open phones to another task. This mechanism is used e.g. for |
61 | of the open phones to another task. This mechanism is used e.g. for |
62 | opening new connections.</para> |
62 | opening new connections to services via the naming service.</para> |
63 | 63 | ||
64 | <para>The answerbox contains four different message queues:</para> |
64 | <para>The answerbox contains four different message queues:</para> |
65 | 65 | ||
66 | <itemizedlist> |
66 | <itemizedlist> |
67 | <listitem> |
67 | <listitem> |
Line 98... | Line 98... | ||
98 | </imageobject> |
98 | </imageobject> |
99 | </mediaobject> |
99 | </mediaobject> |
100 | </figure> |
100 | </figure> |
101 | 101 | ||
102 | <para>The communication between task A, that is connected to task B |
102 | <para>The communication between task A, that is connected to task B |
103 | looks as follows: Task A sends a message over it's phone to the target |
103 | looks as follows: task A sends a message over its phone to the target |
104 | answerbox. The message is saved in task B incoming call queue. When task |
104 | answerbox. The message is saved in task B's incoming call queue. When task |
105 | B fetches the message for processing, it is automatically moved into the |
105 | B fetches the message for processing, it is automatically moved into the |
106 | dispatched call queue. After the server decides to answer the message, |
106 | dispatched call queue. After the server decides to answer the message, |
107 | it is removed from dispatched queue and the result is moved into the |
107 | it is removed from dispatched queue and the result is moved into the |
108 | answer queue of task A.</para> |
108 | answer queue of task A.</para> |
109 | 109 | ||
110 | <para>The arguments contained in the message are completely arbitrary |
110 | <para>The arguments contained in the message are completely arbitrary |
111 | and decided by the user. The low level part of kernel IPC fills in |
111 | and decided by the user. The low level part of kernel IPC fills in |
112 | appropriate error codes if there is an error during communication. It is |
112 | appropriate error codes if there is an error during communication. It is |
113 | assured that the applications are correctly notified about communication |
113 | assured that the applications are correctly notified about communication |
114 | state. If a program closes the outgoing connection, the target answerbox |
114 | state. If a program closes the outgoing connection, the target answerbox |
115 | receives a hangup message. The connection identification is not reused, |
115 | receives a hangup message. The connection identification is not reused |
116 | until the hangup message is acknowledged and all other pending messages |
116 | until the hangup message is acknowledged and all other pending messages |
117 | are answered.</para> |
117 | are answered.</para> |
118 | 118 | ||
119 | <para>Closing an incoming connection is done by responding to any |
119 | <para>Closing an incoming connection is done by responding to any |
120 | incoming message with an EHANGUP error code. The connection is then |
120 | incoming message with an EHANGUP error code. The connection is then |
121 | immediately closed. The client connection identification (phone id) is |
121 | immediately closed. The client connection identification (phone id) is |
122 | not reused, until the client issues closes it's own side of the |
122 | not reused, until the client closes its own side of the |
123 | connection ("hangs his phone up").</para> |
123 | connection ("hangs his phone up").</para> |
124 | 124 | ||
125 | <para>When a task dies (whether voluntarily or by being killed), cleanup |
125 | <para>When a task dies (whether voluntarily or by being killed), cleanup |
126 | process is started.</para> |
126 | process is started.</para> |
127 | 127 | ||
128 | <orderedlist> |
128 | <orderedlist> |
129 | <listitem> |
129 | <listitem> |
130 | <para>Hangs up all outgoing connections and sends hangup messages to |
130 | <para>hangs up all outgoing connections and sends hangup messages to |
131 | all target answerboxes.</para> |
131 | all target answerboxes,</para> |
132 | </listitem> |
132 | </listitem> |
133 | 133 | ||
134 | <listitem> |
134 | <listitem> |
135 | <para>Disconnects all incoming connections.</para> |
135 | <para>disconnects all incoming connections,</para> |
136 | </listitem> |
136 | </listitem> |
137 | 137 | ||
138 | <listitem> |
138 | <listitem> |
139 | <para>Disconnects from all notification channels.</para> |
139 | <para>disconnects from all notification channels,</para> |
140 | </listitem> |
140 | </listitem> |
141 | 141 | ||
142 | <listitem> |
142 | <listitem> |
143 | <para>Answers all unanswered messages from answerbox queues with |
143 | <para>answers all unanswered messages from answerbox queues with |
144 | appropriate error code.</para> |
144 | appropriate error code and</para> |
145 | </listitem> |
145 | </listitem> |
146 | 146 | ||
147 | <listitem> |
147 | <listitem> |
148 | <para>Waits until all outgoing messages are answered and all |
148 | <para>waits until all outgoing messages are answered and all |
149 | remaining answerbox queues are empty.</para> |
149 | remaining answerbox queues are empty.</para> |
150 | </listitem> |
150 | </listitem> |
151 | </orderedlist> |
151 | </orderedlist> |
152 | </section> |
152 | </section> |
153 | 153 | ||
Line 155... | Line 155... | ||
155 | <title>System Call IPC Layer</title> |
155 | <title>System Call IPC Layer</title> |
156 | 156 | ||
157 | <para>On top of this simple protocol the kernel provides special |
157 | <para>On top of this simple protocol the kernel provides special |
158 | services closely related to the inter-process communication. A range of |
158 | services closely related to the inter-process communication. A range of |
159 | method numbers is allocated and protocol is defined for these functions. |
159 | method numbers is allocated and protocol is defined for these functions. |
160 | The messages are interpreted by the kernel layer and appropriate actions |
160 | These messages are interpreted by the kernel layer and appropriate actions |
161 | are taken depending on the parameters of message and answer.</para> |
161 | are taken depending on the parameters of the message and the answer.</para> |
162 | 162 | ||
163 | <para>The kernel provides the following services:</para> |
163 | <para>The kernel provides the following services:</para> |
164 | 164 | ||
165 | <itemizedlist> |
165 | <itemizedlist> |
166 | <listitem> |
166 | <listitem> |
167 | <para>Creating new outgoing connection</para> |
167 | <para>creating new outgoing connection,</para> |
168 | </listitem> |
168 | </listitem> |
169 | 169 | ||
170 | <listitem> |
170 | <listitem> |
171 | <para>Creating a callback connection</para> |
171 | <para>creating a callback connection,</para> |
172 | </listitem> |
172 | </listitem> |
173 | 173 | ||
174 | <listitem> |
174 | <listitem> |
175 | <para>Sending an address space area</para> |
175 | <para>sending an address space area,</para> |
176 | </listitem> |
176 | </listitem> |
177 | 177 | ||
178 | <listitem> |
178 | <listitem> |
179 | <para>Asking for an address space area</para> |
179 | <para>asking for an address space area.</para> |
180 | </listitem> |
180 | </listitem> |
181 | </itemizedlist> |
181 | </itemizedlist> |
182 | 182 | ||
183 | <para>On startup every task is automatically connected to a |
183 | <para>On startup, every task is automatically connected to a |
184 | <emphasis>name service task</emphasis>, which provides a switchboard |
184 | <emphasis>naming service task</emphasis>, which provides a switchboard |
185 | functionality. To open a new outgoing connection, the client sends a |
185 | functionality. In order to open a new outgoing connection, the client sends a |
186 | <constant>CONNECT_ME_TO</constant> message using any of his phones. If |
186 | <constant>CONNECT_ME_TO</constant> message using any of his phones. If |
187 | the recipient of this message answers with an accepting answer, a new |
187 | the recipient of this message answers with an accepting answer, a new |
188 | connection is created. In itself, this mechanism would allow only |
188 | connection is created. In itself, this mechanism would allow only |
189 | duplicating existing connection. However, if the message is forwarded, |
189 | duplicating existing connection. However, if the message is forwarded, |
190 | the new connection is made to the final recipient.</para> |
190 | the new connection is made to the final recipient.</para> |
191 | 191 | ||
192 | <para>On startup every task is automatically connect to the name service |
192 | <para>In order for a task to be able to forward a message, it |
193 | task, which acts as a switchboard and forwards requests for connection |
193 | must have a phone connected to the destination task. |
194 | to specific services. To be able to forward a message it must have a |
194 | The destination task establishes such connection by sending the <constant>CONNECT_TO_ME</constant> |
195 | phone connected to the service tasks. The task creates this connection |
195 | message to the forwarding task. A callback connection is opened afterwards. |
196 | using a <constant>CONNECT_TO_ME</constant> message which creates a |
- | |
197 | callback connection. Every service that wants to receive connections |
196 | Every service that wants to receive connections |
198 | asks name service task to create a callback connection.</para> |
197 | has to ask the naming service to create the callback connection via this mechanism.</para> |
199 | 198 | ||
200 | <para>Tasks can share their address space areas using IPC messages. The |
199 | <para>Tasks can share their address space areas using IPC messages. The |
201 | 2 message types - <constant>AS_AREA_SEND</constant> and <constant>AS_AREA_RECV</constant> are used for sending and |
200 | two message types - <constant>AS_AREA_SEND</constant> and <constant>AS_AREA_RECV</constant> are used for sending and |
202 | receiving an address area respectively. The shared area can be accessed |
201 | receiving an address space area respectively. The shared area can be accessed |
203 | as soon as the message is acknowledged.</para> |
202 | as soon as the message is acknowledged.</para> |
204 | </section> |
203 | </section> |
205 | </section> |
204 | </section> |
206 | 205 | ||
207 | <section> |
206 | <section> |
208 | <title>Userspace View</title> |
207 | <title>Userspace View</title> |
209 | 208 | ||
210 | <para>The conventional design of the asynchronous api seems to produce |
209 | <para>The conventional design of the asynchronous API seems to produce |
211 | applications with one event loop and several big switch statements. |
210 | applications with one event loop and several big switch statements. |
212 | However, by intensive utilization of user-space threads, it was possible |
211 | However, by intensive utilization of userspace pseudo threads, it was possible |
213 | to create an environment that is not necesarilly restricted to this type |
212 | to create an environment that is not necessarily restricted to this type |
214 | of event-driven programming and allows for more fluent expression of |
213 | of event-driven programming and allows for more fluent expression of |
215 | application programs.</para> |
214 | application programs.</para> |
216 | 215 | ||
217 | <section> |
216 | <section> |
218 | <title>Single Point of Entry</title> |
217 | <title>Single Point of Entry</title> |
219 | 218 | ||
220 | <para>Each tasks is associated with only one answerbox. If a |
219 | <para>Each task is associated with only one answerbox. If a |
221 | multi-threaded application needs to communicate, it must be not only |
220 | multithreaded application needs to communicate, it must be not only |
222 | able to send a message, but it should be able to retrieve the answer as |
221 | able to send a message, but it should be able to retrieve the answer as |
223 | well. If several threads pull messages from task answerbox, it is a |
222 | well. If several pseudo threads pull messages from task answerbox, it is a |
224 | matter of fortune, which thread receives which message. If a particular |
223 | matter of coincidence, which thread receives which message. If a particular |
225 | thread needs to wait for a message answer, an idle |
224 | thread needs to wait for a message answer, an idle |
226 | <emphasis>manager</emphasis> task is found or a new one is created and |
225 | <emphasis>manager</emphasis> pseudo thread is found or a new one is created and |
227 | control is transferred to this manager task. The manager tasks pops |
226 | control is transferred to this manager thread. The manager threads pop |
228 | messages from the answerbox and puts them into appropriate queues of |
227 | messages from the answerbox and put them into appropriate queues of |
229 | running tasks. If a task waiting for a message is not running, the |
228 | running threads. If a pseudo thread waiting for a message is not running, the |
230 | control is transferred to it.</para> |
229 | control is transferred to it.</para> |
231 | 230 | ||
232 | <figure float="1"> |
231 | <figure float="1"> |
233 | <title>Single point of entry</title> |
232 | <title>Single point of entry</title> |
234 | <mediaobject id="ipc2"> |
233 | <mediaobject id="ipc2"> |
Line 246... | Line 245... | ||
246 | </mediaobject> |
245 | </mediaobject> |
247 | 246 | ||
248 | </figure> |
247 | </figure> |
249 | 248 | ||
250 | <para>Very similar situation arises when a task decides to send a lot of |
249 | <para>Very similar situation arises when a task decides to send a lot of |
251 | messages and reaches kernel limit of asynchronous messages. In such |
250 | messages and reaches the kernel limit of asynchronous messages. In such |
252 | situation 2 remedies are available - the userspace liberary can either |
251 | situation, two remedies are available - the userspace library can either |
253 | cache the message locally and resend the message when some answers |
252 | cache the message locally and resend the message when some answers |
254 | arrive, or it can block the thread and let it go on only after the |
253 | arrive, or it can block the thread and let it go on only after the |
255 | message is finally sent to the kernel layer. With one exception HelenOS |
254 | message is finally sent to the kernel layer. With one exception, HelenOS |
256 | uses the second approach - when the kernel responds that maximum limit |
255 | uses the second approach - when the kernel responds that the maximum limit |
257 | of asynchronous messages was reached, control is transferred to manager |
256 | of asynchronous messages was reached, the control is transferred to a manager |
258 | thread. The manager thread then handles incoming replies and when space |
257 | pseudo thread. The manager thread then handles incoming replies and, when space |
259 | is available, sends the message to kernel and resumes application thread |
258 | is available, sends the message to the kernel and resumes the application thread |
260 | execution.</para> |
259 | execution.</para> |
261 | 260 | ||
262 | <para>If a kernel notification is received, the servicing procedure is |
261 | <para>If a kernel notification is received, the servicing procedure is |
263 | run in the context of the manager thread. Although it wouldn't be |
262 | run in the context of the manager pseudo thread. Although it wouldn't be |
264 | impossible to allow recursive calling, it could potentially lead to an |
263 | impossible to allow recursive calling, it could potentially lead to an |
265 | explosion of manager threads. Thus, the kernel notification procedures |
264 | explosion of manager threads. Thus, the kernel notification procedures |
266 | are not allowed to wait for a message result, they can only answer |
265 | are not allowed to wait for a message result, they can only answer |
267 | messages and send new ones without waiting for their results. If the |
266 | messages and send new ones without waiting for their results. If the |
268 | kernel limit for outgoing messages is reached, the data is automatically |
267 | kernel limit for outgoing messages is reached, the data is automatically |
269 | cached within the application. This behaviour is enforced automatically |
268 | cached within the application. This behaviour is enforced automatically |
270 | and the decision making is hidden from developers view.</para> |
269 | and the decision making is hidden from the developer.</para> |
271 | 270 | ||
272 | <figure float="1"> |
271 | <figure float="1"> |
273 | <title>Single point of entry solution</title> |
272 | <title>Single point of entry solution</title> |
274 | <mediaobject id="ipc3"> |
273 | <mediaobject id="ipc3"> |
275 | <imageobject role="pdf"> |
274 | <imageobject role="pdf"> |
Line 291... | Line 290... | ||
291 | <section> |
290 | <section> |
292 | <title>Ordering Problem</title> |
291 | <title>Ordering Problem</title> |
293 | 292 | ||
294 | <para>Unfortunately, the real world is never so simple. E.g. if a |
293 | <para>Unfortunately, the real world is never so simple. E.g. if a |
295 | server handles incoming requests and as a part of its response sends |
294 | server handles incoming requests and as a part of its response sends |
296 | asynchronous messages, it can be easily prempted and other thread may |
295 | asynchronous messages, it can be easily preempted and another thread may |
297 | start intervening. This can happen even if the application utilizes only |
296 | start intervening. This can happen even if the application utilizes only |
298 | 1 kernel thread. Classical synchronization using semaphores is not |
297 | one userspace thread. Classical synchronization using semaphores is not |
299 | possible, as locking on them would block the thread completely so that |
298 | possible as locking on them would block the thread completely so that |
300 | the answer couldn't be ever processed. The IPC framework allows a |
299 | the answer couldn't be ever processed. The IPC framework allows a |
301 | developer to specify, that part of the code should not be preempted by |
300 | developer to specify, that part of the code should not be preempted by |
302 | any other thread (except notification handlers) while still being able |
301 | any other pseudo thread (except notification handlers) while still being able |
303 | to queue messages belonging to other threads and regain control when the |
302 | to queue messages belonging to other pseudo threads and regain control when the |
304 | answer arrives.</para> |
303 | answer arrives.</para> |
305 | 304 | ||
306 | <para>This mechanism works transparently in multithreaded environment, |
305 | <para>This mechanism works transparently in multithreaded environment, |
307 | where additional locking mechanism (futexes) should be used. The IPC |
306 | where additional locking mechanism (futexes) should be used. The IPC |
308 | framework ensures that there will always be enough free kernel threads |
307 | framework ensures that there will always be enough free userspace threads |
309 | to handle incoming answers and allow the application to run more |
308 | to handle incoming answers and allow the application to run more |
310 | user-space threads inside the kernel threads without the danger of |
309 | pseudo threads inside the userspace threads without the danger of |
311 | locking all kernel threads in futexes.</para> |
310 | locking all userspace threads in futexes.</para> |
312 | </section> |
311 | </section> |
313 | 312 | ||
314 | <section> |
313 | <section> |
315 | <title>The Interface</title> |
314 | <title>The Interface</title> |
316 | 315 | ||
317 | <para>The interface was developed to be as simple to use as possible. |
316 | <para>The interface was developed to be as simple to use as possible. |
318 | Classical applications simply send messages and occasionally wait for an |
317 | Classical applications simply send messages and occasionally wait for an |
319 | answer and check results. If the number of sent messages is higher than |
318 | answer and check results. If the number of sent messages is higher than |
320 | kernel limit, the flow of application is stopped until some answers |
319 | the kernel limit, the flow of application is stopped until some answers |
321 | arrive. On the other hand server applications are expected to work in a |
320 | arrive. On the other hand, server applications are expected to work in a |
322 | multithreaded environment.</para> |
321 | multithreaded environment.</para> |
323 | 322 | ||
324 | <para>The server interface requires developer to specify a |
323 | <para>The server interface requires the developer to specify a |
325 | <function>connection_thread</function> function. When new connection is |
324 | <function>connection_thread</function> function. When new connection is |
326 | detected, a new userspace thread is automatically created and control is |
325 | detected, a new pseudo thread is automatically created and control is |
327 | transferred to this function. The code then decides whether to accept |
326 | transferred to this function. The code then decides whether to accept |
328 | the connection and creates a normal event loop. The userspace IPC |
327 | the connection and creates a normal event loop. The userspace IPC |
329 | library ensures correct switching between several userspace threads |
328 | library ensures correct switching between several pseudo threads |
330 | within the kernel environment.</para> |
329 | within the kernel environment.</para> |
331 | </section> |
330 | </section> |
332 | </section> |
331 | </section> |
333 | </chapter> |
332 | </chapter> |
334 | 333 |
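
The "Low Level IPC" hunk in this diff describes how a call travels from task A to task B and back: it is saved in B's incoming call queue, moved to B's dispatched call queue when fetched, and its result lands in A's answer queue once B answers. The sketch below is a minimal userspace model of that flow in C, for illustration only. Every name in it (`answerbox_t`, `ipc_model_send`, `ipc_model_fetch`, `ipc_model_answer`, `QUEUE_CAP`) is hypothetical and is not the HelenOS kernel API. Only the three queues named in that paragraph are modelled, and the fixed capacity is an assumption standing in for the kernel's real bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these names come from HelenOS. */
#define QUEUE_CAP 8

typedef struct {
    int id;      /* unique number used to match the answer to the call */
    int retval;  /* result filled in when the call is answered */
} call_t;

typedef struct {
    call_t items[QUEUE_CAP];
    size_t count;
} queue_t;

typedef struct {
    queue_t incoming;    /* calls received but not yet fetched */
    queue_t dispatched;  /* calls fetched but not yet answered */
    queue_t answers;     /* answers to calls this task has sent */
} answerbox_t;

/* Append a call; returns -1 if the queue is full. */
static int queue_push(queue_t *q, call_t c)
{
    if (q->count == QUEUE_CAP)
        return -1;
    q->items[q->count++] = c;
    return 0;
}

/* Remove the call with the given id, preserving order; -1 if absent. */
static int queue_remove(queue_t *q, int id, call_t *out)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->items[i].id != id)
            continue;
        *out = q->items[i];
        for (size_t j = i + 1; j < q->count; j++)
            q->items[j - 1] = q->items[j];
        q->count--;
        return 0;
    }
    return -1;
}

/* Task A sends a call: it is saved in the target's incoming queue. */
int ipc_model_send(answerbox_t *target, int id)
{
    call_t c = { .id = id, .retval = 0 };
    return queue_push(&target->incoming, c);
}

/* Task B fetches the oldest call: it moves from incoming to dispatched. */
int ipc_model_fetch(answerbox_t *box, call_t *out)
{
    if (box->incoming.count == 0 || box->dispatched.count == QUEUE_CAP)
        return -1;
    int rc = queue_remove(&box->incoming, box->incoming.items[0].id, out);
    assert(rc == 0);  /* the head element was just observed, so it exists */
    (void) rc;
    return queue_push(&box->dispatched, *out);
}

/* Task B answers: the call leaves B's dispatched queue and the result
 * is placed in the answer queue of the originating task. */
int ipc_model_answer(answerbox_t *server, answerbox_t *client,
    int id, int retval)
{
    call_t c;
    if (client->answers.count == QUEUE_CAP)
        return -1;
    if (queue_remove(&server->dispatched, id, &c) != 0)
        return -1;
    c.retval = retval;
    return queue_push(&client->answers, c);
}
```

In this model, answering a call that is not in the server's dispatched queue fails. That loosely mirrors the requirement in the text that every message is tracked until it is answered exactly once.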