Rev 99 | Rev 114 | Go to most recent revision | Only display areas with differences | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed
Rev 99 | Rev 112 | ||
---|---|---|---|
1 | <?xml version="1.0" encoding="UTF-8"?> |
1 | <?xml version="1.0" encoding="UTF-8"?> |
2 | <chapter id="ipc"> |
2 | <chapter id="ipc"> |
3 | <?dbhtml filename="ipc.html"?> |
3 | <?dbhtml filename="ipc.html"?> |
4 | 4 | ||
5 | <title>IPC</title> |
5 | <title>IPC</title> |
6 | 6 | ||
7 | <para>Due to the high intertask communication traffic, IPC becomes critical |
7 | <para>Due to the high intertask communication traffic, IPC becomes critical |
8 | subsystem for microkernels, putting high demands on the speed, latency and |
8 | subsystem for microkernels, putting high demands on the speed, latency and |
9 | reliability of IPC model and implementation. Although theoretically the use |
9 | reliability of IPC model and implementation. Although theoretically the use |
10 | of asynchronous messaging system looks promising, it is not often |
10 | of asynchronous messaging system looks promising, it is not often |
11 | implemented because of a problematic implementation of end user |
11 | implemented because of a problematic implementation of end user |
12 | applications. HelenOS implements a fully asynchronous messaging system but |
12 | applications. HelenOS implements a fully asynchronous messaging system with |
13 | with a special layer providing a user application developer a reasonably |
13 | a special layer providing a user application developer a reasonably |
14 | synchronous multithreaded environment sufficient to develop complex |
14 | synchronous multithreaded environment sufficient to develop complex |
15 | protocols.</para> |
15 | protocols.</para> |
16 | 16 | ||
17 | <section> |
17 | <section> |
18 | <title>Services provided by kernel</title> |
18 | <title>Services provided by kernel</title> |
19 | 19 | ||
20 | <para>Every message consists of 4 numeric arguments (32-bit and 64-bit on |
20 | <para>Every message consists of 4 numeric arguments (32-bit and 64-bit on |
21 | the corresponding platforms), from which the first one is considered a |
21 | the corresponding platforms), from which the first one is considered a |
22 | method number on message receipt and a return value on answer receipt. The |
22 | method number on message receipt and a return value on answer receipt. The |
23 | received message contains identification of the incoming connection, so |
23 | received message contains identification of the incoming connection, so |
24 | that the receiving application can distinguish the messages between |
24 | that the receiving application can distinguish the messages between |
25 | different senders. Internally the message contains pointer to the |
25 | different senders. Internally the message contains pointer to the |
26 | originating task and to the source of the communication channel. If the |
26 | originating task and to the source of the communication channel. If the |
27 | message is forwarded, the originating task identifies the recipient of the |
27 | message is forwarded, the originating task identifies the recipient of the |
28 | answer, the source channel identifies the connection in case of a hangup |
28 | answer, the source channel identifies the connection in case of a hangup |
29 | response.</para> |
29 | response.</para> |
30 | 30 | ||
31 | <para>Every message must be eventually answered. The system keeps track of |
31 | <para>Every message must be eventually answered. The system keeps track of |
32 | all messages, so that it can answer them with appropriate error code |
32 | all messages, so that it can answer them with appropriate error code |
33 | should one of the connection parties fail unexpectedly. To limit buffering |
33 | should one of the connection parties fail unexpectedly. To limit buffering |
34 | of the messages in the kernel, every process is has a limited account of |
34 | of the messages in the kernel, every process is has a limited account of |
35 | asynchronous messages it can send simultanously. If the limit is reached, |
35 | asynchronous messages it can send simultanously. If the limit is reached, |
36 | the kernel refuses to send any other message, until some active message is |
36 | the kernel refuses to send any other message, until some active message is |
37 | answered.</para> |
37 | answered.</para> |
38 | 38 | ||
39 | <para>To facilitate kernel-to-user communication, the IPC subsystem |
39 | <para>To facilitate kernel-to-user communication, the IPC subsystem |
40 | provides notification messages. The applications can subscribe to a |
40 | provides notification messages. The applications can subscribe to a |
41 | notification channel and receive messages directed to this channel. Such |
41 | notification channel and receive messages directed to this channel. Such |
42 | messages can be freely sent even from interrupt context as they are |
42 | messages can be freely sent even from interrupt context as they are |
43 | primarily destined to deliver IRQ events to userspace device drivers. |
43 | primarily destined to deliver IRQ events to userspace device drivers. |
44 | These messages need not be answered, there is no party that could receive |
44 | These messages need not be answered, there is no party that could receive |
45 | such response.</para> |
45 | such response.</para> |
46 | 46 | ||
47 | <section> |
47 | <section> |
48 | <title>Low level IPC</title> |
48 | <title>Low level IPC</title> |
49 | 49 | ||
50 | <para>The whole IPC subsystem consists of one-way communication |
50 | <para>The whole IPC subsystem consists of one-way communication |
51 | channels. Each task has one associated message queue (answerbox). The |
51 | channels. Each task has one associated message queue (answerbox). The |
52 | task can open connections (identified by phone id) to other tasks, send |
52 | task can call other tasks and connect it's phones to their answerboxes., |
53 | and forward messages through these connections and answer received |
53 | send and forward messages through these connections and answer received |
54 | messages. Every sent message is identified by a unique number, so that |
54 | messages. Every sent message is identified by a unique number, so that |
55 | the response can be later matched against it. The message is sent over |
55 | the response can be later matched against it. The message is sent over |
56 | the phone to the target answerbox. Server application periodically |
56 | the phone to the target answerbox. Server application periodically |
57 | checks the answerbox and pulls messages from several queues associated |
57 | checks the answerbox and pulls messages from several queues associated |
58 | with it. After completing the requested action, server sends a reply |
58 | with it. After completing the requested action, server sends a reply |
59 | back to the answerbox of the originating task. If a need arises, it is |
59 | back to the answerbox of the originating task. If a need arises, it is |
60 | possible to <emphasis>forward</emphasis> a recevied message throught any |
60 | possible to <emphasis>forward</emphasis> a recevied message throught any |
61 | of the open phones to another task. This mechanism is used e.g. for |
61 | of the open phones to another task. This mechanism is used e.g. for |
62 | opening new connections.</para> |
62 | opening new connections.</para> |
63 | 63 | ||
- | 64 | <para>The answerbox contains four different message queues:</para> |
|
- | 65 | ||
- | 66 | <itemizedlist> |
|
- | 67 | <listitem> |
|
- | 68 | <para>Incoming call queue</para> |
|
- | 69 | </listitem> |
|
- | 70 | ||
- | 71 | <listitem> |
|
- | 72 | <para>Dispatched call queue</para> |
|
- | 73 | </listitem> |
|
- | 74 | ||
- | 75 | <listitem> |
|
- | 76 | <para>Answer queue</para> |
|
- | 77 | </listitem> |
|
- | 78 | ||
- | 79 | <listitem> |
|
- | 80 | <para>Notification queue</para> |
|
- | 81 | </listitem> |
|
- | 82 | </itemizedlist> |
|
- | 83 | ||
- | 84 | <para>The communication between task A, that is connected to task B |
|
- | 85 | looks as follows: Task A sends a message over it's phone to the target |
|
- | 86 | asnwerbox. The message is saved in task B incoming call queue. When task |
|
- | 87 | B fetches the message for processing, it is automatically moved into the |
|
- | 88 | dispatched call queue. After the server decides to answer the message, |
|
- | 89 | it is removed from dispatched queue and the result is moved into the |
|
- | 90 | answer queue of task A.</para> |
|
- | 91 | ||
64 | <para>The arguments contained in the message are completely arbitrary |
92 | <para>The arguments contained in the message are completely arbitrary |
65 | and decided by the user. The low level part of kernel IPC fills in |
93 | and decided by the user. The low level part of kernel IPC fills in |
66 | appropriate error codes if there is an error during communication. It is |
94 | appropriate error codes if there is an error during communication. It is |
67 | ensured that the applications are correctly notified about communication |
95 | assured that the applications are correctly notified about communication |
68 | state. If the outgoing connection is closed with the hangup message, the |
96 | state. If a program closes the outgoing connection, the target answerbox |
69 | target answerbox receives a hangup message. The connection |
97 | receives a hangup message. The connection identification is not reused, |
70 | identification is not reused, until the hangup message is acknowledged |
98 | until the hangup message is acknowledged and all other pending messages |
71 | and all other pending messages are answered.</para> |
99 | are answered.</para> |
72 | 100 | ||
73 | <para>If the server side decides to hangup an incoming connection, it |
101 | <para>Closing an incoming connection is done by responding to any |
74 | does it by responding to any incoming message with an EHANGUP error |
102 | incoming message with an EHANGUP error code. The connection is then |
75 | code. The connection is then immediately closed. The client connection |
103 | immediately closed. The client connection identification (phone id) is |
76 | identification (phone id) is not reused, until the client issues hangup |
104 | not reused, until the client issues closes it's own side of the |
77 | system call to close the outgoing connection.</para> |
105 | connection ("hangs his phone up").</para> |
78 | 106 | ||
79 | <para>When a task dies (whether voluntarily or by being killes), cleanup |
107 | <para>When a task dies (whether voluntarily or by being killed), cleanup |
80 | process is started which performs following tasks.</para> |
108 | process is started. </para> |
81 | 109 | ||
82 | <orderedlist> |
110 | <orderedlist> |
83 | <listitem> |
111 | <listitem> |
84 | <para>Hangs up all outgoing connections and sends hangup messages to |
112 | <para>Hangs up all outgoing connections and sends hangup messages to |
85 | all target answerboxes.</para> |
113 | all target answerboxes.</para> |
86 | </listitem> |
114 | </listitem> |
87 | 115 | ||
88 | <listitem> |
116 | <listitem> |
89 | <para>Disconnects all incoming connections.</para> |
117 | <para>Disconnects all incoming connections.</para> |
90 | </listitem> |
118 | </listitem> |
91 | 119 | ||
92 | <listitem> |
120 | <listitem> |
93 | <para>Disconnects from all notification channels.</para> |
121 | <para>Disconnects from all notification channels.</para> |
94 | </listitem> |
122 | </listitem> |
95 | 123 | ||
96 | <listitem> |
124 | <listitem> |
97 | <para>Answers all unanswered messages from answerbox queues with |
125 | <para>Answers all unanswered messages from answerbox queues with |
98 | appropriate error code.</para> |
126 | appropriate error code.</para> |
99 | </listitem> |
127 | </listitem> |
100 | 128 | ||
101 | <listitem> |
129 | <listitem> |
102 | <para>Waits until all outgoing messages are answered and all |
130 | <para>Waits until all outgoing messages are answered and all |
103 | remaining answerbox queues are empty.</para> |
131 | remaining answerbox queues are empty.</para> |
104 | </listitem> |
132 | </listitem> |
105 | </orderedlist> |
133 | </orderedlist> |
106 | </section> |
134 | </section> |
107 | 135 | ||
108 | <section> |
136 | <section> |
109 | <title>System call IPC layer</title> |
137 | <title>System call IPC layer</title> |
110 | 138 | ||
111 | <para>On top of this simple protocol the kernel provides special |
139 | <para>On top of this simple protocol the kernel provides special |
112 | services closely related to the inter-process communication. A range of |
140 | services closely related to the inter-process communication. A range of |
113 | method numbers is allocated and protocol is defined for these functions. |
141 | method numbers is allocated and protocol is defined for these functions. |
114 | The messages are interpreted by the kernel layer and appropriate actions |
142 | The messages are interpreted by the kernel layer and appropriate actions |
115 | are taken depending on the parameters of message and answer. </para> |
143 | are taken depending on the parameters of message and answer. </para> |
116 | 144 | ||
117 | <para>The kernel provides the following services:</para> |
145 | <para>The kernel provides the following services:</para> |
118 | 146 | ||
119 | <itemizedlist> |
147 | <itemizedlist> |
120 | <listitem> |
148 | <listitem> |
121 | <para>Creating new outgoing connection</para> |
149 | <para>Creating new outgoing connection</para> |
122 | </listitem> |
150 | </listitem> |
123 | 151 | ||
124 | <listitem> |
152 | <listitem> |
125 | <para>Creating a callback connection</para> |
153 | <para>Creating a callback connection</para> |
126 | </listitem> |
154 | </listitem> |
127 | 155 | ||
128 | <listitem> |
156 | <listitem> |
129 | <para>Sending an address space area</para> |
157 | <para>Sending an address space area</para> |
130 | </listitem> |
158 | </listitem> |
131 | 159 | ||
132 | <listitem> |
160 | <listitem> |
133 | <para>Asking for an address space area</para> |
161 | <para>Asking for an address space area</para> |
134 | </listitem> |
162 | </listitem> |
135 | </itemizedlist> |
163 | </itemizedlist> |
- | 164 | ||
- | 165 | <para>On startup every task is automatically connected to a |
|
- | 166 | <emphasis>name service task</emphasis>, which provides a switchboard |
|
- | 167 | functionality. To open a new outgoing connection, the client sends a |
|
- | 168 | <constant>CONNECT_ME_TO</constant> message using any of his phones. If |
|
- | 169 | the recepient of this message answers with an accepting answer, a new |
|
- | 170 | connection is created. In itself, this mechanism would allow only |
|
- | 171 | duplicating existing connection. However, if the message is forwarded, |
|
- | 172 | the new connection is made to the final recipient. </para> |
|
- | 173 | ||
- | 174 | <para>On startup every task is automatically connect to the name service |
|
- | 175 | task, which acts as a switchboard and forwards requests for connection |
|
- | 176 | to specific services. To be able to forward a message it must have a |
|
- | 177 | phone connected to the service tasks. The task creates this connection |
|
- | 178 | using a <constant>CONNECT_TO_ME</constant> message which creates a |
|
- | 179 | callback connection. Every service that wants to receive connections |
|
- | 180 | asks name service task to create a callback connection.</para> |
|
- | 181 | ||
- | 182 | <para>Tasks can share their address space areas using IPC messages. The |
|
- | 183 | 2 message types - AS_AREA_SEND and AS_AREA_RECV are used for sending and |
|
- | 184 | receiving an address area respectively. The shared area can be accessed |
|
- | 185 | as soon as the message is acknowledged. </para> |
|
136 | </section> |
186 | </section> |
137 | </section> |
187 | </section> |
138 | 188 | ||
139 | <section> |
189 | <section> |
140 | <title>Userspace view</title> |
190 | <title>Userspace view</title> |
141 | 191 | ||
142 | <para>The conventional design of the asynchronous api seems to produce |
192 | <para>The conventional design of the asynchronous api seems to produce |
143 | applications with one event loop and several big switch statements. |
193 | applications with one event loop and several big switch statements. |
144 | However, by intensive utilization of user-space threads, it was possible |
194 | However, by intensive utilization of user-space threads, it was possible |
145 | to create an environment that is not necesarilly restricted to this type |
195 | to create an environment that is not necesarilly restricted to this type |
146 | of event-driven programming and allows for more fluent expression of |
196 | of event-driven programming and allows for more fluent expression of |
147 | application programs.</para> |
197 | application programs.</para> |
148 | 198 | ||
149 | <section> |
199 | <section> |
150 | <title>Single point of entry</title> |
200 | <title>Single point of entry</title> |
151 | 201 | ||
152 | <para>Each tasks is associated with only one answerbox. If a |
202 | <para>Each tasks is associated with only one answerbox. If a |
153 | multi-threaded application needs to communicate, it must be not only |
203 | multi-threaded application needs to communicate, it must be not only |
154 | able to send a message, but it should be able to retrieve the answer as |
204 | able to send a message, but it should be able to retrieve the answer as |
155 | well. If several threads pull messages from task answerbox, it is a |
205 | well. If several threads pull messages from task answerbox, it is a |
156 | matter of fortune, which thread receives which message. If a particular |
206 | matter of fortune, which thread receives which message. If a particular |
157 | thread needs to wait for a message answer, an idle |
207 | thread needs to wait for a message answer, an idle |
158 | <emphasis>manager</emphasis> task is found or a new one is created and |
208 | <emphasis>manager</emphasis> task is found or a new one is created and |
159 | control is transfered to this manager task. The manager tasks pops |
209 | control is transfered to this manager task. The manager tasks pops |
160 | messages from the answerbox and puts them into appropriate queues of |
210 | messages from the answerbox and puts them into appropriate queues of |
161 | running tasks. If a task waiting for a message is not running, the |
211 | running tasks. If a task waiting for a message is not running, the |
162 | control is transferred to it.</para> |
212 | control is transferred to it.</para> |
163 | 213 | ||
164 | <para>Very similar situation arises when a task decides to send a lot of |
214 | <para>Very similar situation arises when a task decides to send a lot of |
165 | messages and reaches kernel limit of asynchronous messages. In such |
215 | messages and reaches kernel limit of asynchronous messages. In such |
166 | situation 2 remedies are available - the userspace liberary can either |
216 | situation 2 remedies are available - the userspace liberary can either |
167 | cache the message locally and resend the message when some answers |
217 | cache the message locally and resend the message when some answers |
168 | arrive, or it can block the thread and let it go on only after the |
218 | arrive, or it can block the thread and let it go on only after the |
169 | message is finally sent to the kernel layer. With one exception HelenOS |
219 | message is finally sent to the kernel layer. With one exception HelenOS |
170 | uses the second approach - when the kernel responds that maximum limit |
220 | uses the second approach - when the kernel responds that maximum limit |
171 | of asynchronous messages was reached, control is transferred to manager |
221 | of asynchronous messages was reached, control is transferred to manager |
172 | thread. The manager thread then handles incoming replies and when space |
222 | thread. The manager thread then handles incoming replies and when space |
173 | is available, sends the message to kernel and resumes application thread |
223 | is available, sends the message to kernel and resumes application thread |
174 | execution.</para> |
224 | execution.</para> |
175 | 225 | ||
176 | <para>If a kernel notification is received, the servicing procedure is |
226 | <para>If a kernel notification is received, the servicing procedure is |
177 | run in the context of the manager thread. Although it wouldn't be |
227 | run in the context of the manager thread. Although it wouldn't be |
178 | impossible to allow recursive calling, it could potentially lead to an |
228 | impossible to allow recursive calling, it could potentially lead to an |
179 | explosion of manager threads. Thus, the kernel notification procedures |
229 | explosion of manager threads. Thus, the kernel notification procedures |
180 | are not allowed to wait for a message result, they can only answer |
230 | are not allowed to wait for a message result, they can only answer |
181 | messages and send new ones without waiting for their results. If the |
231 | messages and send new ones without waiting for their results. If the |
182 | kernel limit for outgoing messages is reached, the data is automatically |
232 | kernel limit for outgoing messages is reached, the data is automatically |
183 | cached within the application. This behaviour is enforced automatically |
233 | cached within the application. This behaviour is enforced automatically |
184 | and the decision making is hidden from developers view.</para> |
234 | and the decision making is hidden from developers view.</para> |
185 | </section> |
235 | </section> |
186 | 236 | ||
187 | <section> |
237 | <section> |
188 | <title>Synchronization problem</title> |
238 | <title>Ordering problem</title> |
189 | 239 | ||
190 | <para>Unfortunately, in the real world is is never so easy. E.g. if a |
240 | <para>Unfortunately, in the real world is is never so easy. E.g. if a |
191 | server handles incoming requests and as a part of it's response sends |
241 | server handles incoming requests and as a part of it's response sends |
192 | asynchronous messages, it can be easily prempted and other thread may |
242 | asynchronous messages, it can be easily prempted and other thread may |
193 | start intervening. This can happen even if the application utilizes only |
243 | start intervening. This can happen even if the application utilizes only |
194 | 1 kernel thread. Classical synchronization using semaphores is not |
244 | 1 kernel thread. Classical synchronization using semaphores is not |
195 | possible, as locking on them would block the thread completely and the |
245 | possible, as locking on them would block the thread completely and the |
196 | answer couldn't be ever processed. The IPC framework allows a developer |
246 | answer couldn't be ever processed. The IPC framework allows a developer |
197 | to specify, that the thread should not be preempted to any other thread |
247 | to specify, that the thread should not be preempted to any other thread |
198 | (except notification handlers) while still being able to queue messages |
248 | (except notification handlers) while still being able to queue messages |
199 | belonging to other threads and regain control when the answer |
249 | belonging to other threads and regain control when the answer |
200 | arrives.</para> |
250 | arrives.</para> |
201 | 251 | ||
202 | <para>This mechanism works transparently in multithreaded environment, |
252 | <para>This mechanism works transparently in multithreaded environment, |
203 | where classical locking mechanism (futexes) should be used. The IPC |
253 | where classical locking mechanism (futexes) should be used. The IPC |
204 | framework ensures that there will always be enough free threads to |
254 | framework ensures that there will always be enough free threads to |
205 | handle the threads requiring correct synchronization and allow the |
255 | handle the threads requiring correct synchronization and allow the |
206 | application to run more user-space threads inside the kernel threads |
256 | application to run more user-space threads inside the kernel threads |
207 | without the danger of locking all kernel threads in futexes.</para> |
257 | without the danger of locking all kernel threads in futexes.</para> |
208 | </section> |
258 | </section> |
209 | 259 | ||
210 | <section> |
260 | <section> |
211 | <title>The interface</title> |
261 | <title>The interface</title> |
212 | 262 | ||
213 | <para></para> |
263 | <para></para> |
214 | </section> |
264 | </section> |
215 | </section> |
265 | </section> |
216 | </chapter> |
266 | </chapter> |