Rev | Author | Line No. | Line |
---|---|---|---|
9 | bondari | 1 | <?xml version="1.0" encoding="UTF-8"?> |
85 | palkovsky | 2 | <chapter id="ipc"> |
3 | <?dbhtml filename="ipc.html"?> |
||
9 | bondari | 4 | |
85 | palkovsky | 5 | <title>IPC</title> |
9 | bondari | 6 | |
85 | palkovsky | 7 | <para>Due to the high intertask communication traffic, IPC becomes a critical |
8 | subsystem for microkernels, putting high demands on the speed, latency and |
||
9 | reliability of the IPC model and implementation. Although theoretically the use |
||
10 | of an asynchronous messaging system looks promising, it is not often |
||
11 | implemented because it complicates the implementation of end-user |
||
12 | applications. HelenOS implements a fully asynchronous messaging system, but |
||
13 | with a special layer that gives the application developer a reasonably |
||
14 | synchronous multithreaded environment sufficient to develop complex |
||
15 | protocols.</para> |
||
38 | bondari | 16 | |
85 | palkovsky | 17 | <section> |
18 | <title>Services provided by kernel</title> |
||
9 | bondari | 19 | |
85 | palkovsky | 20 | <para>Every message consists of 4 numeric arguments (32-bit or 64-bit, depending on |
21 | the platform), of which the first one is considered a |
||
22 | method number on message receipt and a return value on answer receipt. The |
||
23 | received message contains the identification of the incoming connection, so |
||
99 | palkovsky | 24 | that the receiving application can distinguish messages from |
25 | different senders. Internally, the message contains a pointer to the |
||
26 | originating task and to the source of the communication channel. If the |
||
27 | message is forwarded, the originating task identifies the recipient of the |
||
28 | answer, while the source channel identifies the connection in case of a hangup |
||
29 | response.</para> |
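
The message layout described above can be illustrated with a small toy model. This is only a sketch in Python, not the HelenOS data structure: the class name `Message`, the field `phone_id` and the helper `make_answer` are hypothetical names chosen for illustration. It shows how the first of the four arguments is read as a method number on a request and as a return value on an answer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Message:
    """Toy model of an IPC message: exactly four machine-word arguments."""
    args: Tuple[int, int, int, int]
    phone_id: int  # hypothetical field: identifies the incoming connection

    @property
    def method(self) -> int:
        # On message receipt the first argument is read as the method number.
        return self.args[0]

    @property
    def retval(self) -> int:
        # On answer receipt the same slot is read as the return value.
        return self.args[0]


def make_answer(request: Message, retval: int, *rest: int) -> Message:
    """Build an answer that keeps the request's connection identification."""
    padded = (tuple(rest) + (0, 0, 0))[:3]  # fill unused arguments with zeros
    return Message((retval,) + padded, request.phone_id)


request = Message((7, 10, 20, 30), phone_id=3)
answer = make_answer(request, 0, 99)
```

The same four-slot layout serves both directions, so only the interpretation of the first slot changes between a request and its answer.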
||
85 | palkovsky | 30 | |
31 | <para>Every message must be eventually answered. The system keeps track of |
||
32 | all messages, so that it can answer them with an appropriate error code |
||
33 | should one of the connection parties fail unexpectedly. To limit buffering |
||
99 | palkovsky | 34 | of the messages in the kernel, every process has a limited quota of |
35 | asynchronous messages it can send simultaneously. If the limit is reached, |
||
36 | the kernel refuses to send any further message until some active message is |
||
37 | answered.</para> |
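
The limit on unanswered asynchronous messages can be sketched as a simple counter. The following Python model is an assumption-laden illustration, not kernel code: the names `AsyncQuota` and `QuotaExceeded` and the limit value of 2 are invented for the example.

```python
class QuotaExceeded(Exception):
    """Raised when the sender already has too many unanswered messages."""


class AsyncQuota:
    def __init__(self, limit: int) -> None:
        self.limit = limit   # hypothetical per-process limit
        self.active = 0      # messages sent but not yet answered

    def send(self) -> None:
        if self.active >= self.limit:
            raise QuotaExceeded("limit of active asynchronous messages reached")
        self.active += 1

    def answered(self) -> None:
        # Every message is eventually answered, which frees one slot.
        if self.active == 0:
            raise RuntimeError("answer received for no active message")
        self.active -= 1


quota = AsyncQuota(limit=2)
quota.send()
quota.send()
try:
    quota.send()          # third send exceeds the limit
    refused = False
except QuotaExceeded:
    refused = True
quota.answered()
quota.send()              # succeeds again once one message was answered
```

Because every message must eventually be answered, a refused sender is guaranteed to regain a slot at some point.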
||
85 | palkovsky | 38 | |
99 | palkovsky | 39 | <para>To facilitate kernel-to-user communication, the IPC subsystem |
40 | provides notification messages. The applications can subscribe to a |
||
41 | notification channel and receive messages directed to this channel. Such |
||
42 | messages can be freely sent even from interrupt context as they are |
||
43 | primarily destined to deliver IRQ events to userspace device drivers. |
||
44 | These messages need not be answered, as there is no party that could receive |
||
45 | such a response.</para> |
||
46 | |||
85 | palkovsky | 47 | <section> |
48 | <title>Low level IPC</title> |
||
49 | |||
50 | <para>The whole IPC subsystem consists of one-way communication |
||
51 | channels. Each task has one associated message queue (answerbox). The |
||
52 | task can open connections (identified by phone id) to other tasks, send |
||
53 | and forward messages through these connections and answer received |
||
54 | messages. Every sent message is identified by a unique number, so that |
||
55 | the response can be later matched against it. The message is sent over |
||
56 | the phone to the target answerbox. A server application periodically |
||
57 | checks the answerbox and pulls messages from several queues associated |
||
58 | with it. After completing the requested action, the server sends a reply |
||
99 | palkovsky | 59 | back to the answerbox of the originating task. If the need arises, it is |
60 | possible to <emphasis>forward</emphasis> a received message through any |
||
61 | of the open phones to another task. This mechanism is used e.g. for |
||
62 | opening new connections.</para> |
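
The interplay of answerboxes, phones, call identifiers and forwarding described above can be modelled in a few lines. This Python sketch is a toy under stated assumptions: the `Task` class, its methods and the task names (`client`, `naming`, `server`) are hypothetical and do not correspond to the kernel interface. It only shows that a forwarded call is still answered to the originating task and matched there by its unique number.

```python
from collections import deque
from itertools import count


class Task:
    """Toy task with a single answerbox and a table of outgoing phones."""
    _call_ids = count(1)  # unique number for every sent message

    def __init__(self, name):
        self.name = name
        self.answerbox = deque()   # incoming calls and answers
        self.phones = {}           # phone id -> target Task
        self.pending = {}          # call id -> method, awaiting an answer

    def connect(self, target):
        phone_id = len(self.phones)
        self.phones[phone_id] = target
        return phone_id

    def call(self, phone_id, method, *args):
        call_id = next(Task._call_ids)
        self.pending[call_id] = method
        target = self.phones[phone_id]
        target.answerbox.append(("call", call_id, self, method, args))
        return call_id

    def forward(self, call, phone_id):
        # The originating task is unchanged, so the answer goes back to it.
        self.phones[phone_id].answerbox.append(call)

    def answer(self, call, retval):
        _, call_id, origin, _, _ = call
        origin.answerbox.append(("answer", call_id, retval))

    def collect_answer(self):
        kind, call_id, retval = self.answerbox.popleft()
        assert kind == "answer"
        del self.pending[call_id]  # match the answer against the sent call
        return call_id, retval


client, naming, server = Task("client"), Task("naming"), Task("server")
to_naming = client.connect(naming)
to_server = naming.connect(server)

cid = client.call(to_naming, 1, 42)
naming.forward(naming.answerbox.popleft(), to_server)  # pass the call on
request = server.answerbox.popleft()
server.answer(request, 0)
result = client.collect_answer()
```

The intermediate task never sees the answer: it only relays the call, which is the property that makes forwarding usable for opening new connections.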
||
85 | palkovsky | 63 | |
99 | palkovsky | 64 | <para>The arguments contained in the message are completely arbitrary |
65 | and decided by the user. The low level part of kernel IPC fills in |
||
66 | appropriate error codes if there is an error during communication. It is |
||
67 | ensured that the applications are correctly notified about communication |
||
68 | state. If the outgoing connection is closed with the hangup message, the |
||
69 | target answerbox receives a hangup message. The connection |
||
70 | identification is not reused until the hangup message is acknowledged |
||
71 | and all other pending messages are answered.</para> |
||
72 | |||
73 | <para>If the server side decides to hang up an incoming connection, it |
||
74 | does it by responding to any incoming message with an EHANGUP error |
||
75 | code. The connection is then immediately closed. The client connection |
||
76 | identification (phone id) is not reused until the client issues the hangup |
||
77 | system call to close the outgoing connection.</para> |
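
The rule that a phone id stays reserved after a server-side hangup can be sketched as follows. This is a toy Python model, not the kernel's phone table: the class `PhoneTable`, the state strings and the numeric value given to `EHANGUP` are assumptions made for the example (only the name `EHANGUP` comes from the text).

```python
EHANGUP = -1  # hypothetical numeric value; only the name comes from the text


class PhoneTable:
    def __init__(self, size):
        self.slots = [None] * size   # None = free, otherwise a state string

    def connect(self):
        phone_id = self.slots.index(None)   # lowest free phone id
        self.slots[phone_id] = "connected"
        return phone_id

    def on_answer(self, phone_id, retval):
        if retval == EHANGUP:
            # The server closed the connection; the id stays reserved.
            self.slots[phone_id] = "closed-by-server"

    def hangup(self, phone_id):
        self.slots[phone_id] = None         # only now may the id be reused


phones = PhoneTable(size=2)
first = phones.connect()
phones.on_answer(first, EHANGUP)
second = phones.connect()      # cannot reuse `first` yet
phones.hangup(first)
third = phones.connect()       # `first` is free again
```

Keeping the id reserved until the client's own hangup prevents a stale phone id in client code from silently addressing a different, newer connection.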
||
78 | |||
79 | <para>When a task dies (whether voluntarily or by being killed), a cleanup |
||
80 | process is started, which performs the following steps:</para> |
||
81 | |||
82 | <orderedlist> |
||
83 | <listitem> |
||
84 | <para>Hangs up all outgoing connections and sends hangup messages to |
||
85 | all target answerboxes.</para> |
||
86 | </listitem> |
||
87 | |||
88 | <listitem> |
||
89 | <para>Disconnects all incoming connections.</para> |
||
90 | </listitem> |
||
91 | |||
92 | <listitem> |
||
93 | <para>Disconnects from all notification channels.</para> |
||
94 | </listitem> |
||
95 | |||
96 | <listitem> |
||
97 | <para>Answers all unanswered messages from answerbox queues with |
||
98 | an appropriate error code.</para> |
||
99 | </listitem> |
||
100 | |||
101 | <listitem> |
||
102 | <para>Waits until all outgoing messages are answered and all |
||
103 | remaining answerbox queues are empty.</para> |
||
104 | </listitem> |
||
105 | </orderedlist> |
||
85 | palkovsky | 106 | </section> |
107 | |||
108 | <section> |
||
99 | palkovsky | 109 | <title>System call IPC layer</title> |
85 | palkovsky | 110 | |
111 | <para>On top of this simple protocol the kernel provides special |
||
99 | palkovsky | 112 | services closely related to the inter-process communication. A range of |
113 | method numbers is allocated and a protocol is defined for these functions. |
||
114 | The messages are interpreted by the kernel layer and appropriate actions |
||
115 | are taken depending on the parameters of the message and the answer.</para> |
||
116 | |||
117 | <para>The kernel provides the following services:</para> |
||
118 | |||
119 | <itemizedlist> |
||
120 | <listitem> |
||
121 | <para>Creating a new outgoing connection</para> |
||
122 | </listitem> |
||
123 | |||
124 | <listitem> |
||
125 | <para>Creating a callback connection</para> |
||
126 | </listitem> |
||
127 | |||
128 | <listitem> |
||
129 | <para>Sending an address space area</para> |
||
130 | </listitem> |
||
131 | |||
132 | <listitem> |
||
133 | <para>Asking for an address space area</para> |
||
134 | </listitem> |
||
135 | </itemizedlist> |
||
85 | palkovsky | 136 | </section> |
137 | </section> |
||
138 | |||
139 | <section> |
||
140 | <title>Userspace view</title> |
||
141 | |||
142 | <para>The conventional design of the asynchronous API seems to produce |
||
143 | applications with one event loop and several big switch statements. |
||
144 | However, by intensive utilization of user-space threads, it was possible |
||
145 | to create an environment that is not necessarily restricted to this type |
||
146 | of event-driven programming and allows for more fluent expression of |
||
99 | palkovsky | 147 | application programs.</para> |
85 | palkovsky | 148 | |
149 | <section> |
||
150 | <title>Single point of entry</title> |
||
151 | |||
152 | <para>Each task is associated with only one answerbox. If a |
||
153 | multi-threaded application needs to communicate, it must be able not only |
||
154 | to send a message but also to retrieve the answer. |
||
155 | If several threads pull messages from the task answerbox, it is a |
||
156 | matter of chance which thread receives which message. If a particular |
||
157 | thread needs to wait for a message answer, an idle |
||
158 | <emphasis>manager</emphasis> task is found or a new one is created and |
||
159 | control is transferred to this manager task. The manager task pops |
||
160 | messages from the answerbox and puts them into appropriate queues of |
||
161 | running tasks. If a task waiting for a message is not running, the |
||
99 | palkovsky | 162 | control is transferred to it.</para> |
85 | palkovsky | 163 | |
164 | <para>A very similar situation arises when a task decides to send a lot of |
||
165 | messages and reaches the kernel limit of asynchronous messages. In such a |
||
166 | situation two remedies are available: the userspace library can either |
||
167 | cache the message locally and resend the message when some answers |
||
168 | arrive, or it can block the thread and let it go on only after the |
||
169 | message is finally sent to the kernel layer. With one exception, HelenOS |
||
170 | uses the second approach: when the kernel responds that the maximum limit |
||
171 | of asynchronous messages was reached, control is transferred to the manager |
||
172 | thread. The manager thread then handles incoming replies and, when space |
||
173 | is available, sends the message to the kernel and resumes application thread |
||
174 | execution.</para> |
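
The second approach, blocking the sender and handing control to a manager, can be sketched with Python generators standing in for user-space threads. This is a simplified illustration, not the HelenOS library: `Kernel`, `app_thread`, `manager` and the `LIMIT` value are hypothetical, and unlike the description above the resumed thread retries the send itself instead of having the manager send on its behalf.

```python
LIMIT = 1  # hypothetical kernel limit of unanswered asynchronous messages


class Kernel:
    def __init__(self):
        self.active = 0
        self.sent = []

    def try_send(self, msg):
        if self.active >= LIMIT:
            return False          # kernel refuses: limit reached
        self.active += 1
        self.sent.append(msg)
        return True

    def deliver_answer(self):
        self.active -= 1          # an answer frees one slot


def app_thread(kernel, messages, log):
    """User-space thread: yields to the manager whenever a send is refused."""
    for msg in messages:
        while not kernel.try_send(msg):
            log.append(f"blocked on {msg}")
            yield                 # transfer control to the manager
    log.append("done")


def manager(kernel, thread, log):
    """Manager: handles a reply, then resumes the blocked thread."""
    for _ in thread:              # each iteration means the thread is blocked
        kernel.deliver_answer()   # process one incoming reply
        log.append("manager handled reply")


kernel, log = Kernel(), []
manager(kernel, app_thread(kernel, ["a", "b"], log), log)
```

From the application thread's point of view the send simply takes longer; the refusal and the retry are hidden behind the switch to the manager.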
||
175 | |||
176 | <para>If a kernel notification is received, the servicing procedure is |
||
177 | run in the context of the manager thread. Although it wouldn't be |
||
178 | impossible to allow recursive calling, it could potentially lead to an |
||
179 | explosion of manager threads. Thus, the kernel notification procedures |
||
180 | are not allowed to wait for a message result; they can only answer |
||
181 | messages and send new ones without waiting for their results. If the |
||
182 | kernel limit for outgoing messages is reached, the data is automatically |
||
183 | cached within the application. This behaviour is enforced automatically |
||
184 | and the decision making is hidden from the developer's view.</para> |
||
185 | </section> |
||
186 | |||
187 | <section> |
||
188 | <title>Synchronization problem</title> |
||
189 | |||
190 | <para>Unfortunately, in the real world it is never so easy. For example, if a |
||
191 | server handles incoming requests and as a part of its response sends |
||
192 | asynchronous messages, it can easily be preempted and another thread may |
||
193 | start intervening. This can happen even if the application utilizes only |
||
194 | 1 kernel thread. Classical synchronization using semaphores is not |
||
195 | possible, as locking on them would block the thread completely and the |
||
196 | answer could never be processed. The IPC framework allows a developer |
||
197 | to specify that the thread should not be preempted by any other thread |
||
198 | (except notification handlers) while still being able to queue messages |
||
99 | palkovsky | 199 | belonging to other threads and regain control when the answer |
200 | arrives.</para> |
||
85 | palkovsky | 201 | |
202 | <para>This mechanism works transparently in a multithreaded environment, |
||
203 | where a classical locking mechanism (futexes) should be used. The IPC |
||
204 | framework ensures that there will always be enough free threads to |
||
205 | handle the threads requiring correct synchronization and allow the |
||
206 | application to run more user-space threads inside the kernel threads |
||
207 | without the danger of locking all kernel threads in futexes.</para> |
||
208 | </section> |
||
209 | |||
210 | <section> |
||
211 | <title>The interface</title> |
||
212 | |||
213 | <para></para> |
||
214 | </section> |
||
215 | </section> |
||
216 | </chapter> |