<?xml version="1.0" encoding="UTF-8"?>
<chapter id="mm">
  <?dbhtml filename="mm.html"?>

  <title>Memory management</title>

  <para>In previous chapters, this book described the scheduling subsystem as
  the creator of the impression that threads execute in parallel. The memory
  management subsystem, on the other hand, creates the impression that there
  is enough physical memory for the kernel and that userspace tasks have the
  entire address space only for themselves.</para>

  <section>
    <title>Physical memory management</title>

    <section id="zones_and_frames">
      <title>Zones and frames</title>

      <para>HelenOS represents contiguous areas of physical memory in
      structures called frame zones (abbreviated as zones). Each zone contains
      information about the number of allocated and unallocated physical
      memory frames as well as the physical base address of the zone and the
      number of frames contained in it. A zone also contains an array of frame
      structures describing each frame of the zone and, last but not least,
      each zone is equipped with a buddy system that facilitates effective
      allocation of power-of-two sized blocks of frames.</para>

      <para>This organization of physical memory provides good preconditions
      for hot-plugging of more zones. There is also one currently unused zone
      attribute: <code>flags</code>. The attribute could be used to give a
      special meaning to some zones in the future.</para>

      <para>The zones are linked in a doubly-linked list. This might seem a
      bit inefficient because the zone list is walked every time a frame is
      allocated or deallocated. However, this does not represent a significant
      performance problem as the number of zones is expected to be rather low.
      Moreover, most architectures merge all zones into one.</para>

      <para>For each physical memory frame found in a zone, there is a frame
      structure that contains the number of references to the frame and data
      used by the buddy system.</para>
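The zone and frame bookkeeping described above can be sketched in C along the following lines. This is a simplified illustration; the field names, the <code>FRAME_SIZE</code> constant and the helper function are assumptions for illustration, not the actual HelenOS definitions.

```c
/* Simplified, illustrative sketch of the zone and frame structures
 * described above; names are assumptions, not HelenOS code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE 4096          /* hypothetical frame size of 4 KiB */

typedef struct frame {
    unsigned int refcount;       /* number of references to this frame */
    uint8_t buddy_order;         /* order of the buddy block this frame heads */
} frame_t;

typedef struct zone {
    uintptr_t base;              /* physical base address of the zone */
    size_t frame_count;          /* total number of frames in the zone */
    size_t free_count;           /* unallocated frames */
    size_t busy_count;           /* allocated frames */
    int flags;                   /* currently unused; reserved for the future */
    frame_t *frames;             /* array describing each frame of the zone */
    struct zone *prev, *next;    /* doubly-linked list of zones */
} zone_t;

/* Translate a physical address inside the zone to its frame structure. */
static frame_t *zone_addr_to_frame(zone_t *z, uintptr_t addr)
{
    size_t index = (addr - z->base) / FRAME_SIZE;
    assert(index < z->frame_count);
    return &z->frames[index];
}
```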
    </section>

    <section id="frame_allocator">
      <title>Frame allocator</title>

      <para>The frame allocator satisfies kernel requests to allocate
      power-of-two sized blocks of physical memory. Because of the zonal
      organization of physical memory, the frame allocator always works
      within the context of some frame zone. In order to carry out the
      allocation requests, the frame allocator is tightly integrated with the
      buddy system belonging to the zone. The frame allocator is also
      responsible for updating information about the number of free and busy
      frames in the zone. <figure>
          <mediaobject id="frame_alloc">
            <imageobject role="html">
              <imagedata fileref="images/frame_alloc.png" format="PNG" />
            </imageobject>

            <imageobject role="fop">
              <imagedata fileref="images.vector/frame_alloc.svg" format="SVG" />
            </imageobject>
          </mediaobject>

          <title>Frame allocator scheme.</title>
        </figure></para>

      <formalpara>
        <title>Allocation / deallocation</title>

        <para>Upon an allocation request via the function <code>frame_alloc</code>,
        the frame allocator first tries to find a zone that can satisfy the
        request (i.e. has the required number of free frames). Once a suitable
        zone is found, the frame allocator uses the buddy allocator on the
        zone's buddy system to perform the allocation. During deallocation,
        which is triggered by a call to <code>frame_free</code>, the frame
        allocator looks up the zone that contains the frame being
        deallocated. Afterwards, it calls the buddy allocator again, this time
        to take care of deallocation within the zone's buddy system.</para>
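The zone-selection step described above might look roughly like this; the types and the <code>find_zone</code> helper are illustrative assumptions, not the HelenOS API.

```c
/* Illustrative sketch of the zone-selection step of an allocation
 * request: walk the zone list and pick the first zone with enough
 * free frames. Names and types are assumptions, not HelenOS code. */
#include <assert.h>
#include <stddef.h>

typedef struct zone {
    size_t free_count;           /* number of unallocated frames */
    struct zone *next;           /* next zone in the list */
} zone_t;

/* Return the first zone that can satisfy a request for 2^order frames,
 * or NULL if no zone qualifies. */
static zone_t *find_zone(zone_t *head, unsigned int order)
{
    size_t needed = (size_t) 1 << order;
    for (zone_t *z = head; z != NULL; z = z->next) {
        if (z->free_count >= needed)
            return z;
    }
    return NULL;
}
```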
      </formalpara>
    </section>

    <section id="buddy_allocator">
      <title>Buddy allocator</title>

      <para>In the buddy system, the memory is broken down into power-of-two
      sized naturally aligned blocks. These blocks are organized in an array
      of lists, in which the list with index <mathphrase>i</mathphrase>
      contains all unallocated blocks of size
      <mathphrase>2<superscript>i</superscript></mathphrase>. The index
      <mathphrase>i</mathphrase> is called the order of the block. Should
      there be two adjacent equally sized blocks in the list
      <mathphrase>i</mathphrase> (i.e. buddies), the buddy allocator would
      coalesce them and put the resulting block in the list
      <mathphrase>i + 1</mathphrase>, provided that the resulting block would
      be naturally aligned. Similarly, when the allocator is asked to
      allocate a block of size
      <mathphrase>2<superscript>i</superscript></mathphrase>, it first tries
      to satisfy the request from the list with index
      <mathphrase>i</mathphrase>. If the request cannot be satisfied (i.e. the
      list <mathphrase>i</mathphrase> is empty), the buddy allocator will try
      to allocate and split a larger block from the list with index
      <mathphrase>i + 1</mathphrase>. Both of these algorithms are recursive.
      The recursion ends either when there are no blocks to coalesce in the
      former case or when there are no blocks that can be split in the latter
      case.</para>
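The split and coalesce algorithms above can be sketched with a small runnable model. The arena size and data layout are illustrative assumptions, not HelenOS code; a real implementation keeps the blocks in intrusive lists rather than index stacks.

```c
/* Minimal sketch of buddy split/coalesce over a tiny arena of
 * 2^MAX_ORDER frames, identified by their start indices. */
#include <stddef.h>

#define MAX_ORDER 4                  /* arena of 2^4 = 16 frames */
#define NO_BLOCK  ((size_t) -1)

/* free_list[i] holds start indices of free blocks of size 2^i. */
static size_t free_list[MAX_ORDER + 1][1 << MAX_ORDER];
static size_t free_count[MAX_ORDER + 1];

static void push_block(unsigned order, size_t start)
{
    free_list[order][free_count[order]++] = start;
}

static size_t pop_block(unsigned order)
{
    return free_count[order] ? free_list[order][--free_count[order]] : NO_BLOCK;
}

/* Remove a specific block from a free list; return 1 on success. */
static int remove_block(unsigned order, size_t start)
{
    for (size_t i = 0; i < free_count[order]; i++) {
        if (free_list[order][i] == start) {
            free_list[order][i] = free_list[order][--free_count[order]];
            return 1;
        }
    }
    return 0;
}

void buddy_init(void)
{
    for (unsigned i = 0; i <= MAX_ORDER; i++)
        free_count[i] = 0;
    push_block(MAX_ORDER, 0);        /* one block spanning the whole arena */
}

/* Allocate a naturally aligned block of 2^order frames, recursively
 * splitting a larger block when the list of this order is empty. */
size_t buddy_alloc(unsigned order)
{
    size_t start = pop_block(order);
    if (start != NO_BLOCK)
        return start;
    if (order == MAX_ORDER)
        return NO_BLOCK;             /* nothing left to split */
    size_t big = buddy_alloc(order + 1);
    if (big == NO_BLOCK)
        return NO_BLOCK;
    push_block(order, big + ((size_t) 1 << order)); /* upper half stays free */
    return big;                      /* lower half satisfies the request */
}

/* Free a block and coalesce it with its buddy recursively. The buddy of
 * a naturally aligned block is found by flipping bit `order` of its index. */
void buddy_free(size_t start, unsigned order)
{
    while (order < MAX_ORDER) {
        size_t buddy = start ^ ((size_t) 1 << order);
        if (!remove_block(order, buddy))
            break;                   /* buddy not free: no coalescing */
        if (buddy < start)
            start = buddy;           /* coalesced block starts at lower half */
        order++;
    }
    push_block(order, start);
}
```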

      <para>This approach greatly reduces external fragmentation of memory and
      helps in allocating bigger contiguous blocks of memory aligned to their
      size. On the other hand, the buddy allocator suffers from increased
      internal fragmentation of memory and is not suitable for general kernel
      allocations. This purpose is better addressed by the <link
      linkend="slab">slab allocator</link>.<figure>
          <mediaobject id="buddy_alloc">
            <imageobject role="html">
              <imagedata fileref="images/buddy_alloc.png" format="PNG" />
            </imageobject>

            <imageobject role="fop">
              <imagedata fileref="images.vector/buddy_alloc.svg" format="SVG" />
            </imageobject>
          </mediaobject>

          <title>Buddy system scheme.</title>
        </figure></para>

      <section>
        <title>Implementation</title>

        <para>The buddy allocator is, in fact, an abstract framework which can
        be easily specialized to serve one particular task. It knows nothing
        about the nature of the memory it helps to allocate. To overcome this
        lack of knowledge, the buddy allocator exports an interface that
        each of its clients is required to implement. When supplied with an
        implementation of this interface, the buddy allocator can use
        specialized external functions to find a buddy for a block, split and
        coalesce blocks, manipulate block order and mark blocks busy or
        available.</para>
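The exported client interface could be modelled as a structure of function pointers along these lines. The names and signatures are assumptions for illustration, not the actual HelenOS declarations; a toy client implementing the order operations follows.

```c
/* Illustrative sketch of the client-implemented interface the buddy
 * allocator framework exports. Names are assumptions, not HelenOS code. */
#include <assert.h>
#include <stdint.h>

typedef struct buddy_system buddy_system_t;
typedef struct link link_t;              /* a generic list link */

typedef struct buddy_system_operations {
    /* Find the buddy of a given block, or return NULL. */
    link_t *(*find_buddy)(buddy_system_t *b, link_t *block);
    /* Split a block and return its second half (order - 1). */
    link_t *(*bisect)(buddy_system_t *b, link_t *block);
    /* Coalesce two buddies into one block of order + 1. */
    link_t *(*coalesce)(buddy_system_t *b, link_t *blk1, link_t *blk2);
    /* Read and write the order stored in a block. */
    uint8_t (*get_order)(buddy_system_t *b, link_t *block);
    void (*set_order)(buddy_system_t *b, link_t *block, uint8_t order);
    /* Mark a block busy or available. */
    void (*mark_busy)(buddy_system_t *b, link_t *block);
    void (*mark_available)(buddy_system_t *b, link_t *block);
} buddy_system_operations_t;

/* A toy client: blocks store their order directly in the link. */
struct link { uint8_t order; };
struct buddy_system { buddy_system_operations_t *ops; };

static uint8_t toy_get_order(buddy_system_t *b, link_t *blk)
{
    (void) b;
    return blk->order;
}

static void toy_set_order(buddy_system_t *b, link_t *blk, uint8_t order)
{
    (void) b;
    blk->order = order;
}
```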

        <formalpara>
          <title>Data organization</title>

          <para>Each entity allocable by the buddy allocator is required to
          contain space for storing the block order number and a link variable
          used to interconnect blocks within the same order.</para>

          <para>Whatever entities are allocated by the buddy allocator, the
          first entity within a block is used to represent the entire block.
          The first entity keeps the order of the whole block. Other entities
          within the block are assigned the magic value
          <constant>BUDDY_INNER_BLOCK</constant>. This is especially important
          for effective identification of buddies in a one-dimensional array
          because the entity that represents a potential buddy cannot be
          associated with <constant>BUDDY_INNER_BLOCK</constant> (i.e. if it
          is associated with <constant>BUDDY_INNER_BLOCK</constant>, then it
          is not a buddy).</para>
        </formalpara>
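Buddy identification in a one-dimensional frame array can be sketched as follows; the <code>BUDDY_INNER_BLOCK</code> value and the helper function are illustrative assumptions, not the HelenOS implementation.

```c
/* Sketch of buddy identification in a one-dimensional array: the buddy
 * candidate's index is the block index with bit `order` flipped, and a
 * frame marked BUDDY_INNER_BLOCK cannot head a block, hence cannot be a
 * buddy. Constants and layout are illustrative assumptions. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUDDY_INNER_BLOCK  0xff      /* magic order value for inner frames */
#define NO_BUDDY           ((size_t) -1)

/* frames[i] holds the order of the block starting at frame i, or
 * BUDDY_INNER_BLOCK if frame i lies inside a larger block. */
static size_t find_buddy_index(const uint8_t *frames, size_t count,
    size_t index, uint8_t order)
{
    size_t buddy = index ^ ((size_t) 1 << order);
    if (buddy >= count)
        return NO_BUDDY;
    /* An inner frame does not represent a block, so it is not a buddy. */
    if (frames[buddy] == BUDDY_INNER_BLOCK)
        return NO_BUDDY;
    /* A buddy must be a free block of the very same order. */
    if (frames[buddy] != order)
        return NO_BUDDY;
    return buddy;
}
```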
      </section>
    </section>

    <section id="slab">
      <title>Slab allocator</title>

      <para>The majority of memory allocation requests in the kernel are for
      small, frequently used data structures. The basic idea behind the slab
      allocator is that commonly used objects are preallocated in contiguous
      areas of physical memory called slabs<footnote>
          <para>Slabs are in fact blocks of physical memory frames allocated
          from the frame allocator.</para>
        </footnote>. Whenever an object is to be allocated, the slab allocator
      returns the first available item from a suitable slab corresponding to
      the object type<footnote>
          <para>The mechanism is rather more complicated, see the next
          paragraph.</para>
        </footnote>. Due to the fact that the sizes of the requested and
      allocated object match, the slab allocator significantly reduces
      internal fragmentation.</para>

      <para>Slabs of one object type are organized in a structure called a
      slab cache. There are usually more slabs in the slab cache, depending on
      previous allocations. If the slab cache runs out of available slabs, new
      slabs are allocated. In order to exploit parallelism and to avoid
      locking of shared spinlocks, slab caches can be equipped with CPU-private
      caches of objects called magazines. Each object begins its life in a
      slab. When it is allocated from there, the slab allocator calls a
      constructor that is registered in the respective slab cache. The
      constructor initializes and brings the object into a known state. The
      object is then used by the user. When the user later frees the object,
      the slab allocator puts it into a CPU-private magazine, from where it
      can be subsequently allocated again. Note that allocations satisfied
      from a magazine are already initialized by the constructor.</para>

      <para>Should HelenOS run short of memory, it would start deallocating
      objects from magazines, calling the slab cache destructor on them and
      putting them back into slabs. When a slab contains no allocated object,
      it is immediately freed.</para>
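The object life cycle described above (construct once in the slab, recycle through a magazine) can be sketched like this; all names and sizes are illustrative assumptions, not HelenOS code.

```c
/* Sketch of the slab object life cycle: an object is constructed once
 * when first taken from its slab, and a later free puts it, still
 * initialized, into a CPU-private magazine. Illustrative only. */
#include <assert.h>
#include <stddef.h>

#define SLAB_OBJECTS 4
#define MAG_SIZE     4

typedef struct {
    int initialized;                  /* set by the constructor */
} object_t;

typedef struct {
    object_t storage[SLAB_OBJECTS];   /* the slab: preallocated objects */
    size_t next_free;                 /* next never-allocated slot */
    object_t *magazine[MAG_SIZE];     /* CPU-private LIFO of freed objects */
    size_t mag_count;
    void (*constructor)(object_t *);  /* registered with the cache */
} slab_cache_t;

static void obj_ctor(object_t *obj)
{
    obj->initialized = 1;             /* bring the object into a known state */
}

object_t *slab_alloc(slab_cache_t *cache)
{
    /* Allocations satisfied from the magazine are already constructed. */
    if (cache->mag_count > 0)
        return cache->magazine[--cache->mag_count];
    if (cache->next_free >= SLAB_OBJECTS)
        return NULL;                  /* a real allocator would grow a new slab */
    object_t *obj = &cache->storage[cache->next_free++];
    cache->constructor(obj);          /* constructed only on first allocation */
    return obj;
}

void slab_free(slab_cache_t *cache, object_t *obj)
{
    assert(cache->mag_count < MAG_SIZE);
    cache->magazine[cache->mag_count++] = obj;  /* stays initialized */
}
```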

      <para><figure>
          <mediaobject id="slab_alloc">
            <imageobject role="html">
              <imagedata fileref="images/slab_alloc.png" format="PNG" />
            </imageobject>
          </mediaobject>

          <title>Slab allocator scheme.</title>
        </figure></para>

      <section>
        <title>Implementation</title>

        <para>The slab allocator is closely modelled after the <ulink
        url="http://www.usenix.org/events/usenix01/full_papers/bonwick/bonwick_html/">
        OpenSolaris slab allocator by Jeff Bonwick and Jonathan Adams</ulink>
        with the following exceptions:<itemizedlist>
          <listitem>
            empty magazines are deallocated when not needed
          </listitem>
        </itemizedlist> The following features are not currently supported but
        would be easy to implement: <itemizedlist>
          <listitem>
            cache coloring
          </listitem>

          <listitem>
            dynamic magazine growing (different magazine sizes are already
            supported, but the allocation strategy would need to be adjusted)
          </listitem>
        </itemizedlist></para>

        <section>
          <title>Magazine layer</title>

          <para>Due to the extensive bottleneck on SMP architectures caused by
          the global slab locking mechanism, which serializes processing of
          all slab allocation requests, a new layer was introduced into the
          classic slab allocator design. The slab allocator was extended to
          support per-CPU caches called magazines in order to achieve good SMP
          scaling. <termdef>The slab SMP performance bottleneck was resolved
          by introducing a per-CPU caching scheme called the
          <glossterm>magazine layer</glossterm></termdef>.</para>

          <para>A magazine is an N-element cache of objects, so each magazine
          can satisfy N allocations. A magazine behaves like an automatic
          weapon magazine (LIFO, stack), so allocation and deallocation become
          simple push and pop pointer operations. The trick is that the CPU
          does not access global slab allocator data during allocation from
          its magazine, thus making parallel allocations on multiple CPUs
          possible.</para>
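A magazine reduced to its essentials can be sketched as follows; the names and the capacity constant are illustrative assumptions, not the HelenOS API.

```c
/* Minimal sketch of a magazine: an N-element LIFO stack of object
 * pointers, so allocation and deallocation reduce to pointer push/pop. */
#include <stddef.h>

#define MAG_SIZE 8                   /* N: the capacity of one magazine */

typedef struct {
    void *objs[MAG_SIZE];
    size_t busy;                     /* number of cached objects */
} magazine_t;

/* Pop an object; NULL means the magazine is empty and must be reloaded. */
void *magazine_obj_get(magazine_t *mag)
{
    return mag->busy ? mag->objs[--mag->busy] : NULL;
}

/* Push a freed object; returns 0 when the magazine is full. */
int magazine_obj_put(magazine_t *mag, void *obj)
{
    if (mag->busy == MAG_SIZE)
        return 0;
    mag->objs[mag->busy++] = obj;
    return 1;
}
```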

          <para>The implementation also requires another feature: the
          CPU-bound magazine is actually a pair of magazines in order to avoid
          thrashing when allocations and deallocations of a single item occur
          right at the magazine size boundary. LIFO order is enforced, which
          should avoid fragmentation as much as possible.</para>

          <para>Another important entity of the magazine layer is the common
          full magazine list (also called a depot), which stores full
          magazines that may be used by any of the CPU magazine caches to
          reload the active CPU magazine. This list of magazines can be
          pre-filled with full magazines during initialization, but in the
          current implementation it is filled during object deallocation, when
          a CPU magazine becomes full.</para>

          <para>Slab allocator control structures are allocated from special
          slabs that are marked by a special flag indicating that they should
          not be used for the slab magazine layer. This is done to avoid
          possible infinite recursion and deadlock during conventional slab
          allocation requests.</para>
        </section>

        <section>
          <title>Allocation/deallocation</title>

          <para>Every cache contains a list of full slabs and a list of
          partially full slabs. Empty slabs are immediately freed (thrashing
          will be avoided because of magazines).</para>

          <para>The slab allocator allocates a lot of space and does not free
          it. When the frame allocator fails to allocate a frame, it calls
          <code>slab_reclaim()</code>. It tries 'light reclaim' first, then
          brutal reclaim. The light reclaim releases slabs from the CPU-shared
          magazine list (the depot) until at least one slab is deallocated in
          each cache (this algorithm should probably change). The brutal
          reclaim removes all cached objects, even from CPU-bound
          magazines.</para>
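The two-phase reclaim could be sketched like this; the flag and the per-cache counters are assumptions for illustration, not the actual <code>slab_reclaim()</code> implementation.

```c
/* Illustrative sketch of two-phase reclaim: a light pass releases
 * cached slabs from the shared depot, and only if that frees nothing
 * does a brutal pass flush the CPU-bound magazines as well. */
#include <stddef.h>

#define RECLAIM_ALL 0x1              /* brutal reclaim: empty CPU magazines */

typedef struct {
    size_t depot_slabs;              /* slabs cached in the shared depot */
    size_t cpu_cached;               /* objects cached in CPU-bound magazines */
} cache_t;

/* Return the number of freed units from one cache. */
static size_t cache_reclaim(cache_t *c, unsigned flags)
{
    size_t freed = 0;
    if (c->depot_slabs > 0) {        /* light part: release from the depot */
        freed += 1;
        c->depot_slabs -= 1;
    }
    if (flags & RECLAIM_ALL) {       /* brutal part: flush CPU magazines */
        freed += c->cpu_cached;
        c->cpu_cached = 0;
    }
    return freed;
}

size_t slab_reclaim_sketch(cache_t *caches, size_t n)
{
    size_t freed = 0;
    for (size_t i = 0; i < n; i++)   /* light pass first */
        freed += cache_reclaim(&caches[i], 0);
    if (freed == 0)                  /* then the brutal pass */
        for (size_t i = 0; i < n; i++)
            freed += cache_reclaim(&caches[i], RECLAIM_ALL);
    return freed;
}
```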

          <formalpara>
            <title>Allocation</title>

            <para><emphasis>Step 1.</emphasis> When an allocation request
            arrives, the slab allocator first checks the availability of
            memory in the local CPU-bound magazine. If an object is available,
            it simply "pops" the CPU magazine and returns a pointer to the
            object.</para>

            <para><emphasis>Step 2.</emphasis> If the CPU-bound magazine is
            empty, the allocator will attempt to reload the magazine by
            swapping it with the second CPU magazine and returns to the first
            step.</para>

            <para><emphasis>Step 3.</emphasis> Now we are in the situation
            when both CPU-bound magazines are empty, which makes the allocator
            access the shared full-magazines depot to reload the CPU-bound
            magazines. If the reload is successful (meaning there are full
            magazines in the depot), the algorithm continues at Step
            1.</para>

            <para><emphasis>Step 4.</emphasis> In the final step of the
            allocation, the object is allocated from the conventional slab
            layer and a pointer to it is returned.</para>
          </formalpara>

          <formalpara>
            <title>Deallocation</title>

            <para><emphasis>Step 1.</emphasis> During a deallocation
            request, the slab allocator checks whether the local
            CPU-bound magazine is full. If it is not, the pointer is
            simply pushed into the magazine.</para>

            <para><emphasis>Step 2.</emphasis> If the CPU-bound
            magazine is full, the allocator attempts to reload it,
            swapping it with the second CPU magazine, and returns to
            the first step.</para>

            <para><emphasis>Step 3.</emphasis> Now both CPU-bound
            magazines are full, which forces the allocator to access
            the shared depot of full magazines, put one of the
            magazines into the depot and create a new empty magazine.
            The algorithm continues at Step 1.</para>
          </formalpara>
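
          <para>The magazine logic described in the steps above can be
          sketched in plain C. The following is a simplified,
          single-CPU, user-space simulation; the names
          (<emphasis>mag_t</emphasis>, <emphasis>cpu_cache_t</emphasis>,
          <emphasis>magazine_alloc</emphasis>) and the magazine size
          are assumptions made for illustration, not the actual
          HelenOS interface.</para>

          <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAG_SIZE 8

/* Hypothetical magazine: a fixed-size stack of object pointers. */
typedef struct {
    void *objs[MAG_SIZE];
    int nobjs;              /* number of objects currently stacked */
} mag_t;

/* Per-CPU pair of magazines; the depot of full magazines is shared. */
typedef struct {
    mag_t *current;         /* magazine served first */
    mag_t *last;            /* second chance before the depot */
} cpu_cache_t;

/* Depot modelled as a simple array of full magazines. */
#define DEPOT_MAX 16
static mag_t *depot[DEPOT_MAX];
static int depot_cnt;

/* Stand-in for the conventional slab layer (Step 4). */
static void *slab_layer_alloc(void) { return malloc(1); }

/* Allocation: Steps 1-4 from the text. */
void *magazine_alloc(cpu_cache_t *cc)
{
    /* Step 1: pop from the current magazine if it is not empty. */
    if (cc->current->nobjs > 0)
        return cc->current->objs[--cc->current->nobjs];

    /* Step 2: swap with the second magazine and retry. */
    if (cc->last->nobjs > 0) {
        mag_t *tmp = cc->current;
        cc->current = cc->last;
        cc->last = tmp;
        return cc->current->objs[--cc->current->nobjs];
    }

    /* Step 3: both empty - reload a full magazine from the depot. */
    if (depot_cnt > 0) {
        free(cc->last);                 /* drop one empty magazine */
        cc->last = cc->current;
        cc->current = depot[--depot_cnt];
        return cc->current->objs[--cc->current->nobjs];
    }

    /* Step 4: fall through to the conventional slab layer. */
    return slab_layer_alloc();
}

/* Deallocation: Steps 1-3 from the text. */
void magazine_free(cpu_cache_t *cc, void *obj)
{
    /* Step 1: push into the current magazine if it is not full. */
    if (cc->current->nobjs < MAG_SIZE) {
        cc->current->objs[cc->current->nobjs++] = obj;
        return;
    }

    /* Step 2: swap with the second magazine and retry. */
    if (cc->last->nobjs < MAG_SIZE) {
        mag_t *tmp = cc->current;
        cc->current = cc->last;
        cc->last = tmp;
        cc->current->objs[cc->current->nobjs++] = obj;
        return;
    }

    /* Step 3: both full - hand one magazine over to the depot
     * and start filling a fresh empty one. */
    if (depot_cnt < DEPOT_MAX)
        depot[depot_cnt++] = cc->current;
    cc->current = calloc(1, sizeof(mag_t));
    cc->current->objs[cc->current->nobjs++] = obj;
}
```
]]></programlisting>

          <para>The point of the two-magazine arrangement is that the
          common case (Steps 1 and 2) touches only CPU-local state and
          thus needs no locking; the shared depot is reached only when
          both local magazines are exhausted or saturated.</para>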
        </section>
      </section>
    </section>

  <!-- End of Physmem -->
  </section>

  <section>
    <title>Virtual memory management</title>

    <section>
      <title>Introduction</title>

      <para>Virtual memory is a memory management technique used by the
      kernel to achieve several mission critical goals: <itemizedlist>
          <listitem>
            Isolate each task from the other tasks running on the system at the same time.
          </listitem>

          <listitem>
            Allow allocation of more memory than the actual physical memory size of the machine.
          </listitem>

          <listitem>
            Allow, in general, loading and executing two programs that are linked to the same address without complicated relocations.
          </listitem>
        </itemizedlist></para>

      <para><!--

TLB shootdown ASID/ASID:PAGE/ALL.
TLB shootdown requests can come in asynchroniously
so there is a cache of TLB shootdown requests. Upon cache overflow TLB shootdown ALL is executed


<para>
Address spaces. Address space area (B+ tree). Only for uspace. Set of syscalls (shrink/extend etc).
Special address space area type - device - prohibits shrink/extend syscalls to call on it.
Address space has link to mapping tables (hierarchical - per Address space, hash - global tables).
</para>

--></para>
    </section>

    <section>
      <title>Paging</title>

      <para>Virtual memory usually uses the paged memory model, where
      the virtual memory address space is divided into
      <emphasis>pages</emphasis> (usually 4096 bytes in size) and
      physical memory is divided into frames (of the same size as a
      page, of course). Each page may be mapped to some frame, and
      then, upon a memory access to a virtual address, the CPU
      performs <emphasis>address translation</emphasis> during the
      instruction execution. A non-existent mapping generates a page
      fault exception and invokes the kernel exception handler, thus
      allowing the kernel to manipulate the rules of memory access.
      Information about page mappings is stored by the kernel in the
      <link linkend="page_tables">page tables</link>.</para>

      <para>The majority of architectures use multi-level page tables,
      which means that physical memory needs to be accessed several
      times before the physical address is obtained. This fact would
      cause serious performance overhead in virtual memory management.
      To avoid it, the <link linkend="tlb">Translation Lookaside
      Buffer (TLB)</link> is used.</para>
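
      <para>To illustrate why a multi-level lookup touches memory
      several times, here is a minimal user-space simulation of a
      two-level table walk. The names and the 10/10/12 bit split of
      the virtual address are illustrative assumptions, not the
      HelenOS interface.</para>

      <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BITS   12          /* 4096-byte pages */
#define TABLE_BITS  10          /* 1024 entries per table level */
#define ENTRIES     (1u << TABLE_BITS)

/* One page-table entry: present bit plus frame base address. */
typedef struct {
    int present;
    uintptr_t frame;            /* physical base of the mapped frame */
} pte_t;

/* Two-level table: a directory of pointers to second-level tables. */
typedef struct {
    pte_t *dir[ENTRIES];
} ptl0_t;

/* Walk the tables; returns 0 on a missing mapping (the case that
 * would raise a page fault).  Note the two separate memory
 * accesses: the directory first, then the leaf entry. */
int translate(ptl0_t *as, uintptr_t virt, uintptr_t *phys)
{
    size_t i0 = (virt >> (PAGE_BITS + TABLE_BITS)) & (ENTRIES - 1);
    size_t i1 = (virt >> PAGE_BITS) & (ENTRIES - 1);

    pte_t *table = as->dir[i0];              /* first memory access */
    if (table == NULL || !table[i1].present) /* second memory access */
        return 0;

    *phys = table[i1].frame | (virt & ((1u << PAGE_BITS) - 1));
    return 1;
}

/* Install a mapping, allocating the second-level table on demand. */
void map_page(ptl0_t *as, uintptr_t virt, uintptr_t frame)
{
    size_t i0 = (virt >> (PAGE_BITS + TABLE_BITS)) & (ENTRIES - 1);
    size_t i1 = (virt >> PAGE_BITS) & (ENTRIES - 1);

    if (as->dir[i0] == NULL)
        as->dir[i0] = calloc(ENTRIES, sizeof(pte_t));
    as->dir[i0][i1].present = 1;
    as->dir[i0][i1].frame = frame;
}
```
]]></programlisting>

      <para>Every extra level adds one more dependent memory access to
      each translation, which is exactly the overhead the TLB is meant
      to hide.</para>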
    </section>

    <section>
      <title>Address spaces</title>

      <section>
        <title>Address space areas</title>

        <para>Each address space consists of mutually disjoint
        continuous address space areas. An address space area is
        precisely defined by its base address and the number of
        frames/pages it contains.</para>

        <para>An address space area carries flags that define the
        behaviour of and the permissions on the particular area.
        <itemizedlist>
            <listitem>
              <emphasis>AS_AREA_READ</emphasis>
              flag indicates reading permission.
            </listitem>

            <listitem>
              <emphasis>AS_AREA_WRITE</emphasis>
              flag indicates writing permission.
            </listitem>

            <listitem>
              <emphasis>AS_AREA_EXEC</emphasis>
              flag indicates code execution permission. Some architectures do not support execution permission restriction; in that case this flag has no effect.
            </listitem>

            <listitem>
              <emphasis>AS_AREA_DEVICE</emphasis>
              marks the area as mapped to device memory.
            </listitem>
          </itemizedlist></para>

        <para>The kernel gives tasks the possibility to create,
        expand, shrink and share their address space areas via a set
        of syscalls.</para>
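
        <para>Permission checks over such flags typically reduce to
        bitmask tests. The following sketch borrows only the flag
        names from the text; the concrete bit values and the helper
        function are invented here for illustration.</para>

        <programlisting><![CDATA[
```c
#include <assert.h>

/* Bit values are illustrative; only the names come from the text. */
#define AS_AREA_READ    (1u << 0)
#define AS_AREA_WRITE   (1u << 1)
#define AS_AREA_EXEC    (1u << 2)
#define AS_AREA_DEVICE  (1u << 3)

/* Decide whether an access of the requested kind is permitted
 * by the flags of an address space area: every requested
 * permission bit must be present in the area's flags. */
int as_area_access_ok(unsigned area_flags, unsigned requested)
{
    return (area_flags & requested) == requested;
}
```
]]></programlisting>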
      </section>

      <section>
        <title>Address Space ID (ASID)</title>

        <para>When switching to a different task, the kernel also
        needs to switch to a different address space mapping. If the
        TLB cannot distinguish between address space mappings, all
        mapping information in the TLB belonging to the old address
        space must be flushed, which can create unnecessary overhead
        during task switching. To avoid this, some architectures can
        segregate different address spaces at the hardware level by
        introducing an address space identifier as a part of the TLB
        record, telling the virtual address translation unit to which
        address space the record is applicable.</para>

        <para>The HelenOS kernel can take advantage of this
        hardware-supported identifier by having an ASID abstraction
        which is related to the corresponding architecture-specific
        identifier. E.g. on ia64 the kernel ASID is derived from the
        RID (region identifier) and on mips32 the kernel ASID is
        actually the hardware identifier. As expected, this ASID
        information record is part of the <emphasis>as_t</emphasis>
        structure.</para>

        <para>Due to hardware limitations, the hardware ASID has a
        limited length, from 8 bits on mips32 to 24 bits on ia64,
        which makes it impossible to use it as a unique address space
        identifier for all tasks running in the system. In such
        situations, a special ASID stealing algorithm is used, which
        takes an ASID from an inactive task and assigns it to the
        active task.</para>

        <para><classname>ASID stealing algorithm here.</classname></para>
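
        <para>Until the placeholder above is filled in, the idea can
        be illustrated by a generic stealing sketch: when all ASIDs
        are taken, the allocator picks a victim that is not currently
        active, revokes its ASID (which implies invalidating the
        victim's TLB entries) and hands the ASID to the requester. All
        names and the tiny ASID range below are assumptions for the
        simulation, not the actual HelenOS algorithm.</para>

        <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define ASIDS_MAX 8       /* tiny for illustration; real HW has 2^8..2^24 */
#define ASID_INVALID 0

/* Minimal stand-in for the as_t structure from the text. */
typedef struct as {
    unsigned asid;        /* ASID_INVALID means "no ASID assigned" */
    int active;           /* running on some CPU right now? */
} as_t;

/* Owner table: which address space currently holds each ASID. */
static as_t *owner[ASIDS_MAX];

/* Assign an ASID to `as`, stealing one from an inactive address
 * space if all ASIDs are taken.  Returns the ASID, or ASID_INVALID
 * if every ASID belongs to an active address space. */
unsigned asid_get(as_t *as)
{
    /* First try to find a free ASID (1 .. ASIDS_MAX-1). */
    for (unsigned a = 1; a < ASIDS_MAX; a++) {
        if (owner[a] == NULL) {
            owner[a] = as;
            as->asid = a;
            return a;
        }
    }

    /* All taken: steal from the first inactive owner.  The victim
     * loses its ASID; its TLB entries must then be invalidated. */
    for (unsigned a = 1; a < ASIDS_MAX; a++) {
        if (!owner[a]->active) {
            owner[a]->asid = ASID_INVALID;
            owner[a] = as;
            as->asid = a;
            return a;
        }
    }
    return ASID_INVALID;
}
```
]]></programlisting>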
      </section>
    </section>

    <section>
      <title>Virtual address translation</title>

      <section id="page_tables">
        <title>Page tables</title>

        <para>The HelenOS kernel has two different approaches to the
        paging implementation: <emphasis>4-level page
        tables</emphasis> and <emphasis>global hash tables</emphasis>,
        both accessible via a generic paging abstraction layer. This
        duality is caused by the major architectural differences
        between the supported platforms. The abstraction is
        implemented with the help of a global structure of pointers to
        the basic mapping functions,
        <emphasis>page_mapping_operations</emphasis>. To achieve the
        desired page table functionality, the corresponding layer must
        implement the functions declared in
        <emphasis>page_mapping_operations</emphasis>.</para>
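
        <para>Such an abstraction layer is commonly a structure of
        function pointers, selected once at boot. The sketch below
        shows the idea with a trivial one-slot backend; the field
        names and signatures are simplified guesses, not the exact
        HelenOS declarations.</para>

        <programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's address space type. */
typedef struct as as_t;

/* A table of mapping primitives; each paging implementation
 * (4-level tables, global hash table, ...) fills in its own. */
typedef struct {
    void (*mapping_insert)(as_t *as, uintptr_t page, uintptr_t frame);
    void (*mapping_remove)(as_t *as, uintptr_t page);
    int  (*mapping_find)(as_t *as, uintptr_t page, uintptr_t *frame);
} page_mapping_operations_t;

/* The active implementation, selected once at boot. */
static const page_mapping_operations_t *page_mapping_operations;

/* Generic layer: delegates to whichever implementation is installed. */
void page_mapping_insert(as_t *as, uintptr_t page, uintptr_t frame)
{
    page_mapping_operations->mapping_insert(as, page, frame);
}

/* --- A trivial demo backend: a one-slot "table" --- */
static uintptr_t demo_page, demo_frame;
static int demo_valid;

static void demo_insert(as_t *as, uintptr_t page, uintptr_t frame)
{
    (void) as;
    demo_page = page;
    demo_frame = frame;
    demo_valid = 1;
}

static void demo_remove(as_t *as, uintptr_t page)
{
    (void) as;
    if (demo_valid && demo_page == page)
        demo_valid = 0;
}

static int demo_find(as_t *as, uintptr_t page, uintptr_t *frame)
{
    (void) as;
    if (!demo_valid || demo_page != page)
        return 0;
    *frame = demo_frame;
    return 1;
}

static const page_mapping_operations_t demo_ops = {
    .mapping_insert = demo_insert,
    .mapping_remove = demo_remove,
    .mapping_find = demo_find,
};

void paging_init(void) { page_mapping_operations = &demo_ops; }
```
]]></programlisting>

        <para>The rest of the kernel calls only the generic wrappers,
        so switching between paging implementations is a matter of
        installing a different operations structure.</para>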

        <formalpara>
          <title>4-level page tables</title>

          <para>4-level page tables are a generalization of the
          hardware capabilities of several architectures:<itemizedlist>
              <listitem>
                ia32 uses 2-level page tables, with full hardware support.
              </listitem>

              <listitem>
                amd64 uses 4-level page tables, also coming with full hardware support.
              </listitem>

              <listitem>
                mips and ppc32 have 2-level tables, with software-simulated support.
              </listitem>
            </itemizedlist></para>
        </formalpara>

        <formalpara>
          <title>Global hash tables</title>

          <para>There is only one global page hash table in the whole
          system (it is used by ia64; note that ia64 currently has the
          VHPT turned off). A generic hash table with separate
          collision chains is used. ASID support is required to use
          global hash tables.</para>
        </formalpara>

        <para>Thanks to the abstract paging interface, there is a
        possibility to have more paging implementations, for example
        B-Tree page tables.</para>
      </section>

      <section id="tlb">
        <title>Translation Lookaside buffer</title>

        <para>Due to the extensive overhead of page mapping lookups in
        the page tables, all architectures have a fast associative
        cache memory built into the CPU. This memory, called the TLB,
        stores recently used page table entries.</para>

        <section id="tlb_shootdown">
          <title>TLB consistency. TLB shootdown algorithm.</title>

          <para>The operating system is responsible for keeping the
          TLB consistent by invalidating its contents whenever there
          is some change in the page tables. Such changes occur when a
          page or a group of pages is unmapped, when a mapping is
          changed, or when the system switches the active address
          space in order to schedule a new task (which amounts to a
          batch unmap of all the address space mappings). Moreover,
          this invalidation operation must be done on all system CPUs,
          because each CPU has its own independent TLB cache. Thus,
          maintaining TLB consistency in an SMP configuration is not
          as trivial a task as it might look at first glance. A naive
          solution would assume remote TLB invalidation, which is not
          possible on most architectures for a simple reason: flushing
          a TLB is allowed only on the local CPU and there is no way
          to access other CPUs' TLB caches.</para>

          <para>The technique of remote invalidation of TLB entries is
          called "TLB shootdown". HelenOS uses a variation of the
          algorithm described by D. Black et al., "Translation
          Lookaside Buffer Consistency: A Software Approach," Proc.
          Third Int'l Conf. Architectural Support for Programming
          Languages and Operating Systems, 1989, pp. 113-122.</para>

          <para>As the situation demands, partial invalidation of the
          TLB caches may be desirable. In the case of a simple memory
          mapping change, it is necessary to invalidate only one or a
          few adjacent pages. If the architecture is aware of ASIDs,
          the kernel invalidates, during an address space switch, only
          the entries belonging to that particular address space. The
          final option is the complete TLB cache invalidation, which
          is the operation that flushes all entries in the
          TLB.</para>

          <para>TLB shootdown is performed in two phases. First, the
          initiator processor sends an IPI message indicating the TLB
          shootdown request to the rest of the CPUs. Then it waits
          until all CPUs confirm that they have executed the TLB
          invalidating action.</para>
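
          <para>The two-phase handshake can be mimicked in user space
          with threads standing in for CPUs and an atomic counter
          standing in for the IPI acknowledgements. Everything here
          (names, the spin-wait, the flag array) is an illustrative
          assumption; HelenOS uses real inter-processor
          interrupts.</para>

          <programlisting><![CDATA[
```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPU 4

/* Shared shootdown request state. */
static atomic_int request_pending;   /* set by the initiator ("IPI") */
static atomic_int acks;              /* CPUs that flushed their TLB */
static atomic_int flushed[NCPU];     /* per-"CPU": local TLB flushed? */

/* Each non-initiator CPU: wait for the "IPI", flush, acknowledge. */
static void *cpu_loop(void *arg)
{
    int id = *(int *) arg;
    while (!atomic_load(&request_pending))
        ;                                /* spin: waiting for the IPI */
    atomic_store(&flushed[id], 1);       /* invalidate local TLB */
    atomic_fetch_add(&acks, 1);          /* confirm to the initiator */
    return NULL;
}

/* Initiator: phase 1 broadcasts the request, phase 2 waits for acks. */
void tlb_shootdown(void)
{
    atomic_store(&flushed[0], 1);        /* flush own TLB first */
    atomic_store(&request_pending, 1);   /* phase 1: "send IPIs" */
    while (atomic_load(&acks) < NCPU - 1)
        ;                                /* phase 2: wait for all */
}

/* Drive the simulation; returns 1 iff every "TLB" was flushed. */
int run_shootdown_demo(void)
{
    pthread_t t[NCPU - 1];
    int ids[NCPU - 1];

    for (int i = 0; i < NCPU - 1; i++) {
        ids[i] = i + 1;
        pthread_create(&t[i], NULL, cpu_loop, &ids[i]);
    }
    tlb_shootdown();
    for (int i = 0; i < NCPU - 1; i++)
        pthread_join(t[i], NULL);

    for (int i = 0; i < NCPU; i++)
        if (!atomic_load(&flushed[i]))
            return 0;
    return 1;
}
```
]]></programlisting>

          <para>The initiator cannot proceed until the acknowledgement
          counter reaches the number of remote CPUs, which is what
          guarantees that no stale translation survives the
          operation.</para>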
        </section>
      </section>
    </section>

    <section>
      <title>---</title>

      <para>At the moment HelenOS does not support swapping.</para>

      <para>- we use page faults to allocate frames on demand within
      an as_area - on architectures that support it, we support
      non-executable pages</para>
    </section>
  </section>
</chapter>