<section>
  <title>Implementation</title>

  <para>The slab allocator is closely modelled after <xref
  linkend="Bonwick01" /> with the following exceptions:<itemizedlist>
      <listitem>
        <para>empty slabs are immediately deallocated and</para>
there is only a single global page hash table in the system while
hierarchical page tables exist per address space. Thus, the global
page hash table contains information about mappings of all address
spaces in the system.</para>

<para>The global page hash table mechanism uses the generic hash table
type as described in the chapter dedicated to <link
linkend="hashtables">data structures</link> earlier in this
book.</para>
</section>
</section>
</section>

<title>Translation Lookaside Buffer</title>

<para>Due to the extensive overhead of the several extra memory accesses
needed to look up a mapping in the page tables on every instruction,
modern architectures deploy a fast associative cache of recently used
page mappings. This cache, called the Translation Lookaside Buffer
(TLB), is present on every processor in the system. As has already been
pointed out, the TLB is the only page translation mechanism on some
architectures.</para>

<section id="tlb_shootdown">
  <indexterm>
    <secondary>- TLB shootdown</secondary>
  </indexterm>

<title>TLB consistency</title>

<para>The operating system is responsible for keeping the TLB consistent
with the page tables. Whenever mappings are modified or purged from the
page tables, or when an address space identifier is reused, the kernel
needs to invalidate the respective contents of the TLB. Some TLB types
support partial invalidation of their contents (e.g. ranges of pages or
address spaces) while other types can be invalidated only entirely. The
invalidation must be done on all processors because there is one TLB per
processor. Maintaining TLB consistency on multiprocessor configurations
is not as trivial as it might seem at first glance.</para>

<para>The technique of remote invalidation of TLB entries is called TLB
shootdown. HelenOS uses a simplified variant of the algorithm described
in <xref linkend="Black89" />.</para>

<para>TLB shootdown is performed in three phases.</para>

<formalpara>
  <title>Phase 1.</title>

<para>The initiator clears its TLB flag and locks the global TLB
spinlock. The request is then enqueued into all other processors' TLB
shootdown message queues. When the TLB shootdown message queue is full
on any processor, the queue is purged and a single request to
invalidate the entire TLB is stored there. Once all the TLB shootdown
messages have been dispatched, the initiator sends all other processors
an interrupt to notify them of the incoming TLB shootdown message. It
then spins until all processors accept the interrupt and clear their
TLB flags.</para>
</formalpara>

<formalpara>
  <title>Phase 2.</title>

<para>Except for the initiator, all other processors are spinning on
the TLB spinlock. The initiator is now free to modify the page tables
and purge its own TLB. The initiator then unlocks the global TLB
spinlock and sets its TLB flag.</para>
</formalpara>

<formalpara>
  <title>Phase 3.</title>

  <para>When the spinlock is unlocked by the initiator, the other
  processors are sequentially granted the spinlock. However, once they
  manage to lock it, they immediately release it. Each processor then
  invalidates its TLB according to the messages found in its TLB
  shootdown message queue. In the end, each processor sets its TLB flag
  and resumes its previous operation.</para>
</formalpara>
</section>

<section>