Rev 67 | Rev 69 | Go to most recent revision | Show entire file | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed
| Rev 67 | Rev 68 | ||
|---|---|---|---|
| Line 220... | Line 220... | ||
| 220 | 220 | ||
| 221 | <para>The slab allocator is closely modelled after OpenSolaris slab |
221 | <para>The slab allocator is closely modelled after OpenSolaris slab |
| 222 | allocator by Jeff Bonwick and Jonathan Adams with the following |
222 | allocator by Jeff Bonwick and Jonathan Adams with the following |
| 223 | exceptions:<itemizedlist> |
223 | exceptions:<itemizedlist> |
| 224 | <listitem> |
224 | <listitem> |
| 225 | empty slabs are immediately deallocated |
225 | empty slabs are immediately deallocated |
| 226 | </listitem> |
226 | </listitem> |
| 227 | 227 | ||
| 228 | <listitem> |
228 | <listitem> |
| 229 | <para>empty magazines are deallocated when not needed</para> |
229 | <para>empty magazines are deallocated when not needed</para> |
| 230 | </listitem> |
230 | </listitem> |
| Line 363... | Line 363... | ||
| 363 | </listitem> |
363 | </listitem> |
| 364 | </itemizedlist></para> |
364 | </itemizedlist></para> |
| 365 | 365 | ||
| 366 | <para><!-- |
366 | <para><!-- |
| 367 | 367 | ||
| 368 | TLB shootdown ASID/ASID:PAGE/ALL. |
- | |
| 369 | TLB shootdown requests can come in asynchroniously |
- | |
| 370 | so there is a cache of TLB shootdown requests. Upon cache overflow TLB shootdown ALL is executed |
- | |
| 371 | - | ||
| 372 | - | ||
| 373 | <para> |
368 | <para> |
| 374 | Address spaces. Address space area (B+ tree). Only for uspace. Set of syscalls (shrink/extend etc). |
369 | Address spaces. Address space area (B+ tree). Only for uspace. Set of syscalls (shrink/extend etc). |
| 375 | Special address space area type - device - prohibits shrink/extend syscalls to call on it. |
370 | Special address space area type - device - prohibits shrink/extend syscalls to call on it. |
| 376 | Address space has link to mapping tables (hierarchical - per Address space, hash - global tables). |
371 | Address space has link to mapping tables (hierarchical - per Address space, hash - global tables). |
| 377 | </para> |
372 | </para> |
| 378 | 373 | ||
| 379 | --></para> |
374 | --></para> |
| 380 | </section> |
375 | </section> |
| 381 | 376 | ||
| 382 | <section> |
377 | <section> |
| 383 | <title>Paging</title> |
- | |
| 384 | - | ||
| 385 | <para>Virtual memory is usually using paged memory model, where virtual |
- | |
| 386 | memory address space is divided into the <emphasis>pages</emphasis> |
- | |
| 387 | (usually having size 4096 bytes) and physical memory is divided into the |
- | |
| 388 | frames (same sized as a page, of course). Each page may be mapped to |
- | |
| 389 | some frame and then, upon memory access to the virtual address, CPU |
- | |
| 390 | performs <emphasis>address translation</emphasis> during the instruction |
- | |
| 391 | execution. Non-existing mapping generates page fault exception, calling |
- | |
| 392 | kernel exception handler, thus allowing kernel to manipulate rules of |
- | |
| 393 | memory access. Information for pages mapping is stored by kernel in the |
- | |
| 394 | <link linkend="page_tables">page tables</link></para> |
- | |
| 395 | - | ||
| 396 | <para>The majority of the architectures use multi-level page tables, |
- | |
| 397 | which means need to access physical memory several times before getting |
- | |
| 398 | physical address. This fact would make serios performance overhead in |
- | |
| 399 | virtual memory management. To avoid this <link linkend="tlb">Traslation |
- | |
| 400 | Lookaside Buffer (TLB)</link> is used.</para> |
- | |
| 401 | </section> |
- | |
| 402 | - | ||
| 403 | <section> |
- | |
| 404 | <title>Address spaces</title> |
378 | <title>Address spaces</title> |
| 405 | 379 | ||
| 406 | <section> |
380 | <section> |
| 407 | <title>Address space areas</title> |
381 | <title>Address space areas</title> |
| 408 | 382 | ||
| Line 481... | Line 455... | ||
| 481 | </section> |
455 | </section> |
| 482 | 456 | ||
| 483 | <section> |
457 | <section> |
| 484 | <title>Virtual address translation</title> |
458 | <title>Virtual address translation</title> |
| 485 | 459 | ||
| 486 | <section id="page_tables"> |
460 | <section id="paging"> |
| 487 | <title>Page tables</title> |
461 | <title>Paging</title> |
| 488 | 462 | ||
| 489 | <para>HelenOS kernel has two different approaches to the paging |
- | |
| 490 | implementation: <emphasis>4 level page tables</emphasis> and |
- | |
| 491 | <emphasis>global hash tables</emphasis>, which are accessible via |
- | |
| 492 | generic paging abstraction layer. Such different functionality was |
- | |
| 493 | caused by the major architectural differences between supported |
- | |
| 494 | platforms. This abstraction is implemented with help of the global |
- | |
| 495 | structure of pointers to basic mapping functions |
- | |
| 496 | <emphasis>page_mapping_operations</emphasis>. To achieve different |
- | |
| 497 | functionality of page tables, corresponding layer must implement |
- | |
| 498 | functions, declared in |
463 | <section> |
| 499 | <emphasis>page_mapping_operations</emphasis></para> |
464 | <title>Introduction</title> |
| 500 | 465 | ||
| - | 466 | <para>Virtual memory is usually using paged memory model, where |
|
| - | 467 | virtual memory address space is divided into the |
|
| - | 468 | <emphasis>pages</emphasis> (usually having size 4096 bytes) and |
|
| - | 469 | physical memory is divided into the frames (same sized as a page, of |
|
| - | 470 | course). Each page may be mapped to some frame and then, upon memory |
|
| - | 471 | access to the virtual address, CPU performs <emphasis>address |
|
| - | 472 | translation</emphasis> during the instruction execution. |
|
| - | 473 | Non-existing mapping generates page fault exception, calling kernel |
|
| - | 474 | exception handler, thus allowing kernel to manipulate rules of |
|
| - | 475 | memory access. Information for pages mapping is stored by kernel in |
|
| - | 476 | the <link linkend="page_tables">page tables</link></para> |
|
| - | 477 | ||
| - | 478 | <para>The majority of the architectures use multi-level page tables, |
|
| - | 479 | which means need to access physical memory several times before |
|
| - | 480 | getting physical address. This fact would make serios performance |
|
| - | 481 | overhead in virtual memory management. To avoid this <link |
|
| - | 482 | linkend="tlb">Traslation Lookaside Buffer (TLB)</link> is |
|
| 501 | <formalpara> |
483 | used.</para> |
| - | 484 | ||
| - | 485 | <para>HelenOS kernel has two different approaches to the paging |
|
| - | 486 | implementation: <emphasis>4 level page tables</emphasis> and |
|
| - | 487 | <emphasis>global hash table</emphasis>, which are accessible via |
|
| - | 488 | generic paging abstraction layer. Such different functionality was |
|
| - | 489 | caused by the major architectural differences between supported |
|
| - | 490 | platforms. This abstraction is implemented with help of the global |
|
| - | 491 | structure of pointers to basic mapping functions |
|
| - | 492 | <emphasis>page_mapping_operations</emphasis>. To achieve different |
|
| - | 493 | functionality of page tables, corresponding layer must implement |
|
| - | 494 | functions, declared in |
|
| - | 495 | <emphasis>page_mapping_operations</emphasis></para> |
|
| - | 496 | ||
| - | 497 | <para>Thanks to the abstract paging interface, there was a place |
|
| - | 498 | left for more paging implementations (besides already implemented |
|
| - | 499 | hieararchical page tables and hash table), for example B-Tree based |
|
| - | 500 | page tables.</para> |
|
| - | 501 | </section> |
|
| - | 502 | ||
| - | 503 | <section> |
|
| 502 | <title>4-level page tables</title> |
504 | <title>Hierarchical 4-level page tables</title> |
| 503 | 505 | ||
| 504 | <para>4-level page tables are the generalization of the hardware |
506 | <para>Hierarchical 4-level page tables are the generalization of the |
| - | 507 | hardware capabilities of most architectures. Each address space has |
|
| 505 | capabilities of several architectures.<itemizedlist> |
508 | its own page tables.<itemizedlist> |
| 506 | <listitem> |
509 | <listitem> |
| 507 | ia32 uses 2-level page tables, with full hardware support. |
510 | ia32 uses 2-level page tables, with full hardware support. |
| 508 | </listitem> |
511 | </listitem> |
| 509 | 512 | ||
| 510 | <listitem> |
513 | <listitem> |
| Line 513... | Line 516... | ||
| 513 | 516 | ||
| 514 | <listitem> |
517 | <listitem> |
| 515 | mips and ppc32 have 2-level tables, software simulated support. |
518 | mips and ppc32 have 2-level tables, software simulated support. |
| 516 | </listitem> |
519 | </listitem> |
| 517 | </itemizedlist></para> |
520 | </itemizedlist></para> |
| 518 | </formalpara> |
521 | </section> |
| - | 522 | ||
| - | 523 | <section> |
|
| - | 524 | <title>Global hash table</title> |
|
| 519 | 525 | ||
| - | 526 | <para>Implementation of the global hash table was encouraged by the |
|
| - | 527 | ia64 architecture support. One of the major differences between |
|
| - | 528 | global hash table and hierarchical tables is that global hash table |
|
| - | 529 | exists only once in the system and the hierarchical tables are |
|
| 520 | <formalpara> |
530 | maintained per address space.</para> |
| - | 531 | ||
| - | 532 | <para>Thus, hash table contains information about all address spaces |
|
| - | 533 | mappings in the system, so, the hash of an entry must contain |
|
| - | 534 | information of both address space pointer or id and the virtual |
|
| - | 535 | address of the page. Generic hash table implementation assumes that |
|
| - | 536 | the addresses of the pointers to the address spaces are likely to be |
|
| - | 537 | on the close addresses, so it uses least significant bits for hash; |
|
| - | 538 | also it assumes that the virtual page addresses have roughly the |
|
| - | 539 | same probability of occurring, so the least significant bits of VPN |
|
| 521 | <title>Global hash tables</title> |
540 | compose the hash index.</para> |
| 522 | 541 | ||
| 523 | <para>- global page hash table: existuje jen jedna v celem systemu |
- | |
| 524 | (vyuziva ji ia64), pozn. ia64 ma zatim vypnuty VHPT. Pouziva se |
- | |
| 525 | genericke hash table s oddelenymi collision chains. ASID support is |
- | |
| 526 | required to use global hash tables.</para> |
542 | <para>Collision chains ...</para> |
| 527 | </formalpara> |
543 | </section> |
| 528 | - | ||
| 529 | <para>Thanks to the abstract paging interface, there is possibility |
- | |
| 530 | left have more paging implementations, for example B-Tree page |
- | |
| 531 | tables.</para> |
- | |
| 532 | </section> |
544 | </section> |
| 533 | 545 | ||
| 534 | <section id="tlb"> |
546 | <section id="tlb"> |
| 535 | <title>Translation Lookaside buffer</title> |
547 | <title>Translation Lookaside buffer</title> |
| 536 | 548 | ||
| Line 544... | Line 556... | ||
| 544 | 556 | ||
| 545 | <para>Operating system is responsible for keeping TLB consistent by |
557 | <para>Operating system is responsible for keeping TLB consistent by |
| 546 | invalidating the contents of TLB, whenever there is some change in |
558 | invalidating the contents of TLB, whenever there is some change in |
| 547 | page tables. Those changes may occur when page or group of pages |
559 | page tables. Those changes may occur when page or group of pages |
| 548 | were unmapped, mapping is changed or system switching active address |
560 | were unmapped, mapping is changed or system switching active address |
| 549 | space to schedule a new system task (which is a batch unmap of all |
561 | space to schedule a new system task. Moreover, this invalidation |
| 550 | address space mappings). Moreover, this invalidation operation must |
562 | operation must be done an all system CPUs because each CPU has its |
| 551 | be done an all system CPUs because each CPU has its own independent |
563 | own independent TLB cache. Thus maintaining TLB consistency on SMP |
| 552 | TLB cache. Thus maintaining TLB consistency on SMP configuration as |
564 | configuration as not as trivial task as it looks on the first |
| 553 | not as trivial task as it looks at the first glance. Naive solution |
565 | glance. Naive solution would assume that is the CPU which wants to |
| 554 | would assume remote TLB invalidatation, which is not possible on the |
566 | invalidate TLB will invalidate TLB caches on other CPUs. It is not |
| 555 | most of the architectures, because of the simple fact - flushing TLB |
567 | possible on the most of the architectures, because of the simple |
| 556 | is allowed only on the local CPU and there is no possibility to |
568 | fact - flushing TLB is allowed only on the local CPU and there is no |
| 557 | access other CPUs' TLB caches.</para> |
569 | possibility to access other CPUs' TLB caches, thus invalidate TLB |
| - | 570 | remotely.</para> |
|
| 558 | 571 | ||
| 559 | <para>Technique of remote invalidation of TLB entries is called "TLB |
572 | <para>Technique of remote invalidation of TLB entries is called "TLB |
| 560 | shootdown". HelenOS uses a variation of the algorithm described by |
573 | shootdown". HelenOS uses a variation of the algorithm described by |
| 561 | D. Black et al., "Translation Lookaside Buffer Consistency: A |
574 | D. Black et al., "Translation Lookaside Buffer Consistency: A |
| 562 | Software Approach," Proc. Third Int'l Conf. Architectural Support |
575 | Software Approach," Proc. Third Int'l Conf. Architectural Support |
| Line 564... | Line 577... | ||
| 564 | 113-122.</para> |
577 | 113-122.</para> |
| 565 | 578 | ||
| 566 | <para>As the situation demands, you will want partitial invalidation |
579 | <para>As the situation demands, you will want partitial invalidation |
| 567 | of TLB caches. In case of simple memory mapping change it is |
580 | of TLB caches. In case of simple memory mapping change it is |
| 568 | necessary to invalidate only one or more adjacent pages. In case if |
581 | necessary to invalidate only one or more adjacent pages. In case if |
| 569 | the architecture is aware of ASIDs, during the address space |
582 | the architecture is aware of ASIDs, when kernel needs to dump some |
| 570 | switching, kernel invalidates only entries from this particular |
583 | ASID to use by another task, it invalidates only entries from this |
| 571 | address space. Final option of the TLB invalidation is the complete |
584 | particular address space. Final option of the TLB invalidation is |
| 572 | TLB cache invalidation, which is the operation that flushes all |
585 | the complete TLB cache invalidation, which is the operation that |
| 573 | entries in TLB.</para> |
586 | flushes all entries in TLB.</para> |
| 574 | - | ||
| 575 | <para>TLB shootdown is performed in two phases. First, the initiator |
- | |
| 576 | process sends an IPI message indicating the TLB shootdown request to |
- | |
| 577 | the rest of the CPUs. Then, it waits until all CPUs confirm TLB |
- | |
| 578 | invalidating action execution.</para> |
- | |
| 579 | </section> |
- | |
| 580 | </section> |
- | |
| 581 | </section> |
- | |
| 582 | 587 | ||
| 583 | <section> |
- | |
| 584 | <title>---</title> |
588 | <para>TLB shootdown is performed in two phases.</para> |
| 585 | 589 | ||
| - | 590 | <formalpara> |
|
| 586 | <para>At the moment HelenOS does not support swapping.</para> |
591 | <title>Phase 1.</title> |
| 587 | 592 | ||
| 588 | <para>- pouzivame vypadky stranky k alokaci ramcu on-demand v ramci |
593 | <para>First, initiator locks a global TLB spinlock, then request |
| - | 594 | is being put to the local request cache of every other CPU in the |
|
| - | 595 | system protected by its spinlock. In case the cache is full, all |
|
| 589 | as_area - na architekturach, ktere to podporuji, podporujeme non-exec |
596 | requests in the cache are replaced by one request, indicating |
| - | 597 | global TLB flush. Then the initiator thread sends an IPI message |
|
| - | 598 | indicating the TLB shootdown request to the rest of the CPUs and |
|
| - | 599 | waits actively until all CPUs confirm TLB invalidating action |
|
| - | 600 | execution by setting up a special flag. After setting this flag |
|
| - | 601 | this thread is blocked on the TLB spinlock, held by the |
|
| - | 602 | initiator.</para> |
|
| - | 603 | </formalpara> |
|
| - | 604 | ||
| - | 605 | <formalpara> |
|
| - | 606 | <title>Phase 2.</title> |
|
| - | 607 | ||
| - | 608 | <para>All CPUs are waiting on the TLB spinlock to execute TLB |
|
| - | 609 | invalidation action and have indicated their intention to the |
|
| - | 610 | initiator. Initiator continues, cleaning up its TLB and releasing |
|
| - | 611 | the global TLB spinlock. After this all other CPUs gain and |
|
| - | 612 | immidiately release TLB spinlock and perform TLB invalidation |
|
| 590 | stranky</para> |
613 | actions.</para> |
| - | 614 | </formalpara> |
|
| - | 615 | </section> |
|
| - | 616 | </section> |
|
| 591 | </section> |
617 | </section> |
| 592 | </section> |
618 | </section> |
| 593 | </chapter> |
619 | </chapter> |
| 594 | 620 | ||