<?xml version="1.0" encoding="UTF-8"?>
<appendix id="archspecs">
  <?dbhtml filename="arch.html"?>

  <title>Architecture Specific Notes</title>

  <section>
    <title>AMD64/Intel EM64T</title>

    <para>The amd64 architecture is a 64-bit extension of the older ia32
    architecture. Only 64-bit applications are supported. Creating this port
    was relatively easy because it shares a lot of code with the ia32
    platform. However, the 64-bit extension has some specifics that made the
    port interesting.</para>

    <section>
      <title>Virtual Memory</title>

      <para>The amd64 architecture uses the standard processor-defined 4-level
      page mapping of 4KB pages. The NX (no-execute) flag on individual pages
      is fully supported.</para>
    </section>

    <section>
      <title>TLB-only Paging</title>

      <para>All memory on the amd64 architecture is memory mapped; if the
      kernel needs to access physical memory, a mapping must be created.
      During the boot process, the boot loader creates a mapping for the first
      20MB of physical memory. To correctly initialize the page mapping
      system, an identity mapping of the whole physical memory must be
      created. However, creating that mapping unavoidably requires allocating
      new, possibly unmapped, frames from the frame allocator. The ia32 port
      solves this by mapping the first 2GB of memory during the boot process.
      The same solution is infeasible on a 64-bit platform because of the size
      of the possible address space.</para>

      <para>As soon as the exception routines are initialized, a special page
      fault exception handler is installed. It provides a complete view of
      physical memory until the real page mapping system is initialized. The
      handler dynamically changes the page tables so that they always map
      exactly the faulting address. The page then becomes cached in the TLB,
      and on the next page fault the same tables can be reused to handle
      another mapping.</para>
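
      <para>The behaviour of the early page fault handler can be illustrated
      with a small user-space model. This is only a sketch under stated
      assumptions, not the HelenOS implementation: the type
      <function>early_mmu_t</function>, the functions
      <function>early_page_fault</function> and
      <function>early_access</function>, and the four-entry round-robin TLB
      are all hypothetical. A single reusable page table slot is repointed at
      whichever page faulted, and the TLB keeps earlier translations
      alive.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12   /* 4KB pages */
#define TLB_ENTRIES  4    /* hypothetical size, kept tiny for the model */

typedef struct {
    bool slot_valid;                 /* the one page table slot that is reused */
    uint64_t slot_page;              /* page currently mapped by that slot */
    bool tlb_valid[TLB_ENTRIES];
    uint64_t tlb_page[TLB_ENTRIES];  /* page numbers cached in the TLB */
    unsigned tlb_next;               /* round-robin TLB replacement */
    unsigned faults;                 /* number of page faults taken */
} early_mmu_t;

static bool tlb_hit(const early_mmu_t *m, uint64_t page)
{
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (m->tlb_valid[i] && m->tlb_page[i] == page)
            return true;
    }
    return false;
}

/* Early fault handler: make the tables map exactly the faulting address. */
static void early_page_fault(early_mmu_t *m, uint64_t addr)
{
    m->slot_page = addr >> PAGE_SHIFT;
    m->slot_valid = true;
    m->faults++;
}

/* Model of one memory access; the mapping is an identity mapping. */
static uint64_t early_access(early_mmu_t *m, uint64_t addr)
{
    uint64_t page = addr >> PAGE_SHIFT;

    if (!tlb_hit(m, page)) {
        if (!m->slot_valid || m->slot_page != page)
            early_page_fault(m, addr);
        assert(m->slot_valid && m->slot_page == page);
        /* The retried access walks the tables and caches the translation. */
        m->tlb_valid[m->tlb_next] = true;
        m->tlb_page[m->tlb_next] = page;
        m->tlb_next = (m->tlb_next + 1) % TLB_ENTRIES;
    }
    return addr;
}
```
]]></programlisting>

      <para>In this model, a second access to a page that is still cached
      does not fault even though the page table slot has since been reused for
      a different page.</para>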
    </section>

    <section>
      <title>Mapping of Physical Memory</title>

      <para>The amd64 ABI document describes several modes of program layout.
      The operating system kernel should be compiled in the
      <emphasis>kernel</emphasis> mode: the kernel is located in the negative
      2 gigabytes (0xffffffff80000000-0xffffffffffffffff) and can access data
      anywhere in the 64-bit space. On its own, this would not allow the
      kernel to directly see more than 2GB of physical memory. HelenOS
      therefore duplicates the virtual mapping of the physical memory starting
      at 0xffff800000000000 and accesses all external references using this
      address range.</para>
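
      <para>The address arithmetic implied by the duplicated mapping can be
      sketched as follows. The helper names <function>pa2ka</function> and
      <function>ka2pa</function> and the 2^47 bound are assumptions made for
      this illustration; only the base address 0xffff800000000000 comes from
      the text above.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Base of the duplicated physical memory mapping named in the text. */
#define PHYSMEM_BASE UINT64_C(0xffff800000000000)

/* Hypothetical helper: kernel virtual address for a physical address. */
static inline uint64_t pa2ka(uint64_t pa)
{
    /* Assumption: the window cannot extend past the top of the address space. */
    assert(pa < (UINT64_C(1) << 47));
    return PHYSMEM_BASE + pa;
}

/* Hypothetical inverse: physical address behind an address in the window. */
static inline uint64_t ka2pa(uint64_t ka)
{
    assert(ka >= PHYSMEM_BASE);
    return ka - PHYSMEM_BASE;
}
```
]]></programlisting>

      <para>For example, under these assumptions the physical address 0x1000
      would be reached through the virtual address
      0xffff800000001000.</para>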
    </section>

    <section>
      <title>Thread Local Storage</title>

      <para>The code accessing thread local storage uses the FS segment
      register as a base. The thread local storage address is stored in the
      hidden 64-bit part of the FS register, which must be written using
      privileged machine-specific instructions. A special syscall to change
      this register is provided to user applications. The TLS address for this
      platform is expected to point just after the end of the thread local
      data. The application sometimes needs the real address of the thread
      local data in its address space, but it cannot read the base of the FS
      segment register. The solution is to add a self-reference address to the
      end of the thread local data, so that the application can read the
      address as %fs:0 (%gs:0 on ia32, where the GS register is used).</para>

      <figure float="1">
        <title>IA-32 &amp; AMD64 TLD</title>

        <mediaobject id="tldia32">
          <imageobject role="pdf">
            <imagedata fileref="images/tld_ia32.pdf" format="PDF" />
          </imageobject>

          <imageobject role="html">
            <imagedata fileref="images/tld_ia32.png" format="PNG" />
          </imageobject>

          <imageobject role="fop">
            <imagedata fileref="images/tld_ia32.svg" format="SVG" />
          </imageobject>
        </mediaobject>
      </figure>
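
      <para>The self-reference layout described in this section can be
      modelled in portable C. This is a minimal sketch, not the HelenOS code:
      the structure <function>tcb_t</function> and the functions
      <function>tls_alloc</function>, <function>tls_data</function> and
      <function>tls_free</function> are hypothetical names, and an ordinary
      pointer stands in for the segment base.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical control block placed right after the thread local data. */
typedef struct {
    void *self;   /* self-reference: holds the block's own address */
} tcb_t;

static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/*
 * Allocate data_size bytes of zeroed thread local data followed by the
 * control block. The returned pointer models the value loaded as the
 * segment base; the data ends exactly where it points.
 */
static tcb_t *tls_alloc(size_t data_size)
{
    /* Pad in front of the data so that the control block is aligned. */
    size_t padded = round_up(data_size, _Alignof(tcb_t));
    uint8_t *block = calloc(1, padded + sizeof(tcb_t));

    if (block == NULL)
        return NULL;

    tcb_t *tcb = (tcb_t *) (block + padded);
    tcb->self = tcb;
    return tcb;
}

/* Recover the start of the thread local data through the self-reference. */
static void *tls_data(const tcb_t *tcb, size_t data_size)
{
    assert(tcb->self == (const void *) tcb);
    return (uint8_t *) tcb->self - data_size;
}

static void tls_free(tcb_t *tcb, size_t data_size)
{
    free((uint8_t *) tcb - round_up(data_size, _Alignof(tcb_t)));
}
```
]]></programlisting>

      <para>Reading <function>tcb-&gt;self</function> in this model plays the
      role of reading offset 0 from the segment base: it yields an ordinary
      address from which the thread local data can be located.</para>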
    </section>

    <section>
      <title>Fast SYSCALL/SYSRET Support</title>

      <para>System call entry was traditionally a performance problem on the
      ia32 architecture. The amd64 architecture supports the SYSCALL/SYSRET
      instructions. Upon encountering the SYSCALL instruction, the processor
      changes the privilege mode and transfers control to an address stored
      in a machine-specific register. Unlike other similar instructions, it
      does not switch to a known kernel stack; this must be done by the
      syscall entry routine. A hidden part of the GS register is provided to
      supply the entry routine with the data needed for switching to the
      kernel stack.</para>
    </section>

    <section>
      <title>Debugging Support</title>

      <para>To provide developers with tools for finding bugs, hardware
      breakpoints and watchpoints are supported. The kernel also supports
      self-debugging: it sets watchpoints on certain data and upon every
      modification automatically checks whether a correct value was written.
      It is worth mentioning that since this feature was implemented, the
      watchpoint has never fired.</para>
    </section>
  </section>

  <section>
    <title>Intel IA-32</title>

    <para>The ia32 architecture uses 4K pages and processor-supported 2-level
    page tables. Along with amd64, it is one of the two architectures that
    fully support SMP configurations. The architecture is mostly similar to
    amd64 and even shares a lot of code with it. The debugging support is the
    same as on amd64. The thread local storage uses the GS register.</para>
  </section>

  <section>
    <title>32-bit MIPS</title>

    <para>Both little-endian and big-endian kernels are supported. In order to
    test different page sizes, the mips32 page size was set to 16K. The mips32
    architecture is TLB-only, so the kernel simulates 2-level page tables. On
    processors that support it, lazy FPU context switching is
    implemented.</para>

    <section>
      <title>Thread Local Storage</title>

      <para>Thread local storage support in compilers is a relatively recent
      phenomenon. The standardization of such support for the mips32 platform
      is very new, and even the newest versions of GCC cannot generate 100%
      correct code. Because of some unusual MIPS processor variants, it was
      decided that the TLS pointer would not be read from one of the free
      registers; instead, a special instruction was devised, and the kernel is
      supposed to emulate it. HelenOS expects the TLS pointer to be in the K1
      register. Upon encountering the reserved instruction exception and
      checking that the application is requesting the TLS pointer, the kernel
      returns the contents of the K1 register. The K1 register is expected to
      point 0x7000 bytes after the beginning of the thread local data.</para>

      <figure float="1">
        <title>MIPS &amp; PowerPC TLD</title>

        <mediaobject id="tldmips">
          <imageobject role="pdf">
            <imagedata fileref="images/tld_mips.pdf" format="PDF" />
          </imageobject>

          <imageobject role="html">
            <imagedata fileref="images/tld_mips.png" format="PNG" />
          </imageobject>

          <imageobject role="fop">
            <imagedata fileref="images/tld_mips.svg" format="SVG" />
          </imageobject>
        </mediaobject>
      </figure>
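
      <para>The relation between the start of the thread local data and the
      thread pointer described in this section is a fixed bias of 0x7000
      bytes. The sketch below spells out the arithmetic; the names
      <function>tls_to_tp</function>, <function>tp_to_tls</function> and
      <function>tp_reaches</function> are hypothetical. The last helper
      reflects the commonly cited rationale for such a bias, namely that a
      signed 16-bit displacement from the thread pointer then covers more of
      the data; that rationale is an assumption here and is not stated in this
      document.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bias between the start of the thread local data and the thread pointer. */
#define TP_OFFSET 0x7000

/* Value expected in the thread pointer register for a given data start. */
static inline uintptr_t tls_to_tp(uintptr_t tls_start)
{
    return tls_start + TP_OFFSET;
}

/* Start of the thread local data for a given thread pointer value. */
static inline uintptr_t tp_to_tls(uintptr_t tp)
{
    assert(tp >= TP_OFFSET);
    return tp - TP_OFFSET;
}

/*
 * Assumption: can a byte at the given offset from the data start be
 * addressed with a signed 16-bit displacement from the thread pointer?
 */
static inline bool tp_reaches(long offset_from_start)
{
    long disp = offset_from_start - TP_OFFSET;
    return disp >= -0x8000 && disp <= 0x7fff;
}
```
]]></programlisting>

      <para>With this bias, a thread local data area starting at 0x10000
      corresponds to a thread pointer value of 0x17000.</para>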
    </section>

    <section>
      <title>Lazy FPU Context Switching</title>

      <para>Implementing lazy FPU switching on the MIPS architecture is
      straightforward. When coprocessor CP1 is disabled, any FPU instruction
      raises a Coprocessor Unusable exception. The exception handler then
      calls the generic lazy FPU context switch, which takes care of the
      correct context save/restore.</para>
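
      <para>The general idea of lazy FPU context switching can be shown with a
      small simulation. It is a sketch of the technique under stated
      assumptions, not the HelenOS code: the types
      <function>cpu_t</function> and <function>thread_t</function> and all
      function names are hypothetical, the <function>fpu_enabled</function>
      flag stands in for the CP1 enable bit, and an explicit call replaces the
      Coprocessor Unusable exception.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FPU_REGS 32

typedef struct thread {
    double saved[FPU_REGS];   /* FPU context saved in the thread */
} thread_t;

typedef struct {
    bool fpu_enabled;         /* models the CP1 enable bit */
    thread_t *fpu_owner;      /* thread whose context is live in the FPU */
    thread_t *current;        /* thread running on this CPU */
    double fpu[FPU_REGS];     /* models the hardware FPU registers */
} cpu_t;

/* A thread switch does not touch the FPU; it only disables it if needed. */
static void context_switch(cpu_t *cpu, thread_t *next)
{
    cpu->current = next;
    cpu->fpu_enabled = (cpu->fpu_owner == next);
}

/* Models the handler of the exception raised by the first FPU use. */
static void fpu_unusable(cpu_t *cpu)
{
    assert(cpu->current != NULL);
    if (cpu->fpu_owner != cpu->current) {
        if (cpu->fpu_owner != NULL)
            memcpy(cpu->fpu_owner->saved, cpu->fpu, sizeof(cpu->fpu));
        memcpy(cpu->fpu, cpu->current->saved, sizeof(cpu->fpu));
        cpu->fpu_owner = cpu->current;
    }
    cpu->fpu_enabled = true;
}

/* Every FPU access first takes the "exception" if the FPU is disabled. */
static void fpu_write(cpu_t *cpu, int reg, double value)
{
    if (!cpu->fpu_enabled)
        fpu_unusable(cpu);
    cpu->fpu[reg] = value;
}

static double fpu_read(cpu_t *cpu, int reg)
{
    if (!cpu->fpu_enabled)
        fpu_unusable(cpu);
    return cpu->fpu[reg];
}
```
]]></programlisting>

      <para>In this model the FPU context is saved and restored only when a
      different thread actually executes an FPU operation, so threads that
      never use the FPU cause no FPU traffic at all.</para>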
    </section>
  </section>

  <section>
    <title>PowerPC</title>

    <para>PowerPC allows the kernel to enable a mode in which data and
    instruction memory reads are not translated through the virtual memory
    mapping (<emphasis>real mode</emphasis>). Real mode is automatically
    enabled when an exception occurs. However, the kernel uses the same memory
    structure as on other 32-bit platforms: physical memory is mapped into
    the top 2GB, and userspace memory is available in the bottom half of the
    32-bit address space.</para>

    <section>
      <title>OpenFirmware Boot</title>

      <para>OpenFirmware loads an image of the HelenOS operating system and
      passes control to the HelenOS-specific boot loader. The boot loader then
      performs the following tasks:</para>

      <itemizedlist>
        <listitem>
          <para>Fetches information from OpenFirmware, such as the memory
          structure and device information.</para>
        </listitem>

        <listitem>
          <para>Switches memory mapping to the real mode.</para>
        </listitem>

        <listitem>
          <para>Copies the kernel to the proper physical address.</para>
        </listitem>

        <listitem>
          <para>Creates a basic memory mapping and switches to the new kernel
          mapping, in which the kernel can run.</para>
        </listitem>

        <listitem>
          <para>Passes control to the kernel <function>main_bsp</function>
          function.</para>
        </listitem>
      </itemizedlist>
    </section>

    <section>
      <title>Thread Local Storage</title>

      <para>On PowerPC, the R2 register holds an address that is 0x7000 bytes
      after the beginning of the thread local data. Overall, it is the same
      as on the MIPS architecture.</para>
    </section>
  </section>

  <section>
    <title>IA-64</title>

    <para>The ia64 kernel uses 16K pages.</para>

    <section>
      <title>Two IA-64 Stacks</title>

      <para>The architecture makes use of a pair of stacks. One stack is the
      ordinary memory stack, while the other is a special register stack. This
      makes the ia64 architecture unique. HelenOS on ia64 handles this by
      allocating two physical memory frames for thread and scheduler stacks.
      The upper frame is used by the register stack, while the lower frame is
      used by the conventional memory stack. The generic kernel and userspace
      code had to be adjusted to cope with the possibility of allocating more
      frames for the stack.</para>
    </section>

    <section>
      <title>Thread Local Storage</title>

      <para>Although thread local storage is not officially supported in
      statically linked binaries, GCC supports it without any major obstacles.
      The r13 register is used as the thread pointer; the thread local data
      section starts at address r13+16.</para>

      <para><figure float="1">
          <title>IA-64 TLD</title>

          <mediaobject id="tldia64">
            <imageobject role="pdf">
              <imagedata fileref="images/tld_ia64.pdf" format="PDF" />
            </imageobject>

            <imageobject role="html">
              <imagedata fileref="images/tld_ia64.png" format="PNG" />
            </imageobject>

            <imageobject role="fop">
              <imagedata fileref="images/tld_ia64.svg" format="SVG" />
            </imageobject>
          </mediaobject>
        </figure></para>
    </section>
  </section>
</appendix>