<?xml version="1.0" encoding="UTF-8"?>
<appendix id="archspecs">
  <?dbhtml filename="arch.html"?>

  <title>Architecture Specific Notes</title>

  <section>
    <title>AMD64/Intel EM64T</title>

    <para>The amd64 architecture is a 64-bit extension of the older ia32
    architecture. Only 64-bit applications are supported. Creating this port
    was relatively easy because it shares a lot of code with the ia32
    platform. However, the 64-bit extension has some specifics that made the
    porting interesting.</para>

    <section>
      <title>Virtual Memory</title>

      <para>The amd64 architecture uses the standard processor-defined 4-level
      page mapping with 4KB pages. The NX (no-execute) flag on individual
      pages is fully supported.</para>
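
      <para>As an illustration of the 4-level mapping, the following sketch
      splits a canonical 48-bit virtual address into four 9-bit table indices
      and a 12-bit page offset. The names <function>vaddr_split</function>
      and <type>vaddr_parts_t</type> are hypothetical and are not taken from
      the HelenOS sources.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12  /* 4KB pages: low 12 bits are the page offset */
#define INDEX_BITS  9   /* 512 entries per table at each level */
#define INDEX_MASK  ((UINT64_C(1) << INDEX_BITS) - 1)

/* Hypothetical helper type: one index per table level plus the offset. */
typedef struct {
    unsigned level[4];  /* level[0] indexes the top-level table */
    unsigned offset;    /* byte offset within the 4KB page */
} vaddr_parts_t;

vaddr_parts_t vaddr_split(uint64_t va)
{
    /* Canonical addresses have bits 63..47 all equal. */
    uint64_t top = va >> 47;
    assert(top == 0 || top == 0x1ffff);

    vaddr_parts_t p;
    for (int i = 0; i < 4; i++) {
        int shift = PAGE_SHIFT + INDEX_BITS * (3 - i);
        p.level[i] = (unsigned) ((va >> shift) & INDEX_MASK);
    }
    p.offset = (unsigned) (va & ((UINT64_C(1) << PAGE_SHIFT) - 1));
    return p;
}
```
]]></programlisting>

      <para>For example, splitting the address 0xffffffff80000000 yields the
      top-level index 511 and the second-level index 510, with all remaining
      parts equal to zero.</para>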
    </section>

    <section>
      <title>TLB-only Paging</title>

      <para>All memory accesses on the amd64 architecture go through the page
      mapping; if the kernel needs to access physical memory, a mapping must
      be created first. During the boot process, the boot loader creates a
      mapping for the first 20MB of physical memory. To correctly initialize
      the page mapping system, an identity mapping of the whole physical
      memory must be created. However, creating this mapping unavoidably
      requires allocating new, possibly unmapped, frames from the frame
      allocator. The ia32 port solves this by mapping the first 2GB of memory
      during the boot process. The same solution is infeasible on a 64-bit
      platform because of the size of the possible address space.</para>

      <para>As soon as the exception routines are initialized, a special page
      fault exception handler is installed, which provides a complete view of
      physical memory until the real page mapping system is initialized. It
      dynamically changes the page tables so that they always map exactly the
      faulting address. The translation is then cached in the TLB, and on the
      next page fault the same tables can be reused to handle another
      mapping.</para>
    </section>

    <section>
      <title>Mapping of Physical Memory</title>

      <para>The amd64 ABI document describes several modes of program layout.
      The operating system kernel should be compiled in the
      <emphasis>kernel</emphasis> mode: the kernel is located in the negative
      2 gigabytes (0xffffffff80000000-0xffffffffffffffff) and can access data
      anywhere in the 64-bit space. On its own, this would not allow the
      kernel to directly see more than 2GB of physical memory. HelenOS
      therefore duplicates the virtual mapping of physical memory starting at
      0xffff800000000000 and accesses all external references using this
      address range.</para>
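
      <para>The translation implied by the duplicated mapping can be sketched
      as a constant offset between physical and kernel virtual addresses. The
      helper names below are hypothetical; only the base address
      0xffff800000000000 comes from the text above.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Start of the duplicated mapping of physical memory (from the text). */
#define PHYS_MAP_BASE  UINT64_C(0xffff800000000000)

/* Size of the upper canonical half; the mapping cannot extend beyond it. */
#define PHYS_MAP_LIMIT (UINT64_C(1) << 47)

/* Hypothetical helper: physical address -> kernel virtual address. */
uint64_t phys_to_virt(uint64_t pa)
{
    assert(pa < PHYS_MAP_LIMIT);
    return PHYS_MAP_BASE + pa;
}

/* Hypothetical helper: kernel virtual address -> physical address. */
uint64_t virt_to_phys(uint64_t va)
{
    assert(va >= PHYS_MAP_BASE);
    return va - PHYS_MAP_BASE;
}
```
]]></programlisting>

      <para>With these helpers, the physical address 0x100000 would be
      accessed through the virtual address 0xffff800000100000.</para>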
    </section>

    <section>
      <title>Thread Local Storage</title>

      <para>The code accessing thread local storage uses the FS segment
      register as a base. The thread local storage address is stored in the
      hidden 64-bit part of the FS register, which must be written using
      privileged machine-specific instructions. A special syscall to change
      this register is provided to user applications. The TLS address for this
      platform is expected to point just after the end of the thread local
      data. The application sometimes needs to get the real address of the
      thread local data in its address space, but it is impossible to read
      the base of the FS segment register. The solution is to add a
      self-reference address to the end of the thread local data, so that the
      application can read the address as %fs:0.</para>
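
      <para>The layout described above can be sketched in portable C as
      follows. This is only an illustration under stated assumptions: the
      type <type>tcb_t</type> and the functions
      <function>tls_make</function>, <function>tls_data_start</function> and
      <function>tls_free</function> are hypothetical, and ordinary heap
      memory stands in for the real per-thread area whose address would be
      written to the FS base.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical control block placed right after the thread local data. */
typedef struct tcb {
    void *self;  /* self-reference; what the application reads at offset 0 */
} tcb_t;

/* Round the data size up so that the block behind it is pointer-aligned. */
static size_t tls_padded_size(size_t data_size)
{
    size_t align = sizeof(void *);
    return (data_size + align - 1) / align * align;
}

/* Allocate the data area plus control block; returns the thread pointer. */
tcb_t *tls_make(size_t data_size)
{
    size_t padded = tls_padded_size(data_size);
    unsigned char *block = calloc(1, padded + sizeof(tcb_t));
    if (block == NULL)
        return NULL;

    tcb_t *tcb = (tcb_t *) (block + padded);
    tcb->self = tcb;  /* points just after the end of the data */
    return tcb;
}

/* Recover the start of the thread local data via the self-reference. */
void *tls_data_start(const tcb_t *tcb, size_t data_size)
{
    assert(tcb->self == (const void *) tcb);
    return (unsigned char *) tcb->self - data_size;
}

void tls_free(tcb_t *tcb, size_t data_size)
{
    free((unsigned char *) tcb - tls_padded_size(data_size));
}
```
]]></programlisting>

      <para>The point of the self-reference is that the application never
      needs to read the segment base itself; it loads the pointer stored at
      offset 0 and subtracts the size of the thread local data.</para>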

      <figure float="1">
        <title>IA32 &amp; AMD64</title>

        <mediaobject id="tldia32">
          <imageobject role="pdf">
            <imagedata fileref="images/tld_ia32.pdf" format="PDF" />
          </imageobject>

          <imageobject role="html">
            <imagedata fileref="images/tld_ia32.png" format="PNG" />
          </imageobject>

          <imageobject role="fop">
            <imagedata fileref="images/tld_ia32.svg" format="SVG" />
          </imageobject>
        </mediaobject>
      </figure>
    </section>

    <section>
      <title>Fast SYSCALL/SYSRET Support</title>

      <para>The entry point for system calls was traditionally a performance
      problem on the ia32 architecture. The amd64 architecture supports the
      SYSCALL/SYSRET instructions. Upon encountering the SYSCALL instruction,
      the processor changes the privilege mode and transfers control to an
      address stored in a machine-specific register. Unlike other similar
      instructions, it does not switch to a known kernel stack; this must be
      done by the syscall entry routine. The hidden part of the GS register
      is provided to supply the entry routine with the data needed for
      switching to the kernel stack.</para>
    </section>

    <section>
      <title>Debugging Support</title>

      <para>To give developers tools for finding bugs, hardware breakpoints
      and watchpoints are supported. The kernel also supports self-debugging:
      it sets watchpoints on certain data and, upon every modification,
      automatically checks whether a correct value was written. It is worth
      mentioning that since this feature was implemented, the watchpoint has
      never fired.</para>
    </section>
  </section>

  <section>
    <title>Intel IA-32</title>

    <para>The ia32 architecture uses 4K pages and processor-supported 2-level
    page tables. Along with amd64, it is one of the two architectures that
    fully support SMP configurations. The architecture is mostly similar to
    amd64 and even shares a lot of code with it. The debugging support is the
    same as on amd64. Thread local storage uses the GS register.</para>
  </section>

  <section>
    <title>32-bit MIPS</title>

    <para>Both little-endian and big-endian kernels are supported. In order
    to test different page sizes, the mips32 page size was set to 16K. The
    mips32 architecture is TLB-only; the kernel simulates 2-level page
    tables. On processors that support it, lazy FPU context switching is
    implemented.</para>

    <section>
      <title>Thread Local Storage</title>

      <para>Thread local storage support in compilers is a relatively recent
      phenomenon. The standardization of such support for the mips32 platform
      is very new, and even the newest versions of GCC cannot generate 100%
      correct code. Because of some unusual MIPS processor variants, it was
      decided that the TLS pointer would not be read from one of the free
      registers; instead, a special instruction was devised, which the kernel
      is supposed to emulate. HelenOS expects the TLS pointer to be in the K1
      register. Upon encountering the reserved instruction exception and
      checking that the application is requesting the TLS pointer, the kernel
      returns the contents of the K1 register. The K1 register is expected to
      point 0x7000 bytes after the beginning of the thread local data.</para>
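
      <para>The emulation can be sketched as below. The text does not name
      the special instruction; the sketch assumes it is <literal>rdhwr $3,
      $29</literal> (encoding 0x7c03e83b), which returns its result in the V1
      register. The <type>istate_t</type> structure and both function names
      are hypothetical simplifications, and branch delay slots are
      ignored.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The thread pointer sits 0x7000 bytes past the start of the TLS data. */
#define TP_OFFSET       UINT32_C(0x7000)

/* Assumption: the emulated instruction is "rdhwr $3, $29". */
#define RDHWR_V1_HWR29  UINT32_C(0x7c03e83b)

/* Hypothetical, heavily simplified saved register state. */
typedef struct {
    uint32_t epc;  /* address of the faulting instruction */
    uint32_t v1;   /* register $3, destination of the emulated read */
    uint32_t k1;   /* kernel-reserved register holding the TLS pointer */
} istate_t;

/* Value the K1 register should hold for the given TLS data start. */
uint32_t tls_pointer_for(uint32_t tls_data_start)
{
    return tls_data_start + TP_OFFSET;
}

/*
 * Called from the reserved instruction exception handler. Returns true
 * if the instruction was a TLS pointer request and has been emulated.
 */
bool emulate_tls_read(istate_t *istate, uint32_t insn)
{
    assert(istate != NULL);
    if (insn != RDHWR_V1_HWR29)
        return false;

    istate->v1 = istate->k1;  /* hand the TLS pointer to the application */
    istate->epc += 4;         /* resume after the emulated instruction */
    return true;
}
```
]]></programlisting>

      <para>Any other reserved instruction is left untouched, so the caller
      can fall back to its regular error handling.</para>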

      <figure float="1">
        <title>MIPS &amp; PPC</title>

        <mediaobject id="tldmips">
          <imageobject role="pdf">
            <imagedata fileref="images/tld_mips.pdf" format="PDF" />
          </imageobject>

          <imageobject role="html">
            <imagedata fileref="images/tld_mips.png" format="PNG" />
          </imageobject>

          <imageobject role="fop">
            <imagedata fileref="images/tld_mips.svg" format="SVG" />
          </imageobject>
        </mediaobject>
      </figure>
    </section>

    <section>
      <title>Lazy FPU Context Switching</title>

      <para>Implementing lazy FPU switching on the MIPS architecture is
      straightforward. When coprocessor CP1 is disabled, any FPU instruction
      raises a Coprocessor Unusable exception. The generic lazy FPU context
      switch routine is then called, which takes care of saving and restoring
      the correct context.</para>
    </section>
  </section>

  <section>
    <title>PowerPC</title>

    <para>PowerPC allows the kernel to enable a mode in which data and
    instruction memory reads are not translated through the virtual memory
    mapping (<emphasis>real mode</emphasis>). Real mode is automatically
    enabled when an exception occurs. However, the kernel uses the same
    memory layout as on other 32-bit platforms: physical memory is mapped
    into the top 2GB, and userspace memory is available in the bottom half of
    the 32-bit address space.</para>

    <section>
      <title>OpenFirmware Boot</title>

      <para>OpenFirmware loads an image of the HelenOS operating system and
      passes control to the HelenOS-specific boot loader. The boot loader
      then performs the following tasks:</para>

      <itemizedlist>
        <listitem>
          <para>Fetches information from OpenFirmware regarding memory
          structure, device information etc.</para>
        </listitem>

        <listitem>
          <para>Switches memory mapping to the real mode.</para>
        </listitem>

        <listitem>
          <para>Copies the kernel to the proper physical address.</para>
        </listitem>

        <listitem>
          <para>Creates a basic memory mapping and switches to the new kernel
          mapping, in which the kernel can run.</para>
        </listitem>

        <listitem>
          <para>Passes control to the kernel <function>main_bsp</function>
          function.</para>
        </listitem>
      </itemizedlist>
    </section>

    <section>
      <title>Thread Local Storage</title>

      <para>PowerPC thread local storage uses the R2 register to hold an
      address that is 0x7000 bytes after the beginning of the thread local
      data. Overall, it is the same as on the MIPS architecture.</para>
    </section>
  </section>

  <section>
    <title>IA-64</title>

    <para>The ia64 kernel uses 16K pages.</para>

    <section>
      <title>Two IA-64 Stacks</title>

      <para>The architecture makes use of a pair of stacks. One stack is the
      ordinary memory stack, while the other is a special register stack.
      This makes the ia64 architecture unique. HelenOS on ia64 handles this
      by allocating two physical memory frames for thread and scheduler
      stacks. The upper frame is used by the register stack, while the first
      frame is used by the conventional memory stack. The generic kernel and
      userspace code had to be adjusted to cope with the possibility of
      allocating more than one frame for the stack.</para>
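
      <para>A minimal sketch of how the two initial stack pointers could be
      derived from such a two-frame area follows. It assumes that a frame is
      as large as the 16K page, that the memory stack grows downwards from
      the top of the first frame, and that the register stack backing store
      grows upwards from the start of the upper frame. The names and the
      16-byte scratch area below the top of the memory stack are
      illustrative assumptions, not taken from the HelenOS sources.</para>

      <programlisting language="c"><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

#define FRAME_SIZE    UINT64_C(16384)  /* assumed equal to the 16K page */
#define STACK_FRAMES  2                /* memory stack + register stack */
#define SCRATCH_AREA  16               /* assumed scratch space above sp */

/* Hypothetical pair of initial stack pointers for one thread. */
typedef struct {
    uint64_t sp;   /* memory stack pointer, grows towards lower addresses */
    uint64_t bsp;  /* register stack backing store, grows upwards */
} stack_pair_t;

stack_pair_t stack_pair_init(uint64_t base)
{
    /* The two-frame area is expected to be frame-aligned. */
    assert(base % FRAME_SIZE == 0);

    stack_pair_t s;
    s.sp = base + FRAME_SIZE - SCRATCH_AREA;  /* top of the first frame */
    s.bsp = base + FRAME_SIZE;                /* start of the upper frame */
    return s;
}
```
]]></programlisting>

      <para>Because the two stacks grow away from each other, neither can
      run into the other; each is limited only by the boundary of its own
      frame.</para>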
    </section>

    <section>
      <title>Thread Local Storage</title>

      <para>Although thread local storage is not officially supported in
      statically linked binaries, GCC supports it without any major obstacles.
      The r13 register is used as the thread pointer; the thread local data
      section starts at address r13+16.</para>

      <para><figure float="1">
          <title>IA64</title>

          <mediaobject id="tldia64">
            <imageobject role="pdf">
              <imagedata fileref="images/tld_ia64.pdf" format="PDF" />
            </imageobject>

            <imageobject role="html">
              <imagedata fileref="images/tld_ia64.png" format="PNG" />
            </imageobject>

            <imageobject role="fop">
              <imagedata fileref="images/tld_ia64.svg" format="SVG" />
            </imageobject>
          </mediaobject>
        </figure></para>
    </section>
  </section>
</appendix>
