[CentOS] Re: 32/48-bit virtual addressing in 20/32/36/52-bit physical addressing -- WAS: Memory Models and Multi/Virtual-Cores

Bryan J. Smith b.j.smith at ieee.org
Wed Jun 29 03:50:26 UTC 2005


On Tue, 2005-06-28 at 23:12 -0400, Peter Arremann wrote:
> Yes - what exactly was that page in what manual that says "PAE52" ?

AMD Architecture Programmer's Manual Volume 2:  System Programming

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf

Page 146:  

  "Currently, the x86-64 architecture defines a mechanism for
   translating 48-bit virtual addresses to 52-bit physical addresses.
   The mechanism used to translate a full 64-bit virtual address is
   reserved and will be described in a future x86-64 architectural
   specification."
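
To make the 48-bit translation quoted above a little more concrete, here
is a rough Python sketch of how a long-mode virtual address is carved up
for 4KiB pages: four 9-bit table indexes plus a 12-bit page offset
(9+9+9+9+12 = 48).  The function and key names are my own, purely for
illustration; only the bit positions come from the 4-level page walk the
manual describes.

```python
def is_canonical(vaddr: int) -> bool:
    """True if bits 63:47 of a 64-bit virtual address are all equal."""
    top = (vaddr >> 47) & 0x1FFFF   # the 17 bits from bit 47 upward
    return top == 0 or top == 0x1FFFF


def split_long_mode_vaddr(vaddr: int) -> dict:
    """Split a canonical virtual address into 4KiB page-walk fields.

    Field names are hypothetical labels, not taken from the manual.
    """
    if not is_canonical(vaddr):
        raise ValueError("non-canonical virtual address")
    return {
        "pml4": (vaddr >> 39) & 0x1FF,    # bits 47:39
        "pdp": (vaddr >> 30) & 0x1FF,     # bits 38:30
        "pd": (vaddr >> 21) & 0x1FF,      # bits 29:21
        "pt": (vaddr >> 12) & 0x1FF,      # bits 20:12
        "offset": vaddr & 0xFFF,          # bits 11:0
    }
```

Only 48 bits take part in the walk; the upper 16 bits just have to be a
sign extension of bit 47, which is what is_canonical() checks.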

Page 147:  

  "Setting CR4.PAE=1 enables virtual addresses to be translated
   into physical addresses up to 52 bits long. This is accomplished
   by doubling the size of paging data-structure entries from 32
   bits to 64 bits to accommodate the larger physical base
   addresses for physical-pages.
   PAE must be enabled before activating long mode."

48-bit virtual addressing is i386 onward:
  16-bit segment + 32-bit register

History ...

The 8086/8088 created a 20-bit "physical address" from a 32-bit "virtual
address" by creating a two's complement (strictly speaking, an
overlapping sum) from the 16-bit segment (bits 4-19) plus the 16-bit
offset (bits 0-15).  This process of getting a 20-bit physical address
from a 32-bit "virtual address" is known as "normalizing."  If the two's
complement carries past bit 19, the address overflows (on the 8086/8088
itself, it simply wraps around past 1MB back to 0).
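
For anyone who wants to play with the arithmetic, here is a minimal
Python sketch of the segment:offset calculation described above: the
segment is shifted left four bits, the offset is added, and the result
is truncated to the number of address lines.  The function name and the
address_lines parameter are my own, just for illustration.

```python
def real_mode_address(segment: int, offset: int, address_lines: int = 20) -> int:
    """Combine a 16-bit segment and 16-bit offset into a physical address.

    address_lines is a hypothetical knob: 20 models an 8086/8088, where
    anything carried past bit 19 is simply lost.
    """
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must fit in 16 bits")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    linear = (segment << 4) + offset          # can need up to 21 bits
    return linear & ((1 << address_lines) - 1)
```

Note that many segment:offset pairs land on the same physical address,
e.g. 1234:0010 and 1235:0000, which is why the overlap matters.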

[ SIDE NOTE:  A small "hack" called the A20 line could be used to give
another 16-bit 64KB "page" above 1MB, hence the common DOS location in
XMS memory, not between 640-1024KB, but _at_ 1024KB.  Long story short,
it's when the two's complement carries into A20, e.g. with the segment
at FFFFh (all 16 bits set) and an offset of 0010h or more.  This
results in the A20 line being set, and addresses from 1024KiB up to
1088KiB(-16) being available. ]
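
A quick Python sketch of that side note, assuming segment FFFFh and a
21st address line that can be either passed through or forced to zero.
The names here (linear_address, a20_enabled, HMA_FIRST, HMA_LAST) are
mine, only for illustration.

```python
A20_BIT = 1 << 20


def linear_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Segment:offset sum, with address bit 20 optionally forced to zero."""
    linear = (segment << 4) + offset
    if not a20_enabled:
        linear &= ~A20_BIT        # the carry into A20 is thrown away
    return linear


# With A20 enabled, segment FFFFh reaches just under 64KiB above 1MiB.
HMA_FIRST = linear_address(0xFFFF, 0x0010, a20_enabled=True)
HMA_LAST = linear_address(0xFFFF, 0xFFFF, a20_enabled=True)
HMA_SIZE = HMA_LAST - HMA_FIRST + 1   # 64KiB minus 16 bytes
```

With a20_enabled=False the very same segment:offset pairs wrap back to
the bottom of memory, which is the 8086/8088 behaviour old software
relied on.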

The i386 onward normalizes the 32-bit "physical address" from a 48-bit
"virtual address" by creating a two's complement from the 16-bit segment
(bits 20-35) plus the 32-bit offset (bits 0-31).  This 48-bit virtual ->
32-bit "normalizing" is radically sped up in the i486 TLB over the i386
(which does not have a TLB).  If the two's complement is greater than
bit 31 (32-bit), then an overflow, exception, etc... occurs.  There is
no A36 "hack" like there was for 8086/8088 though.

PAE is a mode in the Pentium Pro onward, supported in GTL+, that uses
the "4-bit overhang" (bits 32-35) of the 16-bit segment register to
address beyond 4GiB.  This would be, in 8086/8088 terminology, an
A32-A35 hack.  But it can only do so in a way that pages into under
32-bit.  This is known as PAE36 -- because it is a 36-bit Physical
Address Extension (PAE) model, not a linear address access above 32-bit.

PAE in x86-64 still uses the 48-bit virtual addressing approach.  The
name "Long Mode" comes from the fact that instead of creating a two's
complement of overlapping bits, the 16-bit segment and 32-bit offset are
concatenated "Long", with the MSB of the offset against the LSB of the
segment register.  So now 48-bit virtual addresses are actually 48 bits
wide, not "normalized" to 32-bit or, using PAE (36-bit), 36-bit.

Where 52-bit comes into play is in the paging.  The paging not only can
do these 48-bit "Long" addresses, but also legacy 32-bit and PAE (36-
bit) "Normalized" addresses.  It's a bit complex, but the _full_
compatibility of running _both_ 48-bit "long" programs and 32/36-bit
"normalized" programs in the same, compatible paging space is well
detailed in the manual (and the choice of 52-bit wasn't arbitrary).

This 48/52-bit mode requires PAE, designed for simultaneous, paging-
level compatibility with not only 32-bit, but also 36-bit (PAE36)
"normalized" programs.

[ SIDE NOTE:  A future x86-64 processor will offer full 64-bit virtual
addressing.  For now, it's 48-bit -- the "Long" version of a 32-bit
offset register with the legacy 16-bit segment register in front of it
-- _not_ overlapping/normalized. ]

[ SIDE NOTE:  A future x86-64 processor will offer full 52-bit
physical addressing.  For now, it's 40-bit -- i.e., the physical
limitation of the EV6 interconnect.  ;-> ]
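
To put the various bit widths in this thread side by side, a trivial
Python helper (my own, nothing official) that turns an address width
into the amount of memory it can reach:

```python
def address_space(bits: int) -> str:
    """Return the size of a 2**bits byte address space in binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    size = 1 << bits
    idx = 0
    # Divide by 1024 until the value is small or we run out of units.
    while size >= 1024 and idx < len(units) - 1:
        size //= 1024
        idx += 1
    return f"{size} {units[idx]}"


# The widths discussed in this post.
for width in (20, 32, 36, 40, 48, 52, 64):
    print(f"{width}-bit: {address_space(width)}")
```

So the gap between today's 40-bit physical limit and the architectural
52-bit limit is a factor of 4096.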

That's why we refer to it as PAE 52-bit, or PAE52 for short, to
differentiate it from a processor that only does PAE 36-bit, or PAE36
for short.  Most of the time, AMD calls it x86-64 PAE.

-- Bryan

P.S.  I also noted that the 4MiB Page Size actually allows up to 40-bit
addressing with 32-bit offsets.  I wonder if that's how the Xeon MP's
page table directly and linearly addresses up to 40-bit/1TiB?  As you
can see from the table on page 146, AMD still uses 4KiB paging (although
it _does_ offer a 2MiB paging mode -- which would definitely be GTL
_incompatible_).
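
A rough Python sketch of the 4MiB case from the P.S., assuming the
layout in the AMD manual for a legacy 4MiB page directory entry: bits
31:22 hold physical address bits 31:22, and bits 20:13 hold physical
address bits 39:32.  The function names are mine, and this says nothing
about what any particular Xeon actually does.

```python
def pse_4mib_base(pde: int) -> int:
    """Physical base of a 4MiB page from a 32-bit page directory entry.

    Assumes the AMD-style layout described in the lead-in above.
    """
    low = pde & 0xFFC00000          # PDE bits 31:22 -> address bits 31:22
    high = (pde >> 13) & 0xFF       # PDE bits 20:13 -> address bits 39:32
    return (high << 32) | low


def translate_4mib(pde: int, linear: int) -> int:
    """Combine the 4MiB page base with the low 22 bits of a linear address."""
    return pse_4mib_base(pde) | (linear & 0x3FFFFF)
```

The linear address is still only 32 bits wide; the extra 8 bits of
physical address come entirely from the page directory entry.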
