On Tue, 2005-06-28 at 23:12 -0400, Peter Arremann wrote:
Yes - what exactly was that page in what manual that says "PAE52" ?
AMD Architecture Programmer's Manual Volume 2: System Programming
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
Page 146:
"Currently, the x86-64 architecture defines a mechanism for translating 48-bit virtual addresses to 52-bit physical addresses. The mechanism used to translate a full 64-bit virtual address is reserved and will be described in a future x86-64 architectural specification."
Page 147:
"Setting CR4.PAE=1 enables virtual addresses to be translated into physical addresses up to 52 bits long. This is accomplished by doubling the size of paging data-structure entries from 32 bits to 64 bits to accommodate the larger physical base addresses for physical-pages. PAE must be enabled before activating long mode."
48-bit virtual addressing is i386 onward: 16-bit segment + 32-bit register
History ...
The 8086/8088 created a 20-bit "physical address" from a 32-bit "virtual address" by adding the 16-bit segment, shifted left 4 bits (bits 4-19), to the 16-bit offset (bits 0-15). This process of getting a 20-bit physical address from a 32-bit "virtual address" is known as "normalizing." If the sum carries past bit 19, the 8086/8088 has no address line for the carry, so the address wraps around to the bottom of memory.
[ SIDE NOTE: A small "hack" called the A20 line could be used to give another (almost) 64KiB "page" above 1MiB, hence the common DOS location in the High Memory Area (reached through the XMS driver), not between 640-1024KiB, but _at_ 1024KiB. Long story short, it's when the segment:offset sum carries into bit 20, e.g. segment FFFFh with an offset of 0010h or higher. With the A20 line enabled, the carry is kept instead of wrapping, and addresses of 1024KiB to 1088KiB(-16) become available. ]
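The 8086/8088 address arithmetic and the A20 carry described above can be sketched in a few lines of Python. The function name and the a20_enabled flag are hypothetical, made up for illustration only:

```python
def real_mode_address(segment: int, offset: int, a20_enabled: bool = False) -> int:
    """Form a physical address from an 8086-style segment:offset pair.

    a20_enabled is a hypothetical flag modelling whether the 21st
    address line (A20) is wired through or forced to zero.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both be 16-bit values")

    # Segment is shifted left 4 bits (bits 4-19), then the offset is added.
    linear = (segment << 4) + offset  # can reach 0x10FFEF, i.e. 21 bits

    if a20_enabled:
        return linear            # carry into bit 20 is kept
    return linear & 0xFFFFF      # only 20 address lines: wrap at 1MiB


if __name__ == "__main__":
    for seg, off in [(0xB800, 0x0000), (0xFFFF, 0x0010), (0xFFFF, 0xFFFF)]:
        wrapped = real_mode_address(seg, off)
        with_a20 = real_mode_address(seg, off, a20_enabled=True)
        print(f"{seg:04X}:{off:04X} -> {wrapped:06X} (A20 off), {with_a20:06X} (A20 on)")
```

With A20 off, FFFF:0010 lands back at address 0; with A20 on, the same pair reaches exactly 1024KiB, and FFFF:FFFF reaches the last byte below 1088KiB(-16).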
The i386 onward normalizes the 32-bit "physical address" from a 48-bit "virtual address" by creating a two's complement from the 16-bit segment (bits 20-35) plus the 32-bit offset (bits 0-31). This 48-bit virtual -> 32-bit "normalizing" is radically sped up in the i486 TLB over the i386 (which does not have a TLB). If the two's complement is greater than bit 31 (32-bit), then an overflow, exception, etc... occurs. There is no A36 "hack" like there was for the 8086/8088, though.
PAE is a mode in the Pentium Pro onward, supported in GTL+, that uses the "4-bit overhang" (bits 32-35) of the 16-bit segment register to address beyond 4GiB. This would be, in 8086/8088 terminology, an A32-A35 hack. But it can only do so in a way that pages into the space under 32-bit. This is known as PAE36 -- because it is a 36-bit Physical Address Extension (PAE) model, not linear address access above 32-bit.
PAE in x86-64 still uses the 48-bit virtual addressing approach. The name "Long Mode" comes from the fact that instead of creating a two's complement of overlapping bits, the 16-bit segment and 32-bit offset are connected "Long," with the MSB of the offset placed against the LSB of the segment register. So now 48-bit virtual addresses are actually 48 bits wide, not "normalized" to 32-bit or, using PAE (36-bit), 36-bit.
Where 52-bit comes into play is in the paging. The paging can handle not only these 48-bit "Long" addresses, but also legacy 32-bit and PAE (36-bit) "Normalized" addresses. It's a bit complex, but the _full_ compatibility of running _both_ 48-bit "long" programs and 32/36-bit "normalized" programs in the same, compatible paging space is well detailed in the manual (and the choice of 52-bit wasn't arbitrary).
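To make the 48-bit/52-bit paging numbers concrete, here is a minimal Python sketch of how a 48-bit virtual address could be split for 4KiB pages, and how a 64-bit paging entry leaves room for a physical base of up to 52 bits. The 9/9/9/9/12 bit split and the base field in bits 12-51 are my reading of the long-mode 4KiB paging layout; all function and constant names are made up for illustration, and this is not code from the manual:

```python
PAGE_OFFSET_BITS = 12                  # 4KiB pages
INDEX_BITS = 9                         # 512 entries of 64 bits per table
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1
PHYS_BASE_MASK = 0x000F_FFFF_FFFF_F000  # bits 12-51 of a 64-bit entry


def split_virtual_address(va: int) -> dict:
    """Split a raw 48-bit virtual address into four table indexes and an offset."""
    if not 0 <= va < (1 << 48):
        raise ValueError("expected a 48-bit virtual address")
    return {
        "level4": (va >> 39) & INDEX_MASK,  # bits 39-47
        "level3": (va >> 30) & INDEX_MASK,  # bits 30-38
        "level2": (va >> 21) & INDEX_MASK,  # bits 21-29
        "level1": (va >> 12) & INDEX_MASK,  # bits 12-20
        "offset": va & OFFSET_MASK,         # bits 0-11
    }


def physical_address(entry: int, va: int) -> int:
    """Combine the physical base in a 64-bit entry with the page offset of va."""
    return (entry & PHYS_BASE_MASK) | (va & OFFSET_MASK)


if __name__ == "__main__":
    print(split_virtual_address(0x0000_7FFF_FFFF_F123))
    # Hypothetical entry: top bit set, low flag bits set, all base bits set.
    pa = physical_address(0x800F_FFFF_FFFF_F067, 0x123)
    print(hex(pa), pa.bit_length(), "bits")
```

Four 9-bit indexes plus a 12-bit offset add up to the 48 virtual bits, and the widest physical address the sketch can produce is 52 bits long, which matches the two figures quoted from the manual.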
This 48/52-bit mode requires PAE, designed for simultaneous, paging-level compatibility with not only 32-bit, but also 36-bit (PAE36) "normalized" programs.
[ SIDE NOTE: A future x86-64 processor will offer full 64-bit virtual addressing. For now, it's 48-bit -- the "Long" version of a 32-bit offset register with the legacy 16-bit segment register in front of it -- _not_ overlapping/normalized. ]
[ SIDE NOTE: A future x86-64 processor will offer full 52-bit physical addressing. For now, it's 40-bit -- i.e., the physical limitation of the EV6 interconnect. ;-> ]
That's why we refer to it as PAE 52-bit, or PAE52 for short, to differentiate it from a processor that only does PAE 36-bit, or PAE36 for short. Most of the time, AMD calls it x86-64 PAE.
-- Bryan
P.S. I also noted that the 4MiB Page Size actually allows up to 40-bit addressing with 32-bit offsets. I wonder if that's how the Xeon MP's page table directly and linearly addresses up to 40-bit/1TiB? As you can see from the table on page 146, AMD still uses 4KiB paging (although it _does_ offer a 2MiB paging mode -- which would definitely be GTL _incompatible_).