[CentOS] [OT] Memory Models and Multi/Virtual-Cores -- WAS: 4.0 -> 4.1 update failing

Sat Jun 25 22:45:33 UTC 2005
Bryan J. Smith <b.j.smith@ieee.org> <thebs413 at earthlink.net>

From: Peter Arremann <loony at loonybin.org>
> Actually apps like the one I was referring to showed about 50%
> single thread performance gain when going from a 2.4GHZ Xeon
> to a 2.2GHz Opteron.

That's not a good comparison, because the ALU and control logic of the
Athlon is about 50% faster, MHz for MHz, than the P4-Xeon's.  So that gain
could easily be a computational benefit and not an interconnect one.

> Never assume you've done more than others either :-)

I didn't.

> I've done the more difficult job with finding all the applicable documents

Dude, you did _not_ send me any documentation I have not already seen.
I have been using developer.intel.com for years, both in my semiconductor
design career and as more of an IT-level system designer.

The problem here is that you can't seem to understand that a trace does
_not_ indicate how something actually works in the memory logic.  E.g.,
the existence of a trace does not tell you whether a normalized address that
results in a PAE36 (>4GiB) address is going to either:
  A)  Directly drive that trace and fetch for the process to use
  B)  Be intercepted by the paging logic, referenced in a page table,
       then _that_ logic actually fetches the memory, which is then mapped
       into <4GiB for the process to use

It's not just the simple trace, and it's not just the simple, programmer-level
logic.  If Intel PAE36 processors didn't have address lines above bit 31,
they couldn't address above 4GiB at all!  But just because they have those
traces doesn't mean they can directly use them.
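
For reference, here is a rough sketch of the paging path (case B above) as
the architecture documents it: a 32-bit linear address is split into 2 + 9 +
9 + 12 bits and walked through tables whose 64-bit entries can hold a 36-bit
physical frame address.  It is only the architectural model, not a claim
about what the memory logic inside any given chip does.  The names
(split_linear, translate, the tables dict) and all table contents are made up
for illustration.

```python
# Sketch of a PAE-style page walk with 4 KiB pages.
# All table contents below are hypothetical, chosen only for illustration.

ADDR_MASK = 0xFFFFFF000   # bits 12..35 of an entry: 36-bit physical base
PRESENT = 0x1             # bit 0 of an entry: present flag


def split_linear(linear):
    """Split a 32-bit linear address into PAE indices (2 + 9 + 9 + 12 bits)."""
    pdpt_idx = (linear >> 30) & 0x3    # page-directory-pointer index
    pd_idx = (linear >> 21) & 0x1FF    # page-directory index
    pt_idx = (linear >> 12) & 0x1FF    # page-table index
    offset = linear & 0xFFF            # byte offset within the page
    return pdpt_idx, pd_idx, pt_idx, offset


def translate(linear, pdpt, tables):
    """Walk the tables and return the physical address for `linear`.

    pdpt   -- list of 4 entries (page-directory-pointer table)
    tables -- dict: physical base of a table -> its list of 512 entries
    """
    pdpt_idx, pd_idx, pt_idx, offset = split_linear(linear)

    pdpte = pdpt[pdpt_idx]
    if not pdpte & PRESENT:
        raise LookupError("page directory not present")

    pde = tables[pdpte & ADDR_MASK][pd_idx]
    if not pde & PRESENT:
        raise LookupError("page table not present")

    pte = tables[pde & ADDR_MASK][pt_idx]
    if not pte & PRESENT:
        raise LookupError("page not present")

    # The frame base may lie above 4GiB; the process only ever sees `linear`.
    return (pte & ADDR_MASK) | offset


# Hypothetical setup: one page directory, one page table, one mapped page.
page_dir = [0] * 512
page_tbl = [0] * 512
page_dir[1] = 0x2000 | PRESENT
page_tbl[1] = 0x123456000 | PRESENT      # frame above the 4GiB line
pdpt = [0, 0x1000 | PRESENT, 0, 0]
tables = {0x1000: page_dir, 0x2000: page_tbl}

phys = translate(0x40201ABC, pdpt, tables)
print(hex(phys))
```

The process-visible address stays below 4GiB while the frame it resolves to
does not, which is why the presence of the upper address lines alone says
nothing about whether a process can drive them directly.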

First you looked at it from the "programmer" level, then you looked at it
from the "board technician" level.  Now I'm telling you to _get_in_the_chip_!

> while you just put out hearsay and "doesn't work like that" statements
> without really backing it up. And yes, IA-32 _can_ address more than
> 4GB - called PAE.

Exactly!  What I'm telling you is that _all_ Athlon "32-bit" processors can
bypass PAE36 and significantly _increase_ performance at the page table
level.  But it requires the platform be configured in a way that is
_incompatible_ with all OSes _except_ Linux with a hack.  This is rare, but
it has been done on a few mainboards.



--
Bryan J. Smith   mailto:b.j.smith at ieee.org