i sit corrected..:)
Hilliard, Jay wrote:
Robin Mordasiewicz wrote:
How can I tell if a new xeon is 32 bit or 64 bit?
If it's a new G4 (Generation 4), then it supports EM64T
-Jay
On Thu, 16 Jun 2005 at 1:36, Hilliard, Jay wrote:
Robin Mordasiewicz wrote:
How can I tell if a new xeon is 32 bit or 64 bit?
If it's a new G4 (Generation 4), then it supports EM64T
-Jay
As a proof:
http://h18004.www1.hp.com/products/servers/proliantdl380/
Alexander
On Thu, 16 Jun 2005, Alexander Dalloz wrote:
On Thu, 16 Jun 2005 at 1:36, Hilliard, Jay wrote:
Robin Mordasiewicz wrote:
How can I tell if a new xeon is 32 bit or 64 bit?
If it's a new G4 (Generation 4), then it supports EM64T
As a proof: http://h18004.www1.hp.com/products/servers/proliantdl380/
ok I see from the HP site then that it appears that all new servers are 64 bit.
<snip http://h18004.www1.hp.com/products/servers/proliantdl380/>
Performance
* Intel(R) Xeon processors with EM64T, 800 MHz FSB and 2MB L2 Cache
* 400MHz DDR2 Memory, 6 sockets, 12GB Max
* Ultra 320 Smart Array 6i w/transportable BBWC (128MB) option
</snip>
I will download the ia64 bit version then.
You want the x86_64 version, not ia64. EM64T is AMD64 compatible. Xeons aren't compatible with ia64.
-Jay
On Wed, 15 Jun 2005, Hilliard, Jay wrote:
I will download the ia64 bit version then.
You want the x86_64 version, not ia64. EM64T is AMD64 compatible. Xeons aren't compatible with ia64.
oh, ok. I just torrented the ia64. I am still blown away at how short a time it takes to download a dvd iso. This is good because the rest of my white box servers are amd_64 bit. Thanks.
On Wed, 2005-06-15 at 19:59 -0400, Robin Mordasiewicz wrote:
On Thu, 16 Jun 2005, Alexander Dalloz wrote:
On Thu, 16 Jun 2005 at 1:36, Hilliard, Jay wrote:
Robin Mordasiewicz wrote:
How can I tell if a new xeon is 32 bit or 64 bit?
If it's a new G4 (Generation 4), then it supports EM64T
As a proof: http://h18004.www1.hp.com/products/servers/proliantdl380/
ok I see from the HP site then that it appears that all new servers are 64 bit.
<snip http://h18004.www1.hp.com/products/servers/proliantdl380/>
Performance
* Intel(R) Xeon processors with EM64T, 800 MHz FSB and 2MB L2 Cache
* 400MHz DDR2 Memory, 6 sockets, 12GB Max
* Ultra 320 Smart Array 6i w/transportable BBWC (128MB) option
</snip>
I will download the ia64 bit version then.
No, you'll download the x86_64 version.
On Wed, 2005-06-15 at 19:59 -0400, Robin Mordasiewicz wrote:
I will download the ia64 bit version then.
On Wed, 2005-06-15 at 18:04 -0700, Hilliard, Jay wrote:
You want the x86_64 version, not ia64.
On Wed, 2005-06-15 at 21:06 -0400, Ignacio Vazquez-Abrams wrote:
No, you'll download the x86_64 version.
As everyone has mentioned, Intel Xeons are IA-32e aka "x86-64 compatible," so they run binaries built for the "x86_64" target.
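To make the naming in this thread concrete, here is a minimal Python sketch that maps the CPU extension names used above to the architecture label of the ISO to download. The table and the `distro_arch` helper are hypothetical illustrations, not part of any CentOS tool.

```python
# Hypothetical lookup table: CPU extension name -> distro architecture label.
# The names on the left are the ones used in this thread; nothing here is
# taken from a real CentOS utility.
ARCH_BY_EXTENSION = {
    "AMD64": "x86_64",   # Opteron, Athlon 64
    "EM64T": "x86_64",   # Intel's AMD64-compatible extension in newer Xeons
    "IA-32e": "x86_64",  # another name used in this thread for the same mode
    "IA-64": "ia64",     # Itanium only; a different instruction set
    "IA-32": "i386",     # plain 32-bit x86
}


def distro_arch(extension: str) -> str:
    """Return the architecture label for a CPU extension name."""
    try:
        return ARCH_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(f"unknown CPU extension: {extension!r}")


if __name__ == "__main__":
    print(distro_arch("EM64T"))
```

The point of the table is simply that EM64T and AMD64 land on the same label, while IA-64 does not.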
From a programmer's perspective, both x86-64 (AMD64) and IA-32e (EM64T) use PAE52 (52-bit virtual addresses into 48-bit "Long Mode" addressing) with a 64-bit ALU and registers (there are some 128-bit XMM registers, but that's another story).
EM64T is AMD64 compatible
<Anal>EM64T is compatible with a _subset_ of AMD64</Anal>
EM64T still currently runs on Intel 32-bit AGTL+ interconnect. There are some serious limitations to the physical interconnect _outside_ the CPU -- both the "memory hub" approach as well as legacy AGTL+ addressing issues, especially for I/O. The "dumb hub" and 32-bit addressing limitations are why EM64T processors lack some serious features of AMD64, like the I/O Memory Management Unit (MMU).
AMD64 is based on Digital 40-bit EV6 interconnect. It is capable of safe memory addressing up to 40-bit/1.1TB (1TiB) for _both_ programs _and_ memory mapped I/O. Intel proponents downplay this feature, but it's very much a major difference in the Linux/x86-64 kernel.
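The "40-bit/1.1TB (1TiB)" figure is plain arithmetic on the number of address bits. A small Python sketch of that calculation follows; the helper names are made up for illustration.

```python
def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 2 ** bits


def describe(bits: int) -> str:
    """Format an address-space size in both binary (TiB) and decimal (TB) units."""
    size = address_space_bytes(bits)
    tib = size / 2 ** 40   # binary tebibytes
    tb = size / 10 ** 12   # decimal terabytes
    return f"{bits}-bit: {size} bytes = {tib:g} TiB = {tb:.2f} TB"


if __name__ == "__main__":
    # 32-bit and 36-bit are included only for comparison with 40-bit.
    for bits in (32, 36, 40):
        print(describe(bits))
```

For 40 bits this gives 1,099,511,627,776 bytes, which is exactly 1 TiB and roughly 1.1 decimal terabytes.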
Xeon's aren't compatible with ia64
IA-64 is a completely different architecture, byte code, etc...
<flammatory>IA-64 deploys CS ideals such as EPIC and Predication, things that are designed to address limitations with optimizing machine code at the compiler-only. The reality is that machine code (like boolean logic, long story) are legacy 1970s concepts for integrated circuits designed by CS majors, before physicists and engineers took over in the 1980s. You can't optimize math and algorithmic approaches that aren't ideal at the semiconductor-level anyway, and run-time optimizations in the processor design itself are the best way along with an optimizing compiler that leverages those tricks (especially in the x86 future of virtual cores of virtual, out-of-order PAE36/PAE52 machines).</flammatory>
On Wed, 2005-06-15 at 21:17 -0400, Robin Mordasiewicz wrote:
oh, ok. I just torrented the ia64.
As others said, ain't gonna run. ;->
I am still blown away at how short a time it takes to download a dvd iso. This is good because the rest of my white box servers are amd_64 bit.
AMD64 has the advantage of being the only commodity PC platform with a partial mesh of interconnects as well as being truly beyond the 32-bit issues at the board-level as well as the register.
Ironically, it looks like Linux isn't the best server OS for AMD64 right now (as much as we don't like to admit it). Linux is rather immature when it comes to platforms with multiple interconnects, whereas traditional RISC/UNIX platforms have been perfecting their kernels for a good decade.
On Wednesday 15 June 2005 22:10, Bryan J. Smith wrote:
EM64T is AMD64 compatible
<Anal>EM64T is compatible with a _subset_ of AMD64</Anal>
EM64T still currently runs on Intel 32-bit AGTL+ interconnect. There are some serious limitations to the physical interconnect _outside_ the CPU -- both the "memory hub" approach as well as legacy AGTL+ addressing issues, especially for I/O. The "dumb hub" and 32-bit addressing limitations are why EM64T processors lack some serious features of AMD64, like the I/O Memory Management Unit (MMU).
<even more anal>Except the iommu, those are limitations of chipset, bus and whatever, not EM64T.</even more anal>
AMD64 is based on Digital 40-bit EV6 interconnect. It is capable of safe memory addressing up to 40-bit/1.1TB (1TiB) for _both_ programs _and_ memory mapped I/O. Intel proponents downplay this feature, but it's very much a major difference in the Linux/x86-64 kernel.
EV6 is what slot/socket A was all about... The Athlon64 and Opteron (which just happen to implement the AMD64 instruction set) use HyperTransport. The instruction set has nothing to do with the interconnect - PowerPC and a whole bunch of other very use specific chips use hypertransport.
Xeon's aren't compatible with ia64
IA-64 is a completely different architecture, byte code, etc...
<flammatory>IA-64 deploys CS ideals such as EPIC and Predication, things that are designed to address limitations with optimizing machine code at the compiler-only. The reality is that machine code (like boolean logic, long story) are legacy 1970s concepts for integrated circuits designed by CS majors, before physicists and engineers took over in the 1980s. You can't optimize math and algorithmic approaches that aren't ideal at the semiconductor-level anyway, and run-time optimizations in the processor design itself are the best way along with an optimizing compiler that leverages those tricks (especially in the x86 future of virtual cores of virtual, out-of-order PAE36/PAE52 machines).</flammatory>
Nice flame - but it has very little to do with the real world. The IA64 architecture got the basics right... The reasons for the low performance of Itanium chips (low as in real world performance compared to what it could do theoretically) are the immaturity of the chip (not nearly as tweaked as a P4 is) and platform (slow memory and then you expect great benchmark scores?) as well as some really really really stupid decisions. Like allocating too few bits for the template... but these things are simply bad decisions on how to implement it, not something wrong with VLIW architectures in general. In fact, if you look at the Itanium chips, they are very RISC-like. To the point where a lot of guys say it's a RISC core with a VLIW decoder in front of it... and that the VLIW decoder happened to be the main issue is, at least to me, hysterical.
Peter.
On Wed, 2005-06-15 at 22:39 -0400, Peter Arremann wrote:
<even more anal>Except the iommu, those are limitations of chipset, bus and whatever, not EM64T.</even more anal>
Yes and no. In fact, it has to do with the fact that Intel is still relying on a chipset to do what most everyone else is doing at the CPU interconnect. Even the original Athlon MP moved many details into the CPU. Much of this was forced by the crossbar switch of Alpha EV6, because the CPU can't be segmented from the interconnect aspects if you use multiple connections.
The legacy concept that the CPU is independent of the interconnect is a viewpoint held largely only by Intel today. When you say "chipset" -- the context is completely different between AMD and Intel. In AMD, the "chipset" is rather generic and largely glueless. With Intel, you can only have *1* point between the CPU and the "memory hub."
EV6 is what slot/socket A was all about... The Athlon64 and Opteron (which just happen to implement the AMD64 instruction set) use HyperTransport.
They only use HyperTransport as a generic transport between other HyperTransport devices -- be it another CPU or HyperTransport tunnel/bridge. But the addressing to both memory as well as virtualized over HT in the AMD64 platform is very much 40-bit EV6.
In other words, EV6 is at the heart of addressing outside the AMD64 CPU, just like on 32-bit Athlon before it (which was also capable of 40-bit, at least in the Athlon MP, long story).
The instruction set has nothing to do with the interconnect - PowerPC and a whole bunch of other very use specific chips use hypertransport.
Not true. PowerPC implementations, like the 970, that use HyperTransport do _not_ use it to the CPU. They still use an Intel like "memory hub" and single-point-of-contention. I.e., they only use it as a system interconnect for I/O, but not CPU itself. They use their own bus for CPU/memory.
Same deal for inter-bridge connections between chipsets in even AGTL+ platform like in nVidia and SiS chipsets. The value of HyperTransport is not realized. There is still only a _single_ point on interconnect to the CPU(s).
Athlon64/Opteron is the first, commodity platform to bring something of "partial mesh" to the system interconnect.
Nice flame - but it has very little to do with the real world. The IA64 architecture got the basics right...
I disagree entirely. The concept of optimizing the organization of machine code is based on the premise that machine code is the best way to execute instructions in silicon. That concept has been considered flawed for a long time, since the '80s. But because machine code is how everything is developed in software (even if at higher levels), that's why RISC came about. The idea to optimize the machine code for silicon considerations and run-time optimization in the chip, then hiding the added burden of the eccentric instruction set in the C compiler.
With EPIC, Intel merely thought it could do away with run-time optimization in the chip, and parallelize 3 instructions in the instruction word, to take RISC's typical 60% stage utilization closer to 100%. The reality is that unless you are parallelizing to the depth of the superscalar design in silicon, then it's rather self-defeating. I.e., you've gotta turn the _entire_ programmer world upside down and get them to think like IC design engineers (not likely). And trying to do it at _only_ the compiler was just ludicrous IMHO (and makes me wonder if Intel is full of CS majors and not EEs anymore ;-). Sorry, but the reality is that you can't keep the pipes full with the approach _regardless_ of what tricks you play with the opcode+operand machine code -- it's inherent to the flaw of sending instructions to the processor in the traditional machine code string. RISC with a combination of run-time and compile-time optimization is as good as it gets, and not some CS ideal to somehow make machine code "better."
Then let's talk Predication. The other side of the concept that RISC only keeps 60% of the pipes full anyway, so we can use those extra cycles to execute both paths and just forget branch prediction and any logic dedicated to it. An analogy of this is like trying to solve the problem of DRAM read latency by adding more DRAM channels but chucking the SRAM cache because it's too costly. Sure, you're going to save on the transistor logic, but you're just going to have more overhead and the same, increased latency in the end. The chance of a branch mispredict and stall is rather small, just like a SRAM cache miss, so it's worth it to keep branch prediction around, just like SRAM cache.
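The trade-off argued above can be put into a toy cost model. The Python sketch below is a deliberately simplistic illustration: the function names, cycle counts, and mispredict rate are all assumptions, and it ignores that predicated paths may run in issue slots that would otherwise sit idle.

```python
def predicted_cost(path_a, path_b, p_taken, mispredict_rate, penalty):
    """Expected cycles when only the predicted path runs and each
    mispredict pays a pipeline-flush penalty."""
    useful = p_taken * path_a + (1 - p_taken) * path_b
    return useful + mispredict_rate * penalty


def predicated_cost(path_a, path_b):
    """Cycles when both paths are always issued and one result is
    discarded (assumes the two paths do not overlap at all)."""
    return path_a + path_b


if __name__ == "__main__":
    # Assumed numbers: two 10-cycle paths, a 5% mispredict rate, a 20-cycle flush.
    print(predicted_cost(10, 10, 0.5, 0.05, 20))
    print(predicated_cost(10, 10))
```

With these made-up inputs prediction averages about 11 cycles against 20 for executing both paths. The gap narrows as paths get shorter or mispredicts get more frequent, which is the regime where predication is usually argued to help.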
The reasons for the low performance of Itanium chips (low as in real world performance compared to what it could do theoretically) are the immaturity of the chip (not nearly as tweaked as a P4 is) and platform (slow memory and then you expect great benchmark scores?) as well as some really really really stupid decisions.
I'm not even looking at P4, but comparing EPIC to RISC of the same technology.
If "EPIC" and "Predication" are so good, why are they retrofitting run-time optimization and traditional branch prediction back into the design?
That's exactly what the Digital Semiconductor team predicted the IA-64 design teams would have to do -- predicted way back in 1997 -- years before the first IA-64 Itanium hit silicon. They explicitly stated that the concept of compiler-time-only optimization was never going to work. Intel should have listened. After all, Digital Semi basically invented every major interconnect in the '90s, as well as showed Intel how to fix their superscalar ALU in the Pentium Pro from the Pentium (hence the resulting lawsuit later).
Even Itanium2 does not compete well with the aged Alpha 264 at an older, larger feature size (much less the new Alpha 364 at a newer one) at its own, native instruction set. It doesn't have anything to do with memory or other technology adoption -- heck, Alpha has been well behind Itanium in getting the silicon fabrication technology and it still competes very well.
And probably the biggest insult to Itanium was the fact that Digital's Binary Translation technology from the Alpha has been adopted for Itanium. Why? Because it emulates PA-RISC and x86 _faster_ than the IA-64 can do in hardware. Digital has always been right on everything from RISC to interconnects. It's much better to build an anal RISC architecture, and then translate from one byte code to another (of the same OS), than try to build byte code compatibility in the architecture.
I'm sure IA-64 Itanium3 will benefit from completely chucking x86 and PA-RISC in the hardware thanx to Digital's technology.
Like allocating too few bits for the template... but these things are simply bad decisions on how to implement it, not something wrong with VLIW architectures in general.
Oh, I believe very much in VLIW architectures. Transmeta's design was an excellent example.
But HP-Intel's concept of pure, compiler-side optimization in EPIC and Predication was a CS ideological fantasy. And Digital Semiconductor predicted its monumental _failure_. MDR Microprocessor Forum even opened a few years ago with a "Twilight Zone" hindsight theme where Intel decided to forget EPIC and adopt Alpha, and they were just about to release the Alpha 364 (with all the leading-edge Intel fab technology -- damn that would be tasty!).
But IA-64 is too far developed to drop now. It has already replaced Alpha because Alpha has no future beyond 364, let alone isn't designed for the latest fab technologies.
In fact, if you look at the Itanium chips, they are very RISC-like. To the point where a lot of guys say it's a RISC core with a VLIW decoder in front of it... and that the VLIW decoder happened to be the main issue is, at least to me, hysterical.
It is the reliance on the compiler-only for optimization. It's like Intel picked a half-ass point between RISC and VLIW and said, let's merge these concepts and rely entirely on compile-time optimizations. I honestly don't know what they were thinking with EPIC -- let alone that's before we even look at Predication. As I said, it's like using more DRAM channels and chucking SRAM cache because it takes up a lot of transistors -- the reality is that 95%+ SRAM cache hits are your "best bang for the buck."
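The "95%+ SRAM cache hits" remark corresponds to the usual average-memory-access-time formula (hit time plus miss rate times miss penalty). A minimal Python sketch follows; the latencies are assumed round numbers, not measurements of any chip discussed here.

```python
def average_access_ns(hit_rate, cache_ns, dram_ns):
    """Average memory access time: every access pays the cache lookup,
    and misses additionally pay the DRAM latency."""
    miss_rate = 1.0 - hit_rate
    return cache_ns + miss_rate * dram_ns


if __name__ == "__main__":
    # Assumed latencies: 2 ns for SRAM cache, 60 ns for DRAM.
    with_cache = average_access_ns(0.95, 2.0, 60.0)
    without_cache = 60.0  # every access goes straight to DRAM
    print(f"with cache: {with_cache:.1f} ns, without: {without_cache:.1f} ns")
```

Under these assumptions a 95% hit rate brings the average down to about 5 ns, while adding DRAM channels raises bandwidth without lowering the 60 ns latency of each access.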
On Wed, 2005-06-15 at 22:21 -0500, Bryan J. Smith wrote:
Yes and no. In fact, it has to do with the fact that Intel is still relying on a chipset to do what most everyone else is doing at the CPU interconnect ... Not true. PowerPC implementations, like the 970, that use HyperTransport do _not_ use it to the CPU. They still use an Intel like "memory hub" and single-point-of-contention. I.e., they only use it as a system interconnect for I/O, but not CPU itself. They use their own bus for CPU/memory. Same deal for inter-bridge connections between chipsets in even AGTL+ platform like in nVidia and SiS chipsets. The value of HyperTransport is not realized. There is still only a _single_ point on interconnect to the CPU(s). Athlon64/Opteron is the first, commodity platform to bring something of "partial mesh" to the system interconnect.
Other than the IA-64 commentary, this may not seem completely Linux-related, but it goes to the _heart_ of building _quality_ Linux/x86-64 servers that have excellent I/O throughput and, to a lesser extent, latency considerations.
Here is what Intel MCH architecture looks like: http://www.samag.com/documents/sam0411b/0411b_f2.htm
This is also what PowerPC 970 also looks like -- but it uses its own system interconnect instead of AGTL+ between CPUs/memory. It does _not_ talk HyperTransport. It's basically like having an nVidia or SiS chipset on an Intel platform, the HyperTransport is only on "one side" of the MCH.
Athlon MP / Alpha 264 EV6 is also a variant of this. But instead of a "hub" between CPU, memory and I/O, there is a 3-16 port "switch." But it's still a single point-of-connection at the EV6 "switch."
The "stock" IA-64 Itanium2 actually still uses a single point-of-contention as well in its Scalable Node Architecture (SNA). But most companies that implement Itanium2 solutions don't use stock SNA.
Now let's say we look at one of the proprietary Non-Uniform Memory Architecture (NUMA) offerings of previous Xeon and, more today, Itanium (I used Xeon as an example, even if NUMA Xeon implementations are rare): http://www.samag.com/documents/sam0411b/0411b_f3.htm
They solve the memory contention issues, but still _not_ the I/O ones. Worse yet, there are some "processor affinity" deficiencies that could be addressed with a better design beyond just NUMA.
Now let's look at the reference design 4-way AMD 800 as implemented in the HP ProLiant DL585: http://www.samag.com/documents/sam0411b/0411b_f5.htm
Now look at that -- a partial mesh! You've got independent HyperTransport interconnects to I/O _and_ other CPUs, as well as NUMA DDR channels directly on each CPU. So not only can you have "processor affinity" when it comes to programs and data, but you can also have "processor affinity" when it comes to memory mapped I/O too! And with the I/O MMU on-chip (really just an overgrown AGPgart controller from Athlon MP, long story ;-), you've got maximum throughput with minimum context overhead for I/O.
Because servers are all about I/O.
On Wed, 2005-06-15 at 21:17 -0400, Robin Mordasiewicz wrote:
I am still blown away at how short a time it takes to download a dvd iso. This is good because the rest of my white box servers are amd_64 bit.
BTW, is there any reason you'all didn't buy HP DL385 (dual-Opteron) instead? Exact same 2U, redundant power, 6-disc setup as the DL380.
On Wed, 15 Jun 2005, Bryan J. Smith wrote:
On Wed, 2005-06-15 at 21:17 -0400, Robin Mordasiewicz wrote:
I am still blown away at how short a time it takes to download a dvd iso. This is good because the rest of my white box servers are amd_64 bit.
BTW, is there any reason you'all didn't buy HP DL385 (dual-Opteron) instead? Exact same 2U, redundant power, 6-disc setup as the DL380.
I really wish you had not pointed that out to me. I was not aware there was a 385. If I had known.... It's been some years since working with this series, back when they were Compaq, and I liked the series then. My boss bought this 380 on my recommendation.
Oh well, I am sure the 380 will serve its purpose just as well. Almost.
On Wed, 15 Jun 2005, Hilliard, Jay wrote:
Robin Mordasiewicz wrote:
How can I tell if a new xeon is 32 bit or 64 bit?
If it's a new G4 (Generation 4), then it supports EM64T
Well, it's a G4, we drove it off the lot. I see no special markings on the box. Is there something I can look for at bootup to signify 64 bit?
considering those are xeon and not xeon "D"'s they are 32 bit.
Robin Mordasiewicz wrote:
How can I tell if a new xeon is 32 bit or 64 bit?
We ordered a G3 but got a G4 instead.
DL380 G3 (No EM64T) running CentOS-3 i686
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 3.06GHz
stepping : 9
cpu MHz : 3051.915
cache size : 512 KB
physical id : 0
siblings : 2
runqueue : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 6094.84
DL380 G4 (With EM64T) running CentOS-3 i686 (not using 64-bit)
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 1
cpu MHz : 3200.267
cache size : 1024 KB
physical id : 0
siblings : 2
runqueue : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm lm
bogomips : 6383.20
Perhaps the lm flag? Otherwise, the model number (4). cpufeature.h seems to define the lm flag as X86_FEATURE_IA64 but I thought that was EPIC...
John.
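The only difference between the two flag lists above is the trailing `lm` (long mode) flag on the G4. As a minimal sketch of checking for it programmatically, the Python below parses /proc/cpuinfo-style text; the helper names are illustrative, and it assumes the x86 layout shown above with a `flags :` line.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags from the first 'flags' line of
    /proc/cpuinfo-style text; returns an empty set if there is none."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports_long_mode(cpuinfo_text: str) -> bool:
    """True when the 'lm' flag is present, i.e. the CPU can run x86_64."""
    return "lm" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        print("no /proc/cpuinfo on this system")
    else:
        if supports_long_mode(text):
            print("lm flag present: x86_64 capable")
        else:
            print("no lm flag: 32-bit only")
```

This works even when a 32-bit kernel is running, as on the G4 above, because the flag reports what the CPU supports rather than what the installed OS uses.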
William Warren wrote:
considering those are xeon and not xeon "D"'s they are 32 bit.
Robin Mordasiewicz wrote:
How can I tell if a new xeon is 32 bit or 64 bit?