[CentOS] CentOS 4.0 -> 4.1 update failing

Tue Jun 21 18:46:35 UTC 2005
Johnny Hughes <mailing-lists at hughesjr.com>

On Tue, 2005-06-21 at 12:40 -0500, Bryan J. Smith  wrote:
> From: Dan Pritts <danno at internet2.edu>
> > I read a web page that suggested that in some cases software built for
> > "i686" would not in fact work on Via C3 processors (this is near & dear
> > to my heart since i just bought a motherboard based on one).  The C3
> > is definitely a modern platform - it's not fast by modern standards but it
> > works well enough for many applications and its heat/power requirements
> > are wonderful (circa 10 watts).  
> 
> It all depends how you define "modern."
> 
> The IDT-Centaur team found that it was very easy to build a chip that
> dedicated more transistors to a larger cache rather than get into a lot of pipeline
> optimizations, out-of-order execution, etc...  They were able to build the
> WinChip in 18 months -- instead of 36+ for a traditional design.  The
> first design also ran on standard 3.3V CMOS and was quite a bit more
> tolerant of variances.
> 
> Cyrix did similar with the design of the M2/Geode.  The C3 is ViA's
> evolution of the WinChip-M2 line, in a Socket-370 package for Intel
> GTL+ under official license from Intel.  As much as non-x86 platforms
> promise more performance for lower power, it's hard to best some of
> the low-power designs of the x86 world at their economies of scale.
> 
> > The discussion suggested that the "cmov" instruction was the problem.
> 
> Yes, the "cmov" instruction is an optional i686 instruction in Intel's
> own documentation.  ViA (possibly both Cyrix and IDT-Centaur too)
> probably thought it was either a waste of transistors or, more likely,
> a timing nightmare to integrate into the ALU/control.  In fact, if
> it was considered optional, it sounds like an engineer realized that
> it could have design impact when the i686 ISA was written (probably
> in advance of the actual release of the Pentium Pro, or in consideration
> of Intel's own, future i686 designs).
> 
> <flammage=on>
> Intel continues to be the poster child for 1970s CS thought when it
> comes to overbloat of an already CISC pig.  Ironically, at one time,
> they partnered with Digital on the Alpha chip, which is definitely the
> most over-anal of RISC architectures.  I.e., if it could hold up timing,
> it didn't go in the AXP ISA -- and they were extremely anal on this
> to the point of not even including an 8 or 16-bit LOAD/STOR (although
> Digital finally caved on this in the 21164 -- but all other 8/16-bit data
> operations were always left out).
> </flammage>
> 
> The GCC i686 target, unfortunately, assumes it always exists.
> I can semi-understand the logic, because for a run-time to always test
> would add a number of bytes to every single program.
> And the software workaround takes over 100x more clock cycles.
> 
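A quick way to see whether a given box is affected is to look for "cmov" in
the "flags" line of /proc/cpuinfo. The following is only a minimal sketch in
Python: the helper names (cpu_flags, has_cmov) and the two sample strings are
made up for illustration and are not taken from any real C3 or Pentium Pro.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_cmov(cpuinfo_text):
    """True if the CPU advertises the cmov instruction."""
    return "cmov" in cpu_flags(cpuinfo_text)


# Hypothetical sample inputs, not captured from real hardware.
WITH_CMOV = "model name : example A\nflags : fpu tsc msr cx8 cmov mmx\n"
WITHOUT_CMOV = "model name : example B\nflags : fpu de tsc msr cx8 mmx 3dnow\n"

if __name__ == "__main__":
    print("example A has cmov:", has_cmov(WITH_CMOV))
    print("example B has cmov:", has_cmov(WITHOUT_CMOV))
```

On a live Linux system you would pass in open("/proc/cpuinfo").read()
instead of the sample strings. If has_cmov() comes back False, code built
for the GCC i686 target should be assumed unsafe on that CPU.
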
> > I, of course, had similar upgrade problems to the original poster.  
> > I don't really care about the performance optimizations from "i386"
> > to "i686" as long as i'll continue to have something that works.
> 
> i486 was liberally licensed by Intel under reasonable terms after a US
> court said numbers could not be trademarked (Intel was hoping to
> make money on trademark licensing in conjunction).
> 
> i686/GTL+ was only licensed by ViA, although I'll assume the lack of
> a cmov instruction in Cyrix/IDT-Centaur designs pre-dates that license.
> 
> > One additional interesting data point.  The CentOS 4.0 installer
> > gave me the "i586" glibc and the "i686" kernel.  I would hope that
> > this would be consistent.
> 
> That's not what you want.  You want to drop down to an i486 instead
> of running i586.
> 
> Here's a general guide:  
> 
> --march=i486
> 
> Runs well on a non-superscalar i486, obviously.
> Runs fairly well on a superscalar ix86 that does i486, although
> optimizations for the specific architecture should be used.
> Portions may run like crap on i586 (true Pentium/MMX), especially
> ALU control.
> 
> --march=i486 --mtune=i686
> 
> Still runs well on a non-superscalar i486 because superscalar
> optimizations don't affect it.
> Runs near-optimal on a superscalar ix86 that has at least the
> same 7-issue core of the Pentium Pro-P3 (2+2+3 ALU+FPU+control).
> Even more likely to run like crap on i586 because it assumes the ALU
> is 2-issue and well-designed, and i586's just ain't anywhere near 'dat!
> 
> --march=i486 --mtune=i586 (or --march=i586)
> 
> Improves performance of many operations significantly on i586.
> On i486 (only when built with --march=i486; --march=i586 code will
> not run there) and i686, will use the chip rather inefficiently.
> 
> E.g., (and this is just 1 example) generated machine code will
> ripple integers through what it assumes is a pipelined FPU.
> On an i486 or anything but an Athlon clone, this will not only tie up
> the FPU, but significantly and artificially delay integer loads.
> On a true Intel i686 or AMD Athlon, it will leave the 2 and 3 ALU
> pipes, respectively, unused.
> Worse yet, the original Nx586 (through the Athlon) is 3x faster at
> ALU loads than i586, and it is a clear "de-optimization."
> 
> And that's just 1 example.  ;->
> 
> --march=i686
> 
> Offers little performance gain over --march=i486 --mtune=i686,
> while not running at all on an i486 or i586.
> 
> --march=i486|i686 --mtune=pentium4
> 
> Offers not only improved performance on the Pentium 4, but
> all Athlons as well.
> 
> --march=i486|i686 --mtune=athlon
> 
> This option has been widely debated.  It really _kills_ Intel
> performance because Intel i686's 2+2 (ALU+FPU) is not going to
> handle optimizations for AMD Athlon's 3+3 (ALU+FPU) and is
> going to have lots of stalls (especially on Pentium 4, don't get
> me started ;-).  At the same time, the Athlon benefit is
> debatable for general applications -- although engineering and
> scientific _will_ see a good boost (40% is not uncommon).
> 
> The reason why is simple.  Intel's 1 complex _or_ 2 ADD FPU
> only allows 1 MULT at full 64-bit precision (and not SSE "lossy
> math") whereas AMD's 2 complex _and_ 1 ADD/MULT allows
> 3 MULT at full 64-bit precision.
> 
> Plus, today, you're typically going to be running Athlon64/Opteron,
> and you get such "optimizations for free" on x86-64 targets.
> 
> SIDE NOTE:
> -O3 ... _never_ run -O3 with --march=pentium3, pentium4 or
> athlon.  You're asking for "lossy math" in the case of the former,
> and unstable optimizations in the case of the latter.  It should _only_
> be used when you developed the application and know what you're
> doing.
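
The "will it run at all" part of the guide quoted above can be written down
as a small table. This is just a sketch in Python under stated assumptions:
the function name can_run and the three-level list are made up for
illustration, and treating a Via C3 as "i686 minus cmov" is a simplification
of what the thread describes. The tuning option is deliberately not an input,
since it changes scheduling and not which instructions are emitted.

```python
# Instruction-set levels in increasing order, as used in the guide.
LEVELS = ["i486", "i586", "i686"]


def can_run(march, cpu_level, cpu_has_cmov=True):
    """Return True if code built for `march` can execute on the given CPU.

    march        -- the architecture level the code was compiled for
    cpu_level    -- the highest level the CPU implements
    cpu_has_cmov -- False for i686-class chips that lack cmov
    """
    if LEVELS.index(march) > LEVELS.index(cpu_level):
        return False
    if march == "i686" and not cpu_has_cmov:
        # GCC's i686 target assumes cmov is always present.
        return False
    return True


if __name__ == "__main__":
    for march in LEVELS:
        print(march, "on cmov-less i686-class CPU:",
              can_run(march, "i686", cpu_has_cmov=False))
```

Whether the result runs *well* is a separate question, and that is what the
tuning discussion above is about.
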

All of that is probably true ... but the optimizations are already set
by default by the config.guess and autoconf and automake for almost all
RedHat SRPMS ... and CentOS doesn't change those, unless absolutely
necessary.

Therefore, for almost all packages on CentOS-3 they are
--march=i386 --mtune=i686 ... and for most CentOS-4 packages they are
--march=i386 --mtune=pentium4 ... being that the default target is i386.
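
For reference, the defaults described above can be captured in a small
lookup. This is only an illustrative sketch in Python: the DEFAULTS table and
the default_flags helper are hypothetical names, the values are just the ones
stated in this message, and individual packages may differ ("almost all" and
"most", not "all"). The flags are written with a single dash, which is how
GCC documents them.

```python
# Default (march, mtune) pairs per release, as stated in this message.
DEFAULTS = {
    "CentOS-3": ("i386", "i686"),
    "CentOS-4": ("i386", "pentium4"),
}


def default_flags(release):
    """Return the architecture/tuning flag string for a release."""
    march, mtune = DEFAULTS[release]
    return "-march=%s -mtune=%s" % (march, mtune)


if __name__ == "__main__":
    for release in sorted(DEFAULTS):
        print(release, "->", default_flags(release))
```

Since both releases target i386 as the baseline instruction set, the tuning
value only affects how well the code is scheduled, not whether it runs.
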