[CentOS] Tyan K8SRE troubles with CentOS 4.4 i386

Wed Dec 13 21:22:49 UTC 2006
John R Pierce <pierce at hogranch.com>

Dan Halbert wrote:
> We have been seeing failures with CentOS 4.4 i386 (not x86_64) running 
> compute-intensive programs on Tyan K8SRE (S2891) motherboards, 
> running Opteron 265's. This motherboard is used in the Tyan barebones 
> box GT24 (B2881). We have these boards populated with 8GB of RAM, 
> consisting of mixed 2GB and 1GB sticks.
>
> The symptom is that CPU-bound programs (may or may not be related to 
> floating point) fail randomly and intermittently, with wrong answers 
> or segfaults. Running several in parallel seems to make the failures 
> more likely. We have not seen any kernel crashes. It is not hard to 
> reproduce the problem with some internal programs we have; it takes 
> only a few minutes.

as an FPU test, try this from a user account:

    mkdir mprime
    cd mprime
    wget ftp://mersenne.org/gimps/mprime2414.tar.gz   
    tar xzvf mprime2414.tar.gz
    ./mprime -A0 -t &
    ./mprime -A1 -t &

(if you have two dual core opterons, do this twice more with -A2 and -A3)
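
to avoid typing one line per core, the launch commands can be generated in a loop.  this is a minimal sketch, assuming a Linux box that exposes /proc/cpuinfo.  the function name mprime_cmds and the log file names are made up for illustration; only -A<n> and -t come from the commands above.  it only prints the commands (a dry run), so nothing starts until you choose to run them:

```shell
#!/bin/sh
# Print one mprime torture-test command per core (dry run only).
# -A<n> and -t are the flags used in the commands above;
# the mprime-A<n>.log file names are hypothetical.
mprime_cmds() {
    ncores=$1
    i=0
    while [ "$i" -lt "$ncores" ]; do
        echo "./mprime -A$i -t > mprime-A$i.log 2>&1 &"
        i=$((i + 1))
    done
}

# Count logical CPUs from /proc/cpuinfo (Linux-specific); fall back to 1.
ncores=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null)
[ "${ncores:-0}" -gt 0 ] || ncores=1

mprime_cmds "$ncores"
```

if the printed lines look right, they can be pasted into a shell inside the mprime directory.  sending each instance's output to its own file is a design choice here, so that errors from an overnight run are not lost when the terminal scrolls.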

this will HAMMER the cpu/cache/memory bus with intensive FPU 
operations.  let it run all night on an otherwise idle box, and note any 
errors printed to the terminal.  each instance will use about 16MB of 
ram, and will be executing FPU/SSE operations at near peak speed.  it 
auto-nices itself to minimize the impact on the rest of the system.  
your CPUs will run hotter than they've ever run before :)


hey, I thought mixing dimm sizes was verboten on opterons?