[CentOS] Re: BUG in fs/bio.c:99

Tue Oct 24 13:00:42 UTC 2006
J.J. Garcia <stigmatedbrain at gmail.com>

On Mon, 23-10-2006 at 16:56 +0200, J.J. Garcia wrote:
> On Mon, 23-10-2006 at 17:50 +0400, Kirill Korotaev wrote:
> > J.J. Garcia,
> > 
> > the bug you face looks exactly like ours.
> > I thought it was memory corruption, since %eax is 8 while it should be 0.
> > (BTW, can you run memtest to make sure your memory is really OK?
> > http://wiki.openvz.org/Hardware_testing ),
> > but the fact that it is always 8 in both your case and ours makes me believe
> > it is something else...
> > 
> > If I provide some debugging patch for you, will you be able to apply it to your
> > kernel, rebuild it and test the issue?
> > 
> > Your help is very much appreciated.
> > 
> > Thanks,
> > Kirill
> > 
> 
> Sure, I'll do my best. If you provide the patch I can test it on the
> current host. It's not a very critical host on the network, and I think
> the bug is important enough to take it down for a while.
> 
> I've started by installing memtest86+ on the affected host with the
> following steps, for your info:
> 
> <...>
> 
> =============================================================================
>  Package                 Arch       Version          Repository        Size
> =============================================================================
> Installing:
>  memtest86+              i386       1.26-2           base              53 k
> 
> Transaction Summary
> =============================================================================
> Install      1 Package(s)
> Update       0 Package(s)
> Remove       0 Package(s)
> Total download size: 53 k
> Is this ok [y/N]: y
> Downloading Packages:
> (1/1): memtest86+-1.26-2. 100% |=========================|  53 kB    00:00
> Running Transaction Test
> Finished Transaction Test
> Transaction Test Succeeded
> Running Transaction
>   Installing: memtest86+                   ######################### [1/1]
> 
> Installed: memtest86+.i386 0:1.26-2
> Complete!
> [root@fattybox ~]# rpm -ql memtest86+
> /boot/memtest86+-1.26
> /sbin/new-memtest-pkg
> /usr/sbin/memtest-setup
> /usr/share/doc/memtest86+-1.26
> /usr/share/doc/memtest86+-1.26/README
> 
> [root@fattybox ~]# rpm -qi memtest86+
> Name        : memtest86+                   Relocations: (not relocatable)
> Version     : 1.26                              Vendor: CentOS
> Release     : 2                             Build Date: lun 21 feb 2005 20:35:44 CET
> Install Date: lun 23 oct 2006 16:25:57 CEST      Build Host: bhrama.build.karan.org
> Group       : System Environment/Base       Source RPM: memtest86+-1.26-2.src.rpm
> Size        : 123633                           License: GPL
> Signature   : DSA/SHA1, sáb 26 feb 2005 21:59:06 CET, Key ID a53d0bab443e1821
> Packager    : Karanbir Singh <kbsingh at centos.org>
> URL         : http://www.memtest.org
> Summary     : Stand-alone memory tester for x86 and x86-64 computers
> Description :
> Memtest86+ is a thorough stand-alone memory test for x86 and x86-64
> architecture computers. BIOS based memory tests are only a quick
> check and often miss many of the failures that are detected by
> Memtest86+.
> 
> Run 'memtest-setup' to add to your GRUB or lilo boot menu.
> [root@fattybox ~]#
> 
> Proceeding with the boot menu setup:
> 
> [root@fattybox ~]# memtest-setup
> Setup complete.
> 
> This added the following entry to /etc/grub.conf, which I'll use to
> launch the tests:
> 
> title Memtest86+ (1.26)
>         root (hd0,0)
>         kernel /memtest86+-1.26 ro root=/dev/VolGroup00/LogVol00 ACPI=off vga=0x307 selinux=0
> 
> 
> From here on, memtest is running with the default config. Feel free to
> tell me to change the default parameters if you are looking for
> something in particular. I'll leave it running for 48 hours to look for
> anything strange in memory.
> 
> I should note that this host uses shared memory for graphics; in other
> words, there is no graphics card, only the one embedded on the
> motherboard. It's a DFI CM33T3-100 board (CM33-TL) with an up-to-date
> BIOS according to DFI, running an Intel Celeron. I can't vouch for the
> Kingston memory modules... but 22.0.2 worked fine with this hardware for
> a long time (months, even a year of uptime under heavy load)...
> 
> We'll keep in touch,
> 
> Jose.
> 
> 
> 

Hi again,

After almost 24 hours of running memtest86+ on the affected host, I think
it has found a memory corruption issue, as you suspected. The result can
be seen at http://img206.imageshack.us/my.php?image=dscn2284xj4.jpg

I'm trying to solve it with a new PC133 memory module. At the same time,
I may fit an old video card so that the graphics embedded on the
motherboard no longer share system memory, to simplify things.

I'll then check ASAP whether the EAX register still holds the noted
value after the panic, if I can reproduce it.
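
For that check, a minimal sketch of how the register values could be pulled
out of a saved oops text. It assumes the usual i386 register dump layout
("eax: 00000008   ebx: ..."); the function name parse_registers and the
sample text are hypothetical and are not taken from the real panic.

```python
import re

# Matches register dumps such as "eax: 00000008" (i386 oops layout, assumed).
REG_RE = re.compile(r"\b(e[a-d]x|e[sd]i|e[bs]p):\s*([0-9a-fA-F]{8})\b")


def parse_registers(oops_text):
    """Return a dict mapping each register name found to its integer value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(oops_text)}


# Hypothetical sample for illustration only, not the real oops output.
SAMPLE = """\
eax: 00000008   ebx: 00000000   ecx: 00000001   edx: 00000002
esi: 00000003   edi: 00000004   ebp: 00000005   esp: 00000006
"""

regs = parse_registers(SAMPLE)
print("eax = %d" % regs["eax"])
```

If eax still comes out as 8 with the new memory module, the bad RAM is
probably not the whole story.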

Sorry for the inconvenience. What is strange is that there was no sign of
memory corruption while 22.0.2 was in use for months. I was really
surprised this morning!

Jose.