Hello all,
A couple of weeks ago I installed CentOS 4.2 on a very old server machine with an MSI server board based on the Intel 440GX chipset, two 500 MHz Xeons and 1 GB of RAM. An AMI MegaRAID 467 is installed as the storage controller, which causes some trouble during installation, because the stock CentOS 4 install and production kernels don't have the older megaraid.ko module compiled, but there are plenty of not very difficult ways to work around that. After a clean and relatively fast install, I first up2date-d it to CentOS 4.4, did some basic initial reconfiguration, turned it off and left it lying around doing nothing.
Today its time has come, and I turned it on. While booting I saw a "Segmentation fault" message just after the line about "Starting up LVM2". That confused me a bit. I logged in as root, typed vgdisplay, and got normal output followed by a "Segmentation fault" message. lvdisplay behaved the same way: a normal listing of all LVM logical volumes and a "Segmentation fault" at the bottom.
The next step was obvious: # rpm -Va
Huh, here we are. There's a bunch of RPMs whose binary files have changed since they were installed. Looks like the work of a virus, I thought. But wait, that's very strange! This server has been lying around turned off and doing nothing since the moment I finished installing the system. There was NO possible time for a virus to infect it. Well, in any case, I took my special LiveCD with ClamAV on it and a RAM disk for freshclam to store updated virus databases, booted it, mounted the possibly infected system and checked it with clamscan. NO viruses were found.
Well, I thought this might be caused by a faulty SCSI disk in the array that distorts the data being written to it instead of informing the host that there's a bad block. OK, that's easy to check. Let me go to single-user mode, reinstall the damaged RPMs using rpm -Uvh --replacepkgs, do a couple of 'sync's, remount all filesystems with -o remount,ro,sync, and check the installed RPMs with rpm -Va.

Went ahead, did all of the above, found nothing. After the reinstall all files were correct again, and the LVM tools went back to behaving correctly, without the "Segmentation fault". Hmm... that's strange, I thought. Well, at least at the moment I've got a correctly functioning system without viruses.

Now it's time to reboot and check how it performs. I'm going to do unattended reboots in the future, so it should reboot seamlessly without asking extra questions.

# shutdown -r now

The reboot went smoothly, but just as LVM2 was initializing, I got the "Segmentation fault" message again! Damn! What's wrong?! Logged in, ran rpm -Va - gotcha! Again, the device-mapper RPM was broken. Well, let's reinstall it again, sync, remount root read-only, and check with rpm -V device-mapper. Did that, and all seems to be OK: no output from rpm -V means the files are intact.

Rebooted again and ran:

[root@omega MegaMgr5.20]# rpm -V device-mapper
..5..... /lib/libdevmapper.so.1.02
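For readers not used to reading rpm -V output, here is a small Python sketch that decodes a verify line such as the "..5....." one above. It assumes the classic 8-character flag string in the order SM5DLUGT and an optional one-letter file-type marker before the path. The names FLAG_MEANINGS and parse_verify_line are made up for illustration and are not part of rpm, and the script needs a Python 3 interpreter, which a stock CentOS 4 box does not ship with.

```python
# Meaning of each position in the 8-character verify string (order: SM5DLUGT).
FLAG_MEANINGS = [
    ("S", "file size differs"),
    ("M", "mode (permissions or file type) differs"),
    ("5", "MD5 digest differs"),
    ("D", "device major/minor number differs"),
    ("L", "symlink target differs"),
    ("U", "owner differs"),
    ("G", "group differs"),
    ("T", "modification time differs"),
]

# One-letter markers rpm may print between the flags and the path
# (config, doc, ghost, license, readme).
FILE_MARKERS = "cdglr"


def parse_verify_line(line):
    """Split one `rpm -V` output line into (path, list of failed checks)."""
    flags, rest = line.split(None, 1)
    rest = rest.strip()
    # Drop an optional file-type marker such as "c" in "S.5....T c /etc/foo".
    if len(rest) > 2 and rest[0] in FILE_MARKERS and rest[1] == " ":
        rest = rest[2:].lstrip()
    if flags == "missing":
        return rest, ["file is missing"]
    failed = []
    for (letter, meaning), char in zip(FLAG_MEANINGS, flags):
        if char == letter:
            failed.append(meaning)
        elif char == "?":
            failed.append("could not check: " + meaning)
    return rest, failed


if __name__ == "__main__":
    path, failed = parse_verify_line("..5..... /lib/libdevmapper.so.1.02")
    print(path, failed)
```

Read this way, the line says that only the content digest of /lib/libdevmapper.so.1.02 changed, while its size, mode, ownership and modification time still match the RPM database.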
That's it. After each and every reboot this file ends up corrupted. It looks like the problem is neither a faulty HDD nor a faulty RAID controller. Most likely something corrupts this file during the shutdown process or during the boot process. I haven't got enough time today to investigate more deeply, so I'll continue tomorrow and post the results here, if any.
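One possible way to narrow down whether the damage happens at shutdown or at boot is to record a digest of the file right before shutting down and compare it from the LiveCD before the installed system boots again. The following Python 3 sketch is only an idea under that assumption, not something I have run on this box. The function names and the manifest file are hypothetical, and the manifest should live somewhere other than the suspect array (a floppy or USB stick, for example). SHA-256 is used here instead of rpm's MD5 simply because any digest will reveal a change.

```python
import hashlib
import json


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_snapshot(paths, manifest_path):
    """Record the digests of the given files in a JSON manifest."""
    snapshot = {path: file_digest(path) for path in paths}
    with open(manifest_path, "w") as handle:
        json.dump(snapshot, handle, indent=2)
    return snapshot


def changed_files(manifest_path, root=""):
    """List files whose current digest differs from the manifest.

    `root` is prepended to every recorded path, so the same manifest can be
    checked from a rescue system where the disk is mounted elsewhere.
    """
    with open(manifest_path) as handle:
        snapshot = json.load(handle)
    return sorted(
        path for path, old in snapshot.items()
        if file_digest(root + path) != old
    )
```

The intended use would be to call save_snapshot(["/lib/libdevmapper.so.1.02"], manifest) just before shutdown, then boot the LiveCD, mount the root filesystem read-only and call changed_files(manifest, root=mount_point). If the file already differs at that point, the shutdown path is the suspect; if it is still intact, the boot path is.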