On Saturday I finally upgraded a machine from CentOS 4.3 (I think) to 4.5 via yum. It seemed to go fine. However, during the following night /home got mounted read-only because of an EXT3-fs error, and the same happened the next night. Today I also saw the first-ever kernel crash on this machine. The machine is about three years old, went into production two years ago with CentOS 4.1 or so, and has been rock stable since then: no fs errors, no kernel crashes, no other "weird" occurrences. Since the problems started right after upgrading to a new kernel, I suspect a bug in the kernel (or some module) rather than a hardware problem. No RAID, no LVM, just a few partitions on an IDE disk. I haven't filed it as a bug yet; I want to gather some more information or get some help first. Here are some details.
The kernel was updated from 2.6.9-34.0.2.EL to 2.6.9-55.0.12.EL. There is not a single package update missing now.
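For reference, this is roughly how I checked that (stock yum/rpm on CentOS 4):

  yum check-update    # lists any still-pending updates
  rpm -q kernel       # shows all installed kernel versions
  uname -r            # shows the running kernel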
Dec 9 04:30:35 nx10 kernel: EXT3-fs error (device hda3): htree_dirblock_to_tree: bad entry in directory #1330023: rec_len % 4 != 0 - offset=10264, inode=808542775, rec_len=13621, name_len=100
Dec 9 04:30:35 nx10 kernel: Aborting journal on device hda3.
Dec 9 04:30:35 nx10 kernel: ext3_abort called.
Dec 9 04:30:35 nx10 kernel: EXT3-fs error (device hda3): ext3_journal_start_sb: Detected aborted journal
Dec 9 04:30:35 nx10 kernel: Remounting filesystem read-only
The second error, tonight, happened about 5 minutes earlier, with exactly the same directory inode. A Google search for "centos rec_len 4 0" shows this error is very scarce (I also tried it with "fedora" and got a few more hits). It seems to be related to heavy disk I/O, but only under certain (hardware?) circumstances, and may be a bug introduced in some Fedora kernel that crept into RHEL/CentOS 4.4/4.5. Once this happens, that filesystem (in my case /home) is read-only and the machine just hangs when one tries to shut down (probably when unmounting) or remount it read-only (for a filesystem check). After a hard reset the automatic fsck in dmesg lists only a few orphan inode cleanups. Also, I found that dmesg shows me the iptables logging output (which goes to kern.=debug) until the problem is fixed with a reset.

Can I safely use debugfs on that system while the filesystem is mounted? I'm not familiar with it and just stumbled over a mention of it. I tried it here on a machine with a mounted device and there was no problem. That other machine is in a remote data center, so options are a bit limited.
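What I have in mind is just read-only poking, along these lines (as I understand it, debugfs opens the device read-only unless you pass -w, so this should be safe on a mounted filesystem):

  debugfs /dev/hda3
  debugfs:  stat <1330023>    # the directory inode from the error message
  debugfs:  quit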
The kernel crash from today starts like this:

Dec 10 10:30:01 nx10 kernel: Unable to handle kernel paging request at virtual address 8f38df23
Dec 10 10:30:01 nx10 kernel: printing eip:
Dec 10 10:30:01 nx10 kernel: c019190b
Dec 10 10:30:01 nx10 kernel: *pde = 00000000
Dec 10 10:30:01 nx10 kernel: Oops: 0000 [#1]
Dec 10 10:30:01 nx10 kernel: Modules linked in: ipt_REJECT ipt_limit ipt_state ipt_LOG iptable_filter ip_tables ip_conntrack_ftp ip_conntrack md5 ipv6 autofs4 i2c_dev i2c_core sunrpc dm_mirror dm_mod button battery ac 8139too mii ext3 jbd ata_piix libata sd_mod scsi_mod
Dec 10 10:30:01 nx10 kernel: CPU: 0
Dec 10 10:30:01 nx10 kernel: EIP: 0060:[<c019190b>] Not tainted VLI
Dec 10 10:30:01 nx10 kernel: EFLAGS: 00010282 (2.6.9-55.0.12.EL)
Dec 10 10:30:01 nx10 kernel: EIP is at seq_escape+0x21/0xaa
Dec 10 10:30:01 nx10 kernel: eax: 8f38df23 ebx: c0370260 ecx: d35a9151 edx: d35aa000
Dec 10 10:30:01 nx10 kernel: esi: c518c200 edi: c518c200 ebp: c032f9d9 esp: c63d9f28
Dec 10 10:30:01 nx10 kernel: ds: 007b es: 007b ss: 0068
Dec 10 10:30:01 nx10 kernel: Process mv (pid: 16585, threadinfo=c63d9000 task=cee5a1b0)
Dec 10 10:30:01 nx10 kernel: Stack: d35aa000 8f38df23 c0370260 c518c200 dfe08982 00000000 c018e0e3 c03702c0
Dec 10 10:30:01 nx10 kernel: c518c200 00000000 dfe08982 c019157f 00000151 00000000 00000400 b7fd5000
Dec 10 10:30:01 nx10 kernel: 0000000c 00000000 0000000b 00000000 c0371300 cea00b80 00000400 c63d9fac
Dec 10 10:30:01 nx10 kernel: Call Trace:
Dec 10 10:30:01 nx10 kernel: [<c018e0e3>] show_vfsmnt+0x28/0xf5
Dec 10 10:30:01 nx10 kernel: [<c019157f>] seq_read+0x1c3/0x2bd
Dec 10 10:30:01 nx10 kernel: [<c016c91b>] vfs_read+0xb6/0xe2
Dec 10 10:30:01 nx10 kernel: [<c016cb30>] sys_read+0x3c/0x62
Dec 10 10:30:01 nx10 kernel: [<c031b777>] syscall_call+0x7/0xb
I wonder if I can go back to 2.6.9-34.0.2.EL. Should I expect problems with other updated packages?
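If I try it, I'd install the old kernel alongside the current one rather than downgrading in place, something like this (the exact package filename is from memory):

  rpm -ivh kernel-2.6.9-34.0.2.EL.i686.rpm   # -i installs, so both kernels stay bootable
  # then make the older entry the default in /boot/grub/grub.conf and reboot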
Kai
Kai:
Dec 9 04:30:35 nx10 kernel: EXT3-fs error (device hda3): htree_dirblock_to_tree: bad entry in directory #1330023: rec_len % 4 != 0 - offset=10264, inode=808542775, rec_len=13621, name_len=100
Dec 9 04:30:35 nx10 kernel: Aborting journal on device hda3.
Dec 9 04:30:35 nx10 kernel: ext3_abort called.
Dec 9 04:30:35 nx10 kernel: EXT3-fs error (device hda3): ext3_journal_start_sb: Detected aborted journal
Dec 9 04:30:35 nx10 kernel: Remounting filesystem read-only
Updating to 4.5 was just a coincidence. I believe you have a disk that's going bad. I've seen this error three times and it has always been a bad disk. Back up what you can and replace the disk.
Alfred
Alfred von Campe wrote:
Kai:
Dec 9 04:30:35 nx10 kernel: EXT3-fs error (device hda3): htree_dirblock_to_tree: bad entry in directory #1330023: rec_len % 4 != 0 - offset=10264, inode=808542775, rec_len=13621, name_len=100
[...]
Updating to 4.5 was just a coincidence. I believe you have a disk that's going bad. I've seen this error three times and it has always been a bad disk. Back up what you can and replace the disk.
You could confirm this with SMART (after backing up your data). I can't remember the options, so check man smartctl, but it should be something like:

  smartctl -t short /dev/hda      (perform short test)
  smartctl -l selftest /dev/hda   (check SMART selftest log)
cheers
Nicolas Thierry-Mieg wrote on Mon, 10 Dec 2007 15:25:04 +0100:
smartctl -t short /dev/hda      (perform short test)
smartctl -l selftest /dev/hda   (check SMART selftest log)
Nicolas, thanks for the suggestion. The short test completed without any error, and all SMART values shown by smartctl -a are way above the threshold, most in the 250+ area.
What I'm wondering about is the LifeTime(hours) given in the selftest log. Is this the remaining lifetime? That's still good for two years.
Kai
On Mon, 10 Dec 2007, Kai Schaetzl wrote:
Nicolas Thierry-Mieg wrote on Mon, 10 Dec 2007 15:25:04 +0100:
smartctl -t short /dev/hda      (perform short test)
smartctl -l selftest /dev/hda   (check SMART selftest log)
Nicolas, thanks for the suggestion. The short test completed without any error, and all SMART values shown by smartctl -a are way above the threshold, most in the 250+ area.
What I'm wondering about is the LifeTime(hours) given in the selftest log. Is this the remaining lifetime? That's still good for two years.
That would be the number of hours the drive has _already_ been running. IOW, it is telling you that the drive has been on for about two years.
The entry that usually tells you if you are developing problems is Reallocated_Sector_Ct. If that is in the hundreds (or even in the multiple dozens), you are probably looking at a drive that is going to fail in the near future.
Offline_Uncorrectable and Current_Pending_Sector are other ones you don't want to see values much above '0' on.
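A quick way to eyeball just those attributes (assuming a reasonably current smartmontools):

  smartctl -A /dev/hda | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'

It's the RAW_VALUE column that should stay at or near 0.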
Benjamin Franz wrote on Mon, 10 Dec 2007 07:21:35 -0800 (PST):
That would be the number of hours the drive has _already_ been running. IOW, it is telling you that the drive has been on for about two years.
Ah, so the opposite of what I thought? One shouldn't confuse the value from the selftest log with the value from smartctl -a. There's also a Power_On_Hours value in smartctl -a which I figured would be the hours it has been on. As I see now, its RAW value is increasing, but much too fast for it to be counting hours. Ah, confirmed: I ran a second short test about one hour later and the value in the selftest log had increased by one, so it is indeed the power-on time the drive has already accumulated. Thanks.
The entry that usually tells you if you are developing problems is Reallocated_Sector_Ct. If that is in the hundreds (or even in the multiple dozens), you are probably looking at a drive that is going to fail in the near future.
Offline_Uncorrectable and Current_Pending_Sector are other ones you don't want to see values much above '0' on.
These are all at 253 (raw incidents = 0).
I understand that a drive that looks really good in SMART could still fail the next day, but at least there is no indication of trouble from this side.
Kai
Kai Schaetzl wrote on Mon, 10 Dec 2007 13:23:26 +0100:
Dec 9 04:30:35 nx10 kernel: EXT3-fs error (device hda3): htree_dirblock_to_tree: bad entry in directory #1330023: rec_len % 4 != 0 - offset=10264, inode=808542775, rec_len=13621, name_len=100
I checked the filesystem in the evening and it's clean. I really doubt there's anything wrong with the disk. What makes me wonder is that high inode number: according to df -i that partition has 3145728 inodes in total, and debugfs stat on inode 808542775 tells me that one doesn't exist. I don't know how I could access that directory #1330023, but I assume it doesn't exist either, just like inode 808542775.
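For completeness, this is what I ran, plus what I could try next (debugfs stays read-only without -w; <N> addresses an inode by number, and ncheck maps an inode number back to path names):

  debugfs /dev/hda3
  debugfs:  stat <808542775>    # reports that this inode doesn't exist
  debugfs:  stat <1330023>      # same check for the directory inode
  debugfs:  ncheck 1330023      # would list the path name(s), if any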
Kai
Kai:
I checked the filesystem in the evening and it's clean. I really doubt there's anything wrong with the disk.
That's what I thought too. I had the same error you had, and initially the disk seemed to be OK. It would run for weeks before the error showed up again. But after I replaced the disk, the problem never occurred again. The next time I got this error (on a different system), the drive also seemed fine otherwise. I've learned my lesson. When I see this error I just replace the disk.
If you have a spare disk, I would give it a try. If your errors do not go away, then you can suspect something in the CentOS 4.5 update. But that update has been out for a while and I suspect it's running on thousands of systems without this problem.
Alfred
Alfred von Campe wrote on Tue, 11 Dec 2007 10:55:45 -0500:
If you have a spare disk, I would give it a try.
Not so easy. This is one of the few machines I have that is simply rented in a datacenter; I'd have to ask them to image the disk onto a replacement and pay for the service.
When it happened again tonight, this time I unmounted the device before doing anything else. That worked and kept the machine online. There is then indeed a corrupted directory entry, which e2fsck manages to repair easily.
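For the archives, the sequence was nothing exotic (it only works if nothing is holding /home open):

  umount /home
  e2fsck /dev/hda3    # found and repaired the corrupt directory entry
  mount /home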
As this is always the same inode number and always happens at the same time (I suspect the updatedb run, although that should not make any changes to that device), I rather suspect a bad block (or a bug). Once I know it's really updatedb doing this, I plan to run badblocks and see if it finds one.
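If it comes to that, I'd start with a read-only scan so nothing gets written (best with the filesystem unmounted or mounted read-only):

  badblocks -sv /dev/hda3    # read-only test, -s shows progress
  # or have e2fsck run the badblocks scan and record anything it finds:
  e2fsck -fcv /dev/hda3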
Thanks for sharing your experience, I'll keep it in mind.
Kai