[CentOS] ext3 errors (md device related?)

Ross S. W. Walker rwalker at medallion.com
Fri Mar 14 20:31:32 UTC 2008


Les Mikesell wrote:
> Nicolas KOWALSKI wrote:
> > Les Mikesell <lesmikesell at gmail.com> writes:
> > 
> >> 'fsck -y' seems to fix it up, but it keeps happening. Is this likely
> >> to be leftover cruft from the hardware issues or are there problems
> >> in ext3/raid1/sata drivers? The way backuppc stores data with
> >> millions of hardlinks in the archive it isn't really practical to
> >> copy it off, reformat, and start over.
> > 
> > Maybe a memory problem:
> > 
> > 
> > http://thread.gmane.org/gmane.comp.file-systems.ext3.user/3457/focus=3459
> 
> Back to this problem again.  I did a new mkfs.ext3 and ran more than a 
> week before hitting this again:
> 
> Mar 14 04:12:29 linbackup1 kernel: md3: rw=0, want=14439505280, limit=1465143808
> Mar 14 04:12:29 linbackup1 kernel: EXT3-fs error (device md3): ext3_readdir: directory #34079247 contains a hole at offset 0
> Mar 14 04:12:29 linbackup1 kernel: Aborting journal on device md3.
> Mar 14 04:12:29 linbackup1 kernel: md3: rw=0, want=5260961472, limit=1465143808
> Mar 14 04:12:29 linbackup1 kernel: EXT3-fs error (device md3): ext3_readdir: directory #34079247 contains a hole at offset 4096
> 
> I don't see any hardware related errors, and the rest of the filesystems 
> all seem fine, although this is the one that is busy.

Is your memory ECC? If not then a memory problem can fly under the radar.
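
As an aside, the want/limit pairs in the log you quoted can be
sanity-checked with a little arithmetic. The sketch below is only an
illustration, not something I ran on your box. It assumes the block
layer reports both values in 512-byte sectors, that the filesystem uses
4096-byte blocks (the readdir offsets step by 4096), and that each
failed read was for a single block. The names analyse, LOG_LINES and
PATTERN are made up for the example.

```python
import re

SECTOR_BYTES = 512      # assumption: want/limit are counted in 512-byte sectors
FS_BLOCK_BYTES = 4096   # assumption: ext3 block size on this filesystem

# The two "beyond end of device" lines from the quoted log.
LOG_LINES = [
    "md3: rw=0, want=14439505280, limit=1465143808",
    "md3: rw=0, want=5260961472, limit=1465143808",
]

PATTERN = re.compile(r"(\w+): rw=(\d+), want=(\d+), limit=(\d+)")


def analyse(line):
    """Turn one kernel log line into filesystem-block terms, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    dev = m.group(1)
    rw, want, limit = (int(m.group(i)) for i in (2, 3, 4))
    sectors_per_block = FS_BLOCK_BYTES // SECTOR_BYTES
    return {
        "device": dev,
        "is_write": bool(rw & 1),                 # rw=0 is a read
        "device_gib": limit * SECTOR_BYTES / 2**30,
        # "want" is the end of the request, so a one-block read
        # starts one block earlier (single-block read is an assumption).
        "requested_block": want // sectors_per_block - 1,
        "last_valid_block": limit // sectors_per_block - 1,
        "overshoot": want / limit,
    }


for line in LOG_LINES:
    print(analyse(line))
```

If those assumptions hold, ext3 asked for blocks several times past the
end of a roughly 700 GiB device, which looks more like a garbage block
pointer in the directory than a failed read of a valid block.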

> Can this be related to being on a 3-member RAID1 that normally runs with 
> one device missing? I've run a different one that way for a couple of 
> years on earlier kernels.

I haven't seen any other md RAID problems, and md is quite mature
at this point. I won't say it isn't possible. Can you try running with
just 2 drives for a while after this fsck and see if it happens again?

> Will it hurt anything to mount the underlying partition of one of the 
> drives directly for a while instead of using the md device?

I don't know. It depends on how md keeps its bitmap and metadata. If
they are at the end of the device then it should work; if they are at
the beginning, then you'd have to offset the mount (carefully).

You will need to be very careful when messing with the partition table
to change its type, and again if you recreate the RAID1 with existing
data on it (I don't have a procedure for that).

-Ross
