[CentOS] weird XFS problem

Boris Epstein borepstein at gmail.com
Sun Jan 22 20:05:30 UTC 2012

On Sun, Jan 22, 2012 at 2:56 PM, Joseph L. Casale wrote:

> >I have a CentOS 5.7 machine hosting a 16 TB XFS partition used to house
> >backups. The backups are run via rsync/rsnapshot and are large in terms of
> >the number of files: over 10 million each.
> >
> >Now the machine is not particularly powerful: it is a 64-bit machine, dual
> >core CPU, 3 GB RAM. So perhaps this is a factor in why I am having the
> >following problem: once in a while that XFS partition starts generating
> >multiple I/O errors, files that had content become 0 byte, directories
> >disappear, etc. Every time a reboot fixes that, however. So far I've looked
> >at logs but could not find a cause or precipitating event.
> >
> >Hence the question: has anyone experienced anything along those lines? What
> >could be the cause of this?
> In every situation like this that I have seen, it was hardware that never had
> adequate memory provisioned.
> Another consideration is you almost certainly won't be able to run a repair on
> that fs with so little RAM.
> Finally, it would be interesting to know how you architected the storage
> hardware.
> Hardware RAID, BBC, drive cache status, barrier status, etc.

If I remember correctly I pretty much went with the defaults when I created
this XFS on top of a 16-drive RAID6 configuration.

Now as far as memory - I think for the purpose of XFS repair RAM and swap
ought to be the same. And I've got plenty of swap on this system. I also
host a 5 TB XFS in a file there and I ran XFS repair on it and it ran
within no more than 5 minutes. Now this is roughly 30% of the larger XFS.

I should try to collect the info you mentioned, though - that was a good
thought; some clue might be contained in there for sure.
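
For the barrier status part, something along these lines might be a starting
point. It is only a rough sketch in Python 3: the function names are made up
for illustration, and it only reports what the mount options say. A missing
"nobarrier" option does not prove barriers are actually in effect, since the
underlying device may not honor them, so the kernel log is still worth a look.

```python
def xfs_mount_options(mounts_text):
    """Return {mount_point: [options]} for each XFS entry in /proc/mounts-style text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device mount_point fstype options dump pass
        if len(fields) >= 4 and fields[2] == "xfs":
            result[fields[1]] = fields[3].split(",")
    return result


def barriers_disabled(options):
    """True if the 'nobarrier' mount option is present in the option list."""
    return "nobarrier" in options


def report(path="/proc/mounts"):
    """Print one line per XFS mount found in the given mounts file."""
    with open(path) as f:
        mounts = xfs_mount_options(f.read())
    for mount_point, opts in mounts.items():
        state = "disabled" if barriers_disabled(opts) else "not explicitly disabled"
        print("%s: barriers %s (%s)" % (mount_point, state, ",".join(opts)))
```

Calling report() on the backup machine would list each XFS mount with its
options. The RAID controller cache and drive cache settings would still have
to come from the controller's own tools.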

Thanks for your input.


