[CentOS] how to find source of data loss / corruption

Thu Dec 15 10:28:00 UTC 2011
Rudi Ahlers <Rudi at SoftDux.com>

Hi,

Two websites, hosted on two different CentOS 5.7 servers (one of them
very new, about 3 weeks old), keep losing data - but it looks more like
the data is being corrupted than deleted.

For example, a photo was uploaded last night, and when we checked it
today it didn't show on the website. So we checked whether the file is
on the server: it exists, but is 0 KB in size. Last night it still
worked fine. The original photo is 482 KB in size.
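
The manual check described above could be automated with something like
the following. This is only a rough sketch, assuming Python 3 is
available on the box; the function name find_empty_files and the path
/var/www/uploads are made up for illustration and are not from either
server. It walks a directory tree and lists every regular file that is
0 bytes long, together with its last-modified time.

```python
import os
import stat
import sys
import time


def find_empty_files(root):
    """Return sorted (path, mtime) pairs for zero-byte regular files under root."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size == 0:
                empty.append((path, st.st_mtime))
    return sorted(empty)


if __name__ == "__main__":
    # Hypothetical default path; pass the real upload directory as an argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/www/uploads"
    for path, mtime in find_empty_files(root):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(stamp, path)
```

Truncating a file updates its modification time, so the timestamps
printed here may help narrow down when the files were emptied and
whether that lines up with a cron job, a backup run or an upload script.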

The first time this happened we thought it was due to a bug with
CentOS 5.7 + ext4 + quotas (there's a bug open for this), and since the
server's console kept giving errors about possible data corruption we
thought it would be best if we moved everything to a more stable
platform. So we brought in a new server, set up CentOS 5.7 + ext3 + quotas
(which has been working fine on all our servers for a long time) and
moved the data across. A few days down the line and I still see this
happening.

I'm out of ideas and hope someone could shed some light on the matter.
I've checked some suggested search results, but couldn't find any
issues with the HDDs according to SMART. The servers both have 4GB RAM
and 8-core CPUs. Neither RAM nor CPU usage is high. Both are set up
with RAID10 across 4 enterprise HDDs; one server has software RAID
and the new one hardware RAID. So even when we changed the RAID
subsystem it still happens.



-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532