On Sat, Nov 1, 2008 at 6:31 AM, Kai Schaetzl maillists@conactive.com wrote:
Mhr wrote on Wed, 29 Oct 2008 17:59:40 -0700:
The one problem I've seen and posted here was w.r.t. smartd error reports showing 2^32 - 1 errors on one of the disks (probably my system disk) every few minutes.
How has this anything to do with "SATA problems/drive handling"?
Possibly because my system drive is a SATA disk? (FTR, the drive does not appear to be the slightest bit unstable and it runs just fine. In fact, I recently modified the system so that it now runs on three SATA-2 drives exclusively. For whatever reason, the WD drives do not report any errors - see also below.)
And could you please use a decent subject next time?
When I select the subject, I usually do. This was a reply to a thread, so I didn't pick the subject. There's no need to be testy....
Regarding your problem: Have you run a smartctl selftest since then, and have you gone to smartmontools.sf.net and read up on smartmon?
Yes and not until now, in that order. The smartctl selftest has the same problem, IIRC, but the SeaTools test showed nothing wrong.
This may just be a problem with smartd not being able to handle the error codes/number of errors from that disk. If you look at smartmontools.sf.net and read the man page, you'll see that vendors are quite inconsistent in what they report and how they report it, and a reversal of byte ordering every now and then seems to be common. Not to mention that the smartmon shipping with CentOS naturally doesn't include the latest code.
All good information, thank you. I did not see anything specific to the issue I am seeing, which is that every half hour, smartd reports the following:
Nov 2 01:56:11 mhrichter smartd[3121]: Device: /dev/sda, 4294967295 Currently unreadable (pending) sectors
Nov 2 01:56:11 mhrichter smartd[3121]: Device: /dev/sda, 4294967295 Offline uncorrectable sectors
In each case, it also sends a warning email to root, which is kind of annoying since these do not appear to be legitimate error conditions.
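For what it's worth, the reported count is exactly 2^32 - 1, i.e. every bit set in a 32-bit field, which reads as -1 when treated as a signed value. The rough Python sketch below just checks that arithmetic against the two log lines above. The regex and the helper names (parse_smartd_line, as_signed32) are my own for illustration, not anything from smartmontools, and this does not prove what the drive or smartd is actually doing. It only shows that the number looks more like a placeholder or misread value than a real sector count.

```python
import re
import struct

# The two smartd log lines quoted above, copied verbatim.
LOG_LINES = [
    "Nov 2 01:56:11 mhrichter smartd[3121]: Device: /dev/sda, "
    "4294967295 Currently unreadable (pending) sectors",
    "Nov 2 01:56:11 mhrichter smartd[3121]: Device: /dev/sda, "
    "4294967295 Offline uncorrectable sectors",
]

# Hypothetical pattern matching only the message shape seen in these lines.
COUNT_RE = re.compile(r"Device: (\S+), (\d+) (.+)$")


def parse_smartd_line(line):
    """Return (device, count, description), or None if the line does not match."""
    match = COUNT_RE.search(line)
    if match is None:
        return None
    device, count, description = match.groups()
    return device, int(count), description


def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


if __name__ == "__main__":
    for line in LOG_LINES:
        device, count, description = parse_smartd_line(line)
        all_bits_set = count == 0xFFFFFFFF  # same value as 2**32 - 1
        print(device, description, count, all_bits_set, as_signed32(count))
```

If the count were a real tally of bad sectors, one would not expect it to be all ones on both attributes at the same time, which is why I suspect a reporting quirk rather than a failing disk.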
Someone mentioned that this is a recurring problem with Seagate drives - more info, please?
Thanks.
mhr