On Thu, Oct 27, 2016 at 5:16 PM, m.roth@5-cent.us wrote:
Matt Garman wrote:
On Thu, Oct 27, 2016 at 12:03 AM, Larry Martell <larry.martell@gmail.com> wrote:
<snip>
> On Thu, Oct 27, 2016 at 3:05 AM, Larry Martell <larry.martell@gmail.com>
> wrote:
>> Well I spoke too soon. The importer (the one that was initially
>> hanging that I came here to fix) hung up after running 20 hours. There
>> were no NFS errors or messages on either the client or the server.
>> When I restarted it, it hung after 1 minute. Restarted it again and it
>> hung after 20 seconds. After that, when I restarted it, it hung
>> immediately. Still no NFS errors or messages. I tried running the
>> process on the server and it worked fine. So I have to believe this is
>> related to nobarrier. Tomorrow I will try removing that setting, but I
>> am no closer to solving this and I have to leave Japan Saturday :-(
>>
>> The bad disk still has not been replaced - that is supposed to happen
>> tomorrow, but I won't have enough time after that to draw any
>> conclusions.
>
> I've seen behavior like that with disks that are on their way out...
<snip>

I just had a truly unpleasant thought, speaking of disks. Years ago, we
tried some WD Green drives in our servers, and that was a disaster. In
somewhere between days and weeks, the drives would go offline. I finally
found out what happened: consumer-grade drives are intended for desktops,
and the TLER - how long the drive keeps trying to read or write to a
sector before giving up, marking the sector bad, and going somewhere
else - is two *minutes*. Our servers were expecting the TLER to be 7
*seconds* or under. Any chance the client cheaped out with any of the
drives?
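For anyone who wants to check this on their own drives: smartmontools can report the drive's SCT Error Recovery Control timers with `smartctl -l scterc /dev/sdX`. The sketch below is only an illustration of checking those timers against the 7-second figure mentioned above. It assumes the report has "Read:" and "Write:" lines giving a value in tenths of a second (or "Disabled"); the function names and the sample text are made up for the example, and the real output may differ by smartmontools version and drive.

```python
import re

def parse_scterc(text):
    """Extract read/write recovery timeouts, in seconds, from smartctl text.

    Assumes lines such as "Read:     70 (7.0 seconds)" or "Read: Disabled",
    with the number given in tenths of a second. A disabled timer maps to None.
    """
    timeouts = {}
    for line in text.splitlines():
        m = re.match(r"\s*(Read|Write):\s+(Disabled|\d+)", line)
        if m:
            key = m.group(1).lower()
            value = m.group(2)
            timeouts[key] = None if value == "Disabled" else int(value) / 10.0
    return timeouts

def exceeds_limit(timeouts, limit_seconds=7.0):
    """True if any timer is missing, disabled, or longer than the limit."""
    if not timeouts:
        # No timers found, e.g. the drive does not support the command.
        return True
    return any(t is None or t > limit_seconds for t in timeouts.values())

# Hypothetical sample text; paste the real output of
# `smartctl -l scterc /dev/sdX` here instead.
sample = """SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)
"""

timeouts = parse_scterc(sample)
print(timeouts, "exceeds limit:", exceeds_limit(timeouts))
```

A drive that reports the timers as disabled, or does not support the command at all, is treated here as exceeding the limit, since it may then retry a bad sector for much longer than the server expects.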
No, it's a fairly high end Lenovo X series server (X3650 I think).