On one of my CentOS 5.4 boxes, a machine that's around 5 years old, has always run CentOS and X, and has had the same Nvidia card in it for its entire life, I'm suddenly getting these lines in the Xorg log file:
(WW) NVIDIA(0): WAIT (2, 6, 0x8000, 0x00000000, 0x000003c0)
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
(EE) NVIDIA(0): Failed to allocate clip rectangle
(EE) NVIDIA(0): *** Aborting ***
followed by millions more like them. The machine is currently sitting mostly unused, except that it's running a Folding At Home client. It'll run for a day or a week, then I'll notice the screen is in powersave mode; I'll try to wake it up by moving the mouse or touching a key, and nothing happens. (For purposes of this testing I've disabled the screensaver, so the screen should never shut off.) I can SOMETIMES ssh into it, if it hasn't completely died yet, and when I can, top shows that Xorg is consuming about 99% of the CPU instead of the Folding At Home client, which should be consuming most of it. I guess it's busy spewing errors to the Xorg log file, or something.
The last time I found it like this I tried to kill Xorg. kill -15 wouldn't kill it, so I tried kill -9; then the whole box seized up, requiring a reset.
Sometimes it's already unresponsive, even to ssh or ping, by the time I notice the problem.
It's running an old Nvidia Geforce 4 MX400 card and the Nvidia proprietary driver NVIDIA-Linux-x86-96.43.01-pkg1.run, downloaded directly from the Nvidia web site. The date on that file is Oct 2007, so that's the driver it has been using since then. But it's only in the last 6 weeks or so that it has started doing this.
Other than some vaguely-defined "nvidia driver problems", what may be going on here? (I can buy an Nvidia conflict with the stock CentOS kernel, but I'd be dubious, since it's been running this driver for two years and earlier versions for 4 or 5. Any such conflict would seem likely a kernel issue, to me, since it's newly exhibited.)
Thanks!
Fred
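A log that has grown by millions of near-identical lines is hard to read by eye, so it can help to tally which messages dominate it before digging further. The following is only a minimal sketch in Python, not anything posted in the thread: the function names `summarize` and `print_top` are made up for illustration, and the default path `/var/log/Xorg.0.log` is an assumption about where this box keeps its X server log.

```python
import re
from collections import Counter

# Matches the Xorg log severity markers seen above: "(EE)" and "(WW)".
MARKER = re.compile(r"\((EE|WW)\)")


def summarize(lines):
    """Count each distinct (EE)/(WW) message in an iterable of log lines.

    A physical line may hold several messages run together, so each
    line is split on the severity markers rather than treated as one entry.
    """
    counts = Counter()
    for line in lines:
        # With one capturing group, split() yields
        # [prefix, severity, message, severity, message, ...].
        parts = MARKER.split(line)
        for severity, message in zip(parts[1::2], parts[2::2]):
            counts[(severity, message.strip())] += 1
    return counts


def print_top(path="/var/log/Xorg.0.log", n=10):
    """Print the n most frequent messages (path is an assumed default)."""
    # Iterate over the file object so a huge log is never loaded whole.
    with open(path, errors="replace") as log:
        for (severity, message), count in summarize(log).most_common(n):
            print(f"{count:>10}  ({severity}) {message}")
```

Calling `print_top()` over ssh while the box is still reachable would show whether the log contains anything besides the clip-rectangle errors, for instance an earlier warning that appears only once just before the flood starts.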
At Mon, 7 Dec 2009 18:58:40 -0500 CentOS mailing list centos@centos.org wrote:
Random thoughts: it is possible that the video card has 'died' on some level -- such as its on-board memory developing errors or some component having shifted out of spec, etc. Does the card have a fan on it (some video cards have little processor elements with heat sinks AND fans)? Is the card covered in a thick layer of dust (and thus not being cooled properly)? Are all of the case fans on this '5 year old box' working? Is the PSU still in spec? Is its fan working? I'd try cracking the case and having a serious visit with Mr. Vacuum Cleaner, then powering it back up (with the case cover off) and checking to see if the fans are *all* working.
On Mon, Dec 07, 2009 at 07:24:01PM -0500, Robert Heller wrote:
Random thoughts: it is possible that the video card has 'died' on some level -- such as its on-board memory developing errors or some component having shifted out of spec, etc. Does the card have a fan on it (some video cards have little processor elements with heat sinks AND fans)? Is the card covered in a thick layer of dust (and thus not being cooled properly)? Are all of the case fans on this '5 year old box' working? Is the PSU still in spec? Is its fan working? I'd try cracking the case and having a serious visit with Mr. Vacuum Cleaner, then powering it back up (with the case cover off) and checking to see if the fans are *all* working.
It's actually a brand new PS. I had to steal the PS from this computer for another, more critical one whose PS up and died. I did some research and found that the 500W Antec Basiq 500 was alleged to be a good one for being modestly priced, so I bought one (the original PS was also an Antec, 350W, and it ran flawlessly for all those years). The trouble did appear a while after that, so it could be something weird about the PS. I don't have a 'scope to view the noise levels of the output, but at least the voltages all look good on my DVM while under "normal" load.
The video card does have a fan; I'll have to look at it again--last time I checked it was running fine. There are actually six fans in the computer, and all were fine at last check.
There's little dirt in it--there's a filter on the intake in front (with a 120mm intake fan) that I cleaned while I had it apart to steal its PS, and I mopped up what other loose filth I could find. Mostly, the collection of cat fur on the intake filter nicely blocked smaller bits of dust from entering. :) It usually gets opened up and the fur cleaned out twice a year, more or less.
Thanks!
Fred
-- Robert Heller -- 978-544-6933 Deepwoods Software -- Download the Model Railroad System http://www.deepsoft.com/ -- Binaries for Linux and MS-Windows heller@deepsoft.com -- http://www.deepsoft.com/ModelRailroadSystem/
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
fred smith wrote:
Other than some vaguely-defined "nvidia driver problems", what may be going on here? (I can buy an Nvidia conflict with the stock CentOS kernel, but I'd be dubious, since it's been running this driver for two years and earlier versions for 4 or 5. Any such conflict would seem likely a kernel issue, to me, since it's newly exhibited.)
Hardware going bad? Is the card actively or passively cooled?
Try the latest driver? Have any other cards that are the same model# you could try in that box?
nate
On Mon, Dec 07, 2009 at 04:46:50PM -0800, nate wrote:
Hardware going bad? Is the card actively or passively cooled?
See my other posting about fans and PS: new PS, all fans working.
Try the latest driver? Have any other cards that are the same model# you could try in that box?
Hmm. Not the same, but there is a spare higher-spec/newer Nvidia card around that I could stick in it. It's also AGP, so it would fit.