[CentOS] Crash and automatic reboot when using the NVIDIA card

Fri Nov 15 15:06:44 UTC 2013
Ron Young <ron at prismsts.com>

I am forced to use a Windows 7 box, and recently MS decided in its infinite
wisdom to update the NVIDIA driver via Windows Update.  My machine
immediately started showing the same symptoms David is having: hanging at
unpredictable times, and twice a BSOD.  It would do this even when idle
overnight.

Googling for an answer turned up a forum linked from the NVIDIA web site,
where a post suggested that the then-current driver version had a lot of
problems and that we should roll back to an earlier driver.  The post
suggested going back to 314.22.  I did so and have not had a single problem
since.
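
If you want to check whether an installed driver is newer than a version
you know to be good, a rough Python sketch follows. The function names are
made up for illustration. I am assuming the version can be read as a dotted
number such as 331.20 or 314.22, and that on Linux the first line of
/proc/driver/nvidia/version looks roughly like the sample in the comment.
Treat it as a starting point, not a tested tool.

```python
import re


def parse_driver_version(text):
    """Return the first dotted version number in text as a tuple of ints.

    Works on a bare version such as "331.20" and, assuming the usual
    layout, on a line such as:
    "NVRM version: NVIDIA UNIX x86_64 Kernel Module  331.20  Wed Oct 30 ..."
    """
    match = re.search(r"\b(\d+)\.(\d+)(?:\.(\d+))?\b", text)
    if match is None:
        raise ValueError("no driver version found in: %r" % text)
    # Drop the optional third component when it is absent.
    return tuple(int(part) for part in match.groups() if part is not None)


def is_newer(installed, known_good):
    """True if the installed driver version is newer than the known-good one."""
    return parse_driver_version(installed) > parse_driver_version(known_good)


if __name__ == "__main__":
    # Versions mentioned in this thread: David's driver and the rollback target.
    if is_newer("331.20", "314.22"):
        print("Installed driver is newer than the known-good version.")
```

Comparing tuples of integers avoids the mistake of comparing the versions
as plain strings or as floating point numbers.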

YMMV



Regards,

Ron Young
919-621-9015
http://www.linkedin.com/in/ronhyoung

+++++++++++++++++++
Little tiny dreams require little tiny thoughts and little tiny steps.
Great big dreams require great big thoughts and little tiny steps.
+++++++++++++++++++
*Kosh*: The avalanche has already started. It is too late for the pebbles
to vote.


On Fri, Nov 15, 2013 at 6:11 AM, David McGiven <davidmcgivenn at gmail.com> wrote:

> Hello there,
>
> I'm running a Supermicro server with the latest CentOS 6.4 (kernel
> 2.6.32-358.23.2.el6.x86_64) and the latest nvidia driver (331.20).
>
> Some time after I start using the GPU for HPC calculations, the
> server crashes and reboots itself. This happens every time. I know it
> will reboot, but I don't know when. Sometimes it's 20 minutes after I
> start using it. Sometimes it's 2 hours.
>
> If I unplug the GPU card and put some stress on the server, it works ok. So
> I suspect there's a bug in the kernel/nvidia driver.
>
> I can't find any messages in /var/log/messages.
>
> What should I do? Should I file a bug on the CentOS bug tracking system?
> Is there any way I can gather more information? The server is in a remote
> location so I have a hard time accessing the console.
>
> Thanks.
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> http://lists.centos.org/mailman/listinfo/centos
>