I am forced to use a windoze 7 box, and recently MS decided in its infinite wisdom to update the nvidia driver via windoze update. My machine immediately started showing the same symptoms David is having: hanging at indeterminate times, plus a BSOD twice. It would do this even when idle during the night.

Googling for an answer turned up a forum related to the nvidia web site, where a post said there were a lot of problems with the current driver version and suggested reinstalling a back-level driver, specifically 314.22. I did so and have not had a single problem since.

YMMV

Regards,
Ron Young
919-621-9015
http://www.linkedin.com/in/ronhyoung

+++++++++++++++++++
Little tiny dreams require little tiny thoughts and little tiny steps.
Great big dreams require great big thoughts and little tiny steps.
+++++++++++++++++++
*Kosh*: The avalanche has already started. It is too late for the pebbles to vote.

On Fri, Nov 15, 2013 at 6:11 AM, David McGiven <davidmcgivenn at gmail.com> wrote:

> Hello there,
>
> I'm running a Supermicro server with the latest CentOS 6.4 versions
> (kernel: 2.6.32-358.23.2.el6.x86_64) and the latest nvidia driver (331.20).
>
> A few minutes after using the GPU for doing some HPC calculations, the
> server crashes and reboots itself. This is happening every time. I know it
> will reboot but I don't know when. Sometimes it's 20 minutes after I
> start using it. Sometimes it's 2 hours.
>
> If I unplug the GPU card and put some stress on the server, it works ok. So
> I suspect there's a bug in the kernel/nvidia driver.
>
> I can't find any messages in /var/log/messages.
>
> What should I do? Should I file a bug on the centos bugtracking system?
> Is there any way I can gather more information? The server is in a remote
> location so I have a hard time accessing the console.
>
> Thanks.
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> http://lists.centos.org/mailman/listinfo/centos