On 03/26/2014 03:40 PM, m.roth@5-cent.us wrote:
Johnny Hughes wrote:
On 03/26/2014 08:14 AM, m.roth@5-cent.us wrote:
Johnny Hughes wrote:
On 03/26/2014 07:01 AM, mark wrote:
On 03/26/14 03:01, Johnny Hughes wrote:
On 03/25/2014 04:36 PM, m.roth@5-cent.us wrote:
> Got a HBS (y'know, Honkin' Big Server, one o' them technical terms),
> a Dell 720 with two Tesla GPUs. I updated the o/s, 6.5, and I cannot
> get the GPUs recognized. As a last resort, I d/l NVidia's proprietary
> driver/installer, 325, and it builds fine... I've yum removed the
> kmod-nvidia I had on the system, nouveau is blacklisted, and when I
> reboot, lsmod shows me nvidia loaded, which modinfo tells me looks
> like the one I built.... but enum_gpu, which is from a CUDA group,
> builds... but can't enumerate the GPUs (how we wake them up for the
> users). I see the /dev/nvidia*, and they're a+r, a+w.... Oh, and
> selinux is permissive.
>
> Anyone got a clue? If I can't get this working, I'm going to have to
> downgrade the system several kernels.

Do you have an /etc/X11/xorg.conf file or something in /etc/X11/xorg.conf.d/ that actually names nvidia and not nv as the driver?
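For anyone retracing the checks described in the original post (nvidia loaded, nouveau not loaded, /dev/nvidia* readable and writable by everyone), here is a minimal Python sketch that gathers them in one place. It only reads /proc/modules and stats the device nodes. The function names are made up for illustration and are not part of any NVIDIA or CentOS tool.

```python
import glob
import os
import stat


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {line.split()[0]
            for line in proc_modules_text.splitlines() if line.strip()}


def world_rw(mode):
    """True if the mode grants read and write to user, group and other."""
    needed = (stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IWGRP
              | stat.S_IROTH | stat.S_IWOTH)
    return mode & needed == needed


def report():
    """Print the driver state of the machine this runs on."""
    try:
        with open("/proc/modules") as fh:
            mods = loaded_modules(fh.read())
    except OSError:
        mods = set()  # no /proc/modules here (not Linux, or restricted)
    print("nvidia loaded: ", "nvidia" in mods)
    print("nouveau loaded:", "nouveau" in mods)
    for dev in sorted(glob.glob("/dev/nvidia*")):
        state = "a+rw" if world_rw(os.stat(dev).st_mode) else "restricted"
        print(dev, state)


if __name__ == "__main__":
    report()
```

If all of this looks right and enumeration still fails, as it did here, the problem is probably somewhere other than module loading or device permissions.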
Nope - nothing there.
When you run the ./NVIDIA<version> command to build the driver, one of the last steps is to have it "automatically update your configuration file" .. select yes for that and it should create an xorg.conf file that will use the nvidia driver.
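To see which driver an xorg.conf actually names (nvidia versus nv), a rough sketch like the following would do. It assumes the usual `Driver "name"` line inside a Section block and ignores `#` comments. `xorg_drivers` is a hypothetical helper, not something the NVIDIA installer provides.

```python
import re

# xorg.conf keywords are case-insensitive; the value is a quoted string.
DRIVER_RE = re.compile(r'^\s*Driver\s+"([^"]+)"', re.IGNORECASE | re.MULTILINE)


def xorg_drivers(conf_text):
    """Return every Driver value in xorg.conf-style text, skipping comments."""
    lines = [ln.split("#", 1)[0] for ln in conf_text.splitlines()]
    return DRIVER_RE.findall("\n".join(lines))
```

Note that this returns input-device drivers too (kbd, mouse and so on), so look for nvidia or nv in the result rather than expecting a single entry.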
a) I didn't have that before - did kmod-nvidia handle loading the correct one *without* an xorg.conf? b) Do you think it'll do the right thing - this *is* a headless server.
And a general question: what *does* kmod-nvidia do - is it different than, say, setting up a flag, or a script to notice that you're booting a new kernel, and run the proprietary installer -a -s?
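The "script to notice that you're booting a new kernel" alternative could look roughly like the sketch below. It is not a description of what kmod-nvidia does. The stamp file and installer path are hypothetical placeholders, and the only installer flags used are the `-a -s` mentioned above.

```python
import os
import subprocess

STAMP = "/var/lib/nvidia-built-for"       # hypothetical stamp file
INSTALLER = "/root/NVIDIA-installer.run"  # hypothetical installer location


def read_stamp(path):
    """Return the kernel release recorded at the last build, or None."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


def needs_rebuild(running, recorded):
    """True when the running kernel differs from the one last built for."""
    return running != recorded


def main():
    running = os.uname().release
    if not needs_rebuild(running, read_stamp(STAMP)):
        return
    # Re-run the proprietary installer with the flags quoted in the thread.
    subprocess.run(["sh", INSTALLER, "-a", "-s"], check=True)
    with open(STAMP, "w") as fh:
        fh.write(running + "\n")
```

Calling `main()` from something that runs as root at boot would rebuild the module once per new kernel. The stamp is only written after the installer exits successfully, so a failed build gets retried on the next boot.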
Are you connecting to the server to do X related things remotely ... and therefore need NVIDIA drivers for that?
I think you missed that part of my original post: no X. This box has two Tesla GPUs, and my users are using them for heavy duty scientific computing.... And my problem is that neither their programs nor the utility I use (I *think* it's part of the CUDA toolkit - I didn't set that part up) can enumerate them... meaning that they can't see or use the GPUs.
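As an independent check on enumeration that needs no X and no compiler, the CUDA driver library can be asked directly how many devices it sees. The sketch below is not the enum_gpu utility mentioned above. It assumes the driver library is installed under the usual name `libcuda.so.1`, and it calls only `cuInit` and `cuDeviceGetCount` from the CUDA driver API through ctypes.

```python
import ctypes


def cuda_device_count(libname="libcuda.so.1"):
    """Return the GPU count seen by the CUDA driver API, or None on failure."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None  # driver library missing or not loadable
    if lib.cuInit(0) != 0:  # 0 is CUDA_SUCCESS
        return None
    count = ctypes.c_int(0)
    if lib.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value


if __name__ == "__main__":
    print("CUDA devices:", cuda_device_count())
```

A result of `None` with the nvidia module loaded would point at a mismatch between the kernel module and the user-space driver library, but that is a guess to verify, not a diagnosis.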
Try installing the CUDA Toolkit (https://developer.nvidia.com/cuda-downloads). From their FAQ:
Q: Will the installer replace the driver currently installed on my system?
A: The installer will provide an option to install the included driver, and if selected, it will replace the driver currently on your system.
Lec
I'll let one of the elrepo guys explain their RPM.
Fair 'nough. I just threw that out as a general question, not expecting that was yours to answer.
mark
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos