This is really frustrating. I've got a server with two K20c Tesla cards. I need to use the proprietary drivers to use the CUDA toolkit. Btw, I had no trouble at all with building for CentOS 7.3
I have what NVidia claims is the correct driver package, a 340 series. It appears to build, but then fails to load. The only error I see is "no such device", which makes no sense to me, esp. since it says nothing whatever else.
I've gone through the install log, and there are a bunch of Note: lines and warnings, but the latter, I think, are all about comparing signed and unsigned integers.
And lsmod shows no nvidia drivers registered, but the log claims that Error: Driver 'nvidia' is already registered, aborting...
Anyone got any ideas?
mark
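For anyone following along, the two checks described in this message (is an nvidia module actually loaded, and what does the log say about it) can be sketched roughly as below. This is only an illustration: it assumes a Linux host where /proc/modules is readable (lsmod formats that same file), and the function names are made up for the example. The log text has to be supplied by the caller, for instance from saved dmesg output.

```python
from pathlib import Path


def loaded_nvidia_modules(proc_modules_text):
    """Return names of loaded modules whose name starts with 'nvidia'.

    Expects text in the format of /proc/modules, where the module
    name is the first whitespace-separated field on each line.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("nvidia"):
            names.append(fields[0])
    return names


def nvidia_log_lines(log_text):
    """Return log lines that mention 'nvidia' or 'NVRM' (case-insensitive)."""
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if "nvidia" in lowered or "nvrm" in lowered:
            hits.append(line)
    return hits


if __name__ == "__main__":
    modules_file = Path("/proc/modules")  # assumption: Linux procfs is mounted
    if modules_file.exists():
        print("nvidia modules loaded:", loaded_nvidia_modules(modules_file.read_text()))
```

An empty list from the first function alongside an "already registered" line from the second would match the contradiction reported here; the sketch does not explain it, it only makes the two observations easy to repeat.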
On Tue, Sep 26, 2017 at 01:40:54PM -0400, m.roth@5-cent.us wrote:
This is really frustrating. I've got a server with two K20c Tesla cards. I need to use the proprietary drivers to use the CUDA toolkit. Btw, I had no trouble at all with building for CentOS 7.3
I have what NVidia claims is the correct driver package, a 340 series. It appears to build, but then fails to load. The only error I see is "no such device", which makes no sense to me, esp. since it says nothing whatever else.
Why not use the elrepo repo? They've worked flawlessly for me, both with legacy and new cards.
On Tue, Sep 26, 2017 at 1:59 PM, Scott Robbins scottro11@gmail.com wrote:
Why not use the elrepo repo? They've worked flawlessly for me, both with legacy and new cards.
-- Scott Robbins PGP keyID EB3467D6 ( 1B48 077D 66F6 9DB0 FDC2 A409 FA54 EB34 67D6 ) gpg --keyserver pgp.mit.edu --recv-keys EB3467D6
Seconded. We use the elrepo repository for hundreds of workstations and have had no issues. Takes care of everything automatically.
On Tue, 2017-09-26 at 14:18 -0400, Phelps, Matthew wrote:
Seconded. We use the elrepo repository for hundreds of workstations and have had no issues. Takes care of everything automatically.
Yes, but these are Tesla cards with the CUDA toolkit - I've never got the elrepo versions to work properly when developing CUDA applications.
P.
On 26/09/2017 at 19:59, Scott Robbins wrote:
Why not use the elrepo repo? They've worked flawlessly for me, both with legacy and new cards.
I know this is weird, but I've had cases where the downloaded NVidia driver worked and the ELRepo driver didn't, and the other way around.
Details here: https://blog.microlinux.fr/nvidia-centos/
Niki
-----Original Message----- From: CentOS [mailto:centos-bounces@centos.org] On Behalf Of Nicolas Kovacs Sent: 26 September 2017 23:47 To: centos@centos.org Subject: Re: [CentOS] Semi-OT: hardware: NVidia proprietary driver, C7.4
I know this is weird, but I've had cases where the downloaded NVidia driver worked and the ELRepo driver didn't, and the other way around.
Details here: https://blog.microlinux.fr/nvidia-centos/
After upgrading from 7.3 to 7.4 the GUI won't start anymore. I'm using an Nvidia GTX260 with the elrepo drivers. Am investigating, but so far zip...
Is the article linked above available in English?
-- //Sorin
On 26/09/17 18:40, m.roth@5-cent.us wrote:
This is really frustrating. I've got a server with two K20c Tesla cards. I need to use the proprietary drivers to use the CUDA toolkit. Btw, I had no trouble at all with building for CentOS 7.3
I have what NVidia claims is the correct driver package, a 340 series. It appears to build, but then fails to load. The only error I see is "no such device", which makes no sense to me, esp. since it says nothing whatever else.
I've gone through the install log, and there are a bunch of Note: lines and warnings, but the latter, I think, are all about comparing signed and unsigned integers.
And lsmod shows no nvidia drivers registered, but the log claims that Error: Driver 'nvidia' is already registered, aborting...
Anyone got any ideas?
mark
You don't say which version of the 340 series driver you have tried.
There was a bug with recent legacy releases that affected el7.4 kernels. We (elrepo) patched the driver to fix that on rhel7.4 releases. I'm not sure, but it _may_ have been fixed in the 340.104 driver released last week - I've not bothered building it, as the changelog only mentions "Improved compatibility with recent Linux kernels", which we patched/fixed in our previous release, and other issues which don't affect kmods on RHEL.
So it sounds like a known issue which has already been fixed. If you don't want to use our packages, maybe take a look at the patch and try applying it to your build.
-----Original Message----- From: CentOS [mailto:centos-bounces@centos.org] On Behalf Of Phil Perry Sent: 26 September 2017 21:46 To: centos@centos.org Subject: Re: [CentOS] Semi-OT: hardware: NVidia proprietary driver, C7.4
You don't say which version of the 340 series driver you have tried.
There was a bug with recent legacy releases that affected el7.4 kernels. We (elrepo) patched the driver to fix that on rhel7.4 releases. I'm not sure, but it _may_ have been fixed in the 340.104 driver released last week - I've not bothered building it, as the changelog only mentions "Improved compatibility with recent Linux kernels", which we patched/fixed in our previous release, and other issues which don't affect kmods on RHEL.
So it sounds like a known issue which has already been fixed. If you don't want to use our packages, maybe take a look at the patch and try applying it to your build.
Tested 340.76, 340.102 and 340.104 (elrepo and proprietary). No luck over here with a GTX260 and the 64-bit drivers.
Will test some more; if still no luck, I'll just reinstall from scratch.
On 27/09/17 07:56, Sorin Srbu wrote:
Tested 340.76, 340.102 and 340.104 (elrepo and proprietary). No luck over here with a GTX260 and the 64-bit drivers.
Will test some more; if still no luck, I'll just reinstall from scratch.
The kmod-nvidia-340xx-340.102-4.el7_4.elrepo.x86_64.rpm driver should work for your card on el7.4.
All previous releases in elrepo were for el7.3 (and earlier) and are not compatible with the el7.4 series kernel.
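As a side note for readers, the el7_4 marker in that package name can be checked mechanically. The sketch below assumes the usual name-version-release.arch.rpm filename layout; the helper names are invented for this example, and the second filename in the usage section is a hypothetical older build, not a package confirmed anywhere in this thread.

```python
def parse_rpm_filename(filename):
    """Split 'name-version-release.arch.rpm' into (name, version, release, arch)."""
    if not filename.endswith(".rpm"):
        raise ValueError("not an .rpm filename: " + filename)
    stem = filename[: -len(".rpm")]
    # The architecture is the last dot-separated field.
    stem, arch = stem.rsplit(".", 1)
    # Version and release are the last two hyphen-separated fields;
    # the package name itself may contain hyphens.
    name, version, release = stem.rsplit("-", 2)
    return name, version, release, arch


def has_dist_tag(filename, tag):
    """Return True if the release field contains the given dist tag, e.g. 'el7_4'."""
    _, _, release, _ = parse_rpm_filename(filename)
    return tag in release.split(".")


if __name__ == "__main__":
    current = "kmod-nvidia-340xx-340.102-4.el7_4.elrepo.x86_64.rpm"
    # Hypothetical filename standing in for a pre-7.4 build:
    older = "kmod-nvidia-340xx-340.102-1.el7.elrepo.x86_64.rpm"
    print(parse_rpm_filename(current))
    print(has_dist_tag(current, "el7_4"))  # True
    print(has_dist_tag(older, "el7_4"))    # False
```

This only inspects the filename; it says nothing about whether the module inside actually loads on a given kernel.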
-----Original Message----- From: CentOS [mailto:centos-bounces@centos.org] On Behalf Of Phil Perry Sent: 27 September 2017 20:47 To: centos@centos.org Subject: Re: [CentOS] Semi-OT: hardware: NVidia proprietary driver, C7.4
The kmod-nvidia-340xx-340.102-4.el7_4.elrepo.x86_64.rpm driver should work for your card on el7.4.
All previous releases in elrepo were for el7.3 (and earlier) and are not compatible with the el7.4 series kernel.
My troubleshooting yesterday, just before I went home from work, showed that it seems to have been gdm that borked out for some reason. I've never had that happen to me, regardless of CentOS version. Installing lightdm brought everything back up as expected.
Has anybody else had gdm act up?
Weird in any case.
-- //Sorin
On Tue, 2017-09-26 at 13:40 -0400, m.roth@5-cent.us wrote:
This is really frustrating. I've got a server with two K20c Tesla cards. I need to use the proprietary drivers to use the CUDA toolkit. Btw, I had no trouble at all with building for CentOS 7.3
I have what NVidia claims is the correct driver package, a 340 series. It appears to build, but then fails to load. The only error I see is "no such device", which makes no sense to me, esp. since it says nothing whatever else.
I've gone through the install log, and there are a bunch of Note: lines and warnings, but the latter, I think, are all about comparing signed and unsigned integers.
And lsmod shows no nvidia drivers registered, but the log claims that Error: Driver 'nvidia' is already registered, aborting...
Have you tried installing the toolkit from nVidia's own repository:
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=...
That includes the kernel drivers as far as I can remember.
P.
From my experience, the elrepo nvidia drivers work fine with the CUDA packages from the nvidia repository.
On Tue, Sep 26, 2017 at 10:31 PM, Pete Biggs pete@biggs.org.uk wrote:
Have you tried installing the toolkit from nVidia's own repository:
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=CentOS&target_version=7&target_type=rpmnetwork
That includes the kernel drivers as far as I can remember.
P.
CentOS mailing list CentOS@centos.org https://lists.centos.org/mailman/listinfo/centos