Hi everyone,
On CentOS 7 I'm running into an issue with the latest nvidia driver from elrepo: kmod-nvidia-384.98-1.el7_4.elrepo.x86_64. This driver version seems to introduce an issue in detecting video modes when a monitor is connected using DVI. As soon as the machine attempts to start X, nothing happens and the monitor goes into sleep mode, reporting that it has 'no signal'.
It is interesting because it occurs with several monitors, but only when they are connected through DVI. When using VGA it works like a charm. Also it did work without problems with the previous driver version (kmod-nvidia-384.90-1.el7_4.elrepo.x86_64), with the exact same setup.
In this case the issue occurs with an Nvidia NVS 315. I don't know if other nvidia cards are affected.
The following information from Xorg.0.log seems of interest:
[ 130.447] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:1:0:0
[ 130.447] (--) NVIDIA(0): CRT-0
[ 130.447] (--) NVIDIA(0): CRT-1
[ 130.447] (--) NVIDIA(0): DFP-0 (boot)
[ 130.447] (--) NVIDIA(0): DFP-1
[ 130.447] (--) NVIDIA(0): DFP-2
[ 130.447] (--) NVIDIA(0): DFP-3
[ 130.448] (II) NVIDIA(0): NVIDIA GPU NVS 315 (GF119) at PCI:1:0:0 (GPU-0)
[ 130.448] (--) NVIDIA(0): Memory: 1048576 kBytes
[ 130.448] (--) NVIDIA(0): VideoBIOS: 75.19.88.00.0b
[ 130.448] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[ 130.462] (--) NVIDIA(GPU-0): CRT-0: disconnected
[ 130.462] (--) NVIDIA(GPU-0): CRT-0: 400.0 MHz maximum pixel clock
[ 130.462] (--) NVIDIA(GPU-0):
[ 130.467] (--) NVIDIA(GPU-0): CRT-1: disconnected
[ 130.467] (--) NVIDIA(GPU-0): CRT-1: 400.0 MHz maximum pixel clock
[ 130.467] (--) NVIDIA(GPU-0):
[ 130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): connected
[ 130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): Internal TMDS
[ 130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): 0.0 MHz maximum pixel clock
[ 130.488] (--) NVIDIA(GPU-0):
[ 130.493] (--) NVIDIA(GPU-0): DFP-1: disconnected
[ 130.493] (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
[ 130.493] (--) NVIDIA(GPU-0): DFP-1: 0.0 MHz maximum pixel clock
[ 130.493] (--) NVIDIA(GPU-0):
[ 130.493] (--) NVIDIA(GPU-0): DFP-2: disconnected
[ 130.493] (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
[ 130.493] (--) NVIDIA(GPU-0): DFP-2: 480.0 MHz maximum pixel clock
[ 130.493] (--) NVIDIA(GPU-0):
[ 130.493] (--) NVIDIA(GPU-0): DFP-3: disconnected
[ 130.493] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
[ 130.493] (--) NVIDIA(GPU-0): DFP-3: 480.0 MHz maximum pixel clock
[ 130.493] (--) NVIDIA(GPU-0):
[ 130.493] (EE) NVIDIA(GPU-0): Unable to add conservative default mode "nvidia-auto-select".
[ 130.493] (EE) NVIDIA(GPU-0): Unable to add "nvidia-auto-select" mode to ModePool.
[ 130.493] (==) NVIDIA(0):
[ 130.493] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[ 130.493] (==) NVIDIA(0): will be used as the requested mode.
[ 130.493] (==) NVIDIA(0):
[ 130.493] (WW) NVIDIA(0): No valid modes for "DFP-0:nvidia-auto-select"; removing.
[ 130.493] (WW) NVIDIA(0):
[ 130.493] (WW) NVIDIA(0): Unable to validate any modes; falling back to the default mode
[ 130.493] (WW) NVIDIA(0): "nvidia-auto-select".
[ 130.493] (WW) NVIDIA(0):
[ 130.493] (WW) NVIDIA(0): No valid modes for "DFP-0:nvidia-auto-select"; removing.
[ 130.493] (EE) NVIDIA(0): Unable to use default mode "nvidia-auto-select".
[ 130.493] (EE) NVIDIA(0): Failing initialization of X screen 0
The log shows "0.0 MHz maximum pixel clock" for the DVI connections. When enabling ModeDebug in the xorg.conf it shows that all of the resolutions are rejected because of incorrect maximum pixel clock:
[ 1354.589] (II) NVIDIA(GPU-0):
[ 1354.589] (II) NVIDIA(GPU-0): --- Building ModePool for Philips 240S4 (DFP-0) ---
[ 1354.589] (WW) NVIDIA(GPU-0): Validating Mode "1920x1200_60":
[ 1354.589] (WW) NVIDIA(GPU-0): Mode Source: EDID
[ 1354.589] (WW) NVIDIA(GPU-0): 1920 x 1200 @ 60 Hz
[ 1354.589] (WW) NVIDIA(GPU-0): Pixel Clock : 154.00 MHz
[ 1354.589] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 1920, 1968
[ 1354.589] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 2000, 2080
[ 1354.589] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 1200, 1203
[ 1354.589] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 1209, 1235
[ 1354.589] (WW) NVIDIA(GPU-0): Sync Polarity : +H -V
[ 1354.589] (WW) NVIDIA(GPU-0): Mode is rejected: PixelClock (154.0 MHz) too high for
[ 1354.589] (WW) NVIDIA(GPU-0): Display Device (Max: 0.0 MHz).
[ 1354.589] (WW) NVIDIA(GPU-0): Mode "1920x1200_60" is invalid.
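For anyone who wants to check their own Xorg.0.log for the same symptom, the following is a minimal sketch that collects the "maximum pixel clock" value reported for each display device and lists the ones reported as 0.0 MHz. The function name and the regular expression are illustrative assumptions based only on the log lines quoted above; the sample text is a subset of those lines.

```python
import re

# Pattern derived from the log excerpts in this thread (an assumption, not an
# official log format specification). The device name may not contain ':' or
# brackets, so a match cannot run across neighbouring log entries.
PIXEL_CLOCK_RE = re.compile(
    r"NVIDIA\(GPU-\d+\): ([^:\[\]]+): (\d+(?:\.\d+)?) MHz maximum pixel clock"
)


def max_pixel_clocks(log_text):
    """Return {display device name: maximum pixel clock in MHz}."""
    return {name: float(mhz) for name, mhz in PIXEL_CLOCK_RE.findall(log_text)}


# A few lines copied from the 384.98 log above.
sample_log = """\
[ 130.462] (--) NVIDIA(GPU-0): CRT-0: 400.0 MHz maximum pixel clock
[ 130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): connected
[ 130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): 0.0 MHz maximum pixel clock
[ 130.493] (--) NVIDIA(GPU-0): DFP-2: 480.0 MHz maximum pixel clock
"""

clocks = max_pixel_clocks(sample_log)
suspicious = [name for name, mhz in clocks.items() if mhz == 0.0]
print(suspicious)
```

To run it against a real log, read the file contents (for example from /var/log/Xorg.0.log, if that is where your system keeps it) and pass the text to `max_pixel_clocks`.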
Note that with the previous driver (version 384.90), a valid value was detected for the 'maximum pixel clock':
[ 124.804] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): connected
[ 124.804] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): Internal TMDS
[ 124.804] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): 165.0 MHz maximum pixel clock
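The numbers in these logs can be cross-checked with a little arithmetic. The sketch below assumes the usual relation refresh rate = pixel clock / (HTotal x VTotal) and models the driver's check as a plain "mode clock must not exceed the device maximum" comparison, which is what the rejection message suggests; the function names are hypothetical and this is not the driver's actual code.

```python
# Values copied from the ModeDebug output for mode "1920x1200_60".
PIXEL_CLOCK_MHZ = 154.0
H_TOTAL = 2080
V_TOTAL = 1235


def refresh_rate_hz(pixel_clock_mhz, h_total, v_total):
    """Refresh rate implied by a pixel clock and the total frame size."""
    return pixel_clock_mhz * 1_000_000 / (h_total * v_total)


def mode_fits(pixel_clock_mhz, max_pixel_clock_mhz):
    """Assumed validation rule: the mode clock may not exceed the maximum."""
    return pixel_clock_mhz <= max_pixel_clock_mhz


rate = refresh_rate_hz(PIXEL_CLOCK_MHZ, H_TOTAL, V_TOTAL)
print(f"{rate:.2f} Hz")                  # 59.95 Hz, i.e. the "60 Hz" mode
print(mode_fits(PIXEL_CLOCK_MHZ, 0.0))    # False: maximum reported by 384.98
print(mode_fits(PIXEL_CLOCK_MHZ, 165.0))  # True: maximum reported by 384.90
```

With a maximum of 0.0 MHz no mode can pass such a check, which matches every resolution being rejected, while the 165.0 MHz value from 384.90 leaves room for the monitor's 154 MHz native mode.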
Since it did work with version 384.90 and stopped working with version 384.98, it looks like a regression in the nvidia driver. What is the best way to report the issue and get an update in elrepo?
Kind regards,
Danny Smit
On 03/01/18 15:45, Danny Smit wrote:
Since it did work with version 384.90 and stopped working with version 384.98, it looks like a regression in the nvidia driver. What is the best way to report the issue and get an update in elrepo?
Kind regards,
Danny Smit
Hi Danny,
The first step would be to confirm that it is a regression in the NVIDIA driver, and not an issue introduced by the elrepo packaging. The best way to do that is to uninstall the elrepo package and then install the driver directly from NVIDIA. If you find the same issue, I think we can conclude the problem is in the driver itself, so you will need to report it upstream to NVIDIA. Once fixed upstream, elrepo can package the fixed driver.
Please feel free to also file a bug report with elrepo for tracking purposes and include details of any upstream bug reports.
On Wed Jan 03 07:43:04 PM, Phil Perry wrote:
On CentOS 7 I'm running into an issue with the latest nvidia driver from elrepo: kmod-nvidia-384.98-1.el7_4.elrepo.x86_64 This driver version seem to introduce issue in detecting video modes when a monitor is connected using DVI. As soon as the machine attempts to start X, nothing happens and the monitor goes into sleep mode reporting that it has 'no signal'.
It is interesting because it occurs with several monitors, but only when they are connected through DVI. When using VGA it works like a charm. Also it did work without problems with the previous driver version (kmod-nvidia-384.90-1.el7_4.elrepo.x86_64), with the exact same setup.
In this case the issue occurs with an Nvidia NVS 315. I don't know if other nvidia cards are affected.
I can confirm that this happens with the driver downloaded from NVIDIA. I had to fall back to the .90 driver to get it to work for all my NVS 315s (with dual DVI) running on 7.4 / 6.9.
Cheers, Zube
On 03/01/18 20:14, Zube wrote:
I can confirm that this happens with the driver downloaded from NVIDIA. I had to fall back to the .90 driver to get it to work for all my NVS 315s (with dual DVI) running on 7.4 / 6.9.
Thank you for the confirmation. I found a similar report on Debian, also relating to the NVS 315 with the 384.98 driver, so I'm guessing this is hardware specific:
https://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg1570324.ht...
I couldn't find any reports upstream at nvidia so am unsure if they are aware of the issue. For reference, my GK208 [GeForce GT 730] in my test system is unaffected by the issue and is working fine with the 384.98 driver over DVI.
There is an updated version 387.34 short-lived branch driver available in the elrepo testing repository that you could test to see whether the issue has been fixed in this latest release (only available for el7 currently).
On Thu, January 4, 2018 05:22, Phil Perry wrote:
I couldn't find any reports upstream at nvidia so am unsure if they are aware of the issue. For reference, my GK208 [GeForce GT 730] in my test system is unaffected by the issue and is working fine with the 384.98 driver over DVI.
Keep in mind that an NVIDIA GeForce is a consumer graphics device, whereas the NVS series is a business product similar to the NVIDIA Quadro (on Windows they use different video drivers).
On Thu, Jan 4, 2018 at 5:22 AM, Phil Perry pperry@elrepo.org wrote:
On 03/01/18 20:14, Zube wrote:
I can confirm that this happens with the driver downloaded from NVIDIA. I had to fall back to the .90 driver to get it to work for all my NVS 315s (with dual DVI) running on 7.4 / 6.9.
Thanks. For reference, I did the same test. With the driver downloaded from NVIDIA the issue also occurs in my case.
I couldn't find any reports upstream at nvidia so am unsure if they are aware of the issue.
Where do you look for this at nvidia, in the community forums? Or is there another publicly available bug tracking system? (which I was unable to find)
There is an updated version 387.34 short-lived branch driver available in the elrepo testing repository that you could test to see whether the issue has been fixed in this latest release (only available for el7 currently).
Surprisingly, I could no longer reproduce the issue with that driver. So at first sight it seems to be fixed in the 387.34 driver.
I posted a question at nvidia anyway: https://devtalk.nvidia.com/default/topic/1028268/linux/nvidia-maximum-pixel-... (I'm not sure it is the right place, as I'm still having trouble finding my way around nvidia.com, so other suggestions or directions are welcome.)
Will short-lived drivers ever make it into the official elrepo repository? Or will only the long-lived drivers be included in the official repository?
On 04/01/18 09:12, Danny Smit wrote:
On Thu, Jan 4, 2018 at 5:22 AM, Phil Perry pperry@elrepo.org wrote:
On 03/01/18 20:14, Zube wrote:
I can confirm that this happens with the driver downloaded from NVIDIA. I had to fall back to the .90 driver to get it to work for all my NVS 315s (with dual DVI) running on 7.4 / 6.9.
Thanks. For reference, I did the same test. With the driver downloaded from NVIDIA the issues also occurs in my case.
I couldn't find any reports upstream at nvidia so am unsure if they are aware of the issue.
Where do you look for this at nvidia, in the community forums? Or is there another publicly available bug tracking system? (which I was unable to find)
I would start by posting in the forums as you did (below). I'm not aware of an official bug tracker either.
There is an updated version 387.34 short-lived branch driver available in the elrepo testing repository that you could test to see whether the issue has been fixed in this latest release (only available for el7 currently).
Surprisingly, I couldn't reproduce the issue anymore. Therefore at first sight it seems to be fixed in the 387.34 driver.
Excellent. It sounds like the issue may already have been fixed.
I posted a question at nvidia anyway: https://devtalk.nvidia.com/default/topic/1028268/linux/nvidia-maximum-pixel-... (although I'm not sure it is the right place, still having some issues finding my way at nvidia.com, other suggestions or directions are welcome of course)
Yes, I would have posted to the same place.
Will short-lived drivers ever make it into the official elrepo repository? Or will only the long-lived drivers be included in the official repository?
Normally elrepo only releases the long term branch for Enterprise Linux, on the assumption EL users will welcome the implied stability over more frequent and potentially buggy releases.
In this case I had built the current short lived release as a user requested it for compatibility with the latest CUDA. However, as it is a short term branch release, it will stay in the testing repository indefinitely and will not be promoted to the main repository. Once it's been superseded by a subsequent long term branch release I will likely just delete it from the testing repo. That said, it should be fine to use (at your own risk).
On Thu, Jan 4, 2018 at 9:40 PM, Phil Perry pperry@elrepo.org wrote:
Normally elrepo only releases the long term branch for Enterprise Linux, on the assumption EL users will welcome the implied stability over more frequent and potentially buggy releases.
In this case I had built the current short lived release as a user requested it for compatibility with the latest CUDA. However, as it is a short term branch release, it will stay in the testing repository indefinitely and will not be promoted to the main repository. Once it's been superseded by a subsequent long term branch release I will likely just delete it from the testing repo. That said, it should be fine to use (at your own risk).
Thanks,
I normally certainly prefer the stability of the long-lived releases. I noticed an update was just released of the long-lived release: 384.111. The notes for this release say:
Fixed a regression that prevented displays connected via some types of passive adapters (e.g. DMS-59 to VGA or DVI) from working correctly. The regression was introduced with driver version 384.98.
That sounds very much like my issue. I will verify next Monday if that solves it for me. Can I assume that new 384.111 release will make it into the main elrepo repository eventually?
On Sat Jan 06 12:27:22 PM, Danny Smit wrote:
I normally certainly prefer the stability of the long-lived releases. I noticed an update was just released of the long-lived release: 384.111. The notes for this release say:
Fixed a regression that prevented displays connected via some types of passive adapters (e.g. DMS-59 to VGA or DVI) from working correctly. The regression was introduced with driver version 384.98.
That sounds very much like my issue. I will verify next Monday if that solves it for me. Can I assume that new 384.111 release will make it into the main elrepo repository eventually?
I can confirm the 384.111 version fixes the DVI problems with the NVS-315.
Cheers, Zube
On 06/01/18 12:00, Zube wrote:
I can confirm the 384.111 version fixes the DVI problems with the NVS-315.
Cheers, Zube
I've just built and released 384.111 to the elrepo main repository, so it should show up on the mirrors shortly.
Thanks for the confirmation that it fixes the issue. I've just updated my local machine, which was unaffected by this issue, ran a few quick tests (glmark2) and can confirm 384.111 looks fine on my hardware.
Just keep in mind, this will be a package 'downgrade' from version 387.34 in the elrepo testing repository due to the lower version number.
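To illustrate why 384.111 counts as newer than 384.98 but older than 387.34, here is a minimal sketch that compares dotted version strings segment by segment as integers. It is a deliberate simplification for purely numeric versions like the ones in this thread, not RPM's full version comparison algorithm, and `version_key` is a hypothetical helper name.

```python
def version_key(version):
    """Split a purely numeric dotted version into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


# 111 > 98 when compared as integers, so 384.111 supersedes 384.98.
print(version_key("384.111") > version_key("384.98"))   # True

# 384 < 387, so moving from 387.34 to 384.111 is a 'downgrade'.
print(version_key("384.111") < version_key("387.34"))   # True
```

Read as decimal numbers, 384.111 would look smaller than 384.98, which is why the segment-wise view matters here.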
On Sat, Jan 6, 2018 at 2:27 PM, Phil Perry pperry@elrepo.org wrote:
I've just built and released 384.111 to the elrepo main repository, so it should show up on the mirrors shortly.
As expected, the 384.111 also solves the problem in my case.
Thanks for the support everyone.
Regards, Danny
On 08/01/18 10:22, Danny Smit wrote:
On Sat, Jan 6, 2018 at 2:27 PM, Phil Perry pperry@elrepo.org wrote:
I've just built and released 384.111 to the elrepo main repository, so it should show up on the mirrors shortly.
As expected, the 384.111 also solves the problem in my case.
Thanks for the support everyone.
Thanks for the feedback Danny - glad we got it resolved.