[CentOS] Nvidia maximum pixel clock issue in kmod-nvidia-384.98

Wed Jan 3 19:43:04 UTC 2018
Phil Perry <pperry at elrepo.org>

On 03/01/18 15:45, Danny Smit wrote:
> Hi everyone,
> 
> On CentOS 7 I'm running into an issue with the latest nvidia driver
> from elrepo: kmod-nvidia-384.98-1.el7_4.elrepo.x86_64
> This driver version seems to introduce an issue in detecting video modes
> when a monitor is connected using DVI. As soon as the machine attempts
> to start X, nothing happens and the monitor goes into sleep mode
> reporting that it has 'no signal'.
> 
> It is interesting because it occurs with several monitors, but only
> when they are connected through DVI. When using VGA it works like a
> charm.
> Also it did work without problems with the previous driver version
> (kmod-nvidia-384.90-1.el7_4.elrepo.x86_64), with the exact same setup.
> 
> In this case the issue occurs with an Nvidia NVS 315. I don't know if
> other nvidia cards are affected.
> 
> The following information from Xorg.0.log seems of interest:
> 
>      [   130.447] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:1:0:0
>      [   130.447] (--) NVIDIA(0):     CRT-0
>      [   130.447] (--) NVIDIA(0):     CRT-1
>      [   130.447] (--) NVIDIA(0):     DFP-0 (boot)
>      [   130.447] (--) NVIDIA(0):     DFP-1
>      [   130.447] (--) NVIDIA(0):     DFP-2
>      [   130.447] (--) NVIDIA(0):     DFP-3
>      [   130.448] (II) NVIDIA(0): NVIDIA GPU NVS 315 (GF119) at PCI:1:0:0 (GPU-0)
>      [   130.448] (--) NVIDIA(0): Memory: 1048576 kBytes
>      [   130.448] (--) NVIDIA(0): VideoBIOS: 75.19.88.00.0b
>      [   130.448] (II) NVIDIA(0): Detected PCI Express Link width: 16X
>      [   130.462] (--) NVIDIA(GPU-0): CRT-0: disconnected
>      [   130.462] (--) NVIDIA(GPU-0): CRT-0: 400.0 MHz maximum pixel clock
>      [   130.462] (--) NVIDIA(GPU-0):
>      [   130.467] (--) NVIDIA(GPU-0): CRT-1: disconnected
>      [   130.467] (--) NVIDIA(GPU-0): CRT-1: 400.0 MHz maximum pixel clock
>      [   130.467] (--) NVIDIA(GPU-0):
>      [   130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): connected
>      [   130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): Internal TMDS
>      [   130.488] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): 0.0 MHz maximum pixel clock
>      [   130.488] (--) NVIDIA(GPU-0):
>      [   130.493] (--) NVIDIA(GPU-0): DFP-1: disconnected
>      [   130.493] (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
>      [   130.493] (--) NVIDIA(GPU-0): DFP-1: 0.0 MHz maximum pixel clock
>      [   130.493] (--) NVIDIA(GPU-0):
>      [   130.493] (--) NVIDIA(GPU-0): DFP-2: disconnected
>      [   130.493] (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
>      [   130.493] (--) NVIDIA(GPU-0): DFP-2: 480.0 MHz maximum pixel clock
>      [   130.493] (--) NVIDIA(GPU-0):
>      [   130.493] (--) NVIDIA(GPU-0): DFP-3: disconnected
>      [   130.493] (--) NVIDIA(GPU-0): DFP-3: Internal DisplayPort
>      [   130.493] (--) NVIDIA(GPU-0): DFP-3: 480.0 MHz maximum pixel clock
>      [   130.493] (--) NVIDIA(GPU-0):
>      [   130.493] (EE) NVIDIA(GPU-0): Unable to add conservative default mode "nvidia-auto-select".
>      [   130.493] (EE) NVIDIA(GPU-0): Unable to add "nvidia-auto-select" mode to ModePool.
>      [   130.493] (==) NVIDIA(0):
>      [   130.493] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
>      [   130.493] (==) NVIDIA(0):     will be used as the requested mode.
>      [   130.493] (==) NVIDIA(0):
>      [   130.493] (WW) NVIDIA(0): No valid modes for "DFP-0:nvidia-auto-select"; removing.
>      [   130.493] (WW) NVIDIA(0):
>      [   130.493] (WW) NVIDIA(0): Unable to validate any modes; falling back to the default mode
>      [   130.493] (WW) NVIDIA(0):     "nvidia-auto-select".
>      [   130.493] (WW) NVIDIA(0):
>      [   130.493] (WW) NVIDIA(0): No valid modes for "DFP-0:nvidia-auto-select"; removing.
>      [   130.493] (EE) NVIDIA(0): Unable to use default mode "nvidia-auto-select".
>      [   130.493] (EE) NVIDIA(0): Failing initialization of X screen 0
> 
> 
> The log shows "0.0 MHz maximum pixel clock" for the DVI connections.
> When enabling ModeDebug in the xorg.conf it shows that all of the
> resolutions are rejected because of an incorrect maximum pixel clock:
> 
>      [  1354.589] (II) NVIDIA(GPU-0):
>      [  1354.589] (II) NVIDIA(GPU-0): --- Building ModePool for Philips 240S4 (DFP-0) ---
>      [  1354.589] (WW) NVIDIA(GPU-0):   Validating Mode "1920x1200_60":
>      [  1354.589] (WW) NVIDIA(GPU-0):     Mode Source: EDID
>      [  1354.589] (WW) NVIDIA(GPU-0):     1920 x 1200 @ 60 Hz
>      [  1354.589] (WW) NVIDIA(GPU-0):       Pixel Clock      : 154.00 MHz
>      [  1354.589] (WW) NVIDIA(GPU-0):       HRes, HSyncStart : 1920, 1968
>      [  1354.589] (WW) NVIDIA(GPU-0):       HSyncEnd, HTotal : 2000, 2080
>      [  1354.589] (WW) NVIDIA(GPU-0):       VRes, VSyncStart : 1200, 1203
>      [  1354.589] (WW) NVIDIA(GPU-0):       VSyncEnd, VTotal : 1209, 1235
>      [  1354.589] (WW) NVIDIA(GPU-0):       Sync Polarity    : +H -V
>      [  1354.589] (WW) NVIDIA(GPU-0):     Mode is rejected: PixelClock (154.0 MHz) too high for
>      [  1354.589] (WW) NVIDIA(GPU-0):     Display Device (Max: 0.0 MHz).
>      [  1354.589] (WW) NVIDIA(GPU-0):     Mode "1920x1200_60" is invalid.
> 
> 
> Note that with the previous driver (version 384.90), a valid value was
> detected for the 'maximum pixel clock':
> 
>      [   124.804] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): connected
>      [   124.804] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): Internal TMDS
>      [   124.804] (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): 165.0 MHz maximum pixel clock
> 
> 
> Since it did work with version 384.90 and stopped working with version
> 384.98, it looks like a regression in the nvidia driver. What is the
> best way to report the issue and get an update in elrepo?
> 
> 
> Kind regards,
> 
> Danny Smit

Hi Danny,

The first step would be to confirm that it is a regression in the NVIDIA
driver, and not an issue introduced by the elrepo packaging. The best way
to do that is to uninstall the elrepo package and then install the driver
directly from NVIDIA. Assuming you find the same issue, I think we can
conclude the issue is with the driver, so you will need to report it
upstream to NVIDIA. Once fixed upstream, elrepo can package the fixed
driver.
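
To make the comparison between the two installs less error-prone, something
like the following could be run against the Xorg log after each test. It is
only a rough sketch: the function names are made up for illustration, and it
assumes the log contains unwrapped "maximum pixel clock" lines in the same
format as the ones quoted above (the real log file, not a mail-wrapped copy).

```python
import re

# Matches lines such as:
#   (--) NVIDIA(GPU-0): Philips 240S4 (DFP-0): 165.0 MHz maximum pixel clock
#   (--) NVIDIA(GPU-0): CRT-0: 400.0 MHz maximum pixel clock
PIXEL_CLOCK_RE = re.compile(
    r"NVIDIA\(GPU-\d+\):\s+(?P<device>.+?):\s+"
    r"(?P<mhz>\d+(?:\.\d+)?)\s+MHz\s+maximum pixel clock"
)


def max_pixel_clocks(log_text):
    """Return {display device: maximum pixel clock in MHz} found in the log."""
    clocks = {}
    for line in log_text.splitlines():
        match = PIXEL_CLOCK_RE.search(line)
        if match:
            clocks[match.group("device")] = float(match.group("mhz"))
    return clocks


def zero_clock_devices(log_text):
    """Return the devices that report a 0.0 MHz maximum pixel clock."""
    clocks = max_pixel_clocks(log_text)
    return sorted(device for device, mhz in clocks.items() if mhz == 0.0)
```

Reading the log with open("/var/log/Xorg.0.log").read() (the usual location
on CentOS 7, but adjust as needed) and passing the text to
zero_clock_devices() should list the same devices under both installs if the
regression is in the driver itself rather than in the packaging.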

Please feel free to also file a bug report with elrepo for tracking 
purposes and include details of any upstream bug reports.