On Thu, 2011-04-14 at 13:28 +0200, Simon Matter wrote:
On 4/14/2011 6:47 AM, Johnny Hughes wrote:
Is it really true that the time is working perfectly with one of the other kernels (the older ones)?
Johnny, yes. As long as I run the older 5.5 kernel, my time is perfect.
All clients can get time from this machine with no issues. As soon as I run the new kernel, or the Plus kernel for that matter, the time goes downhill ("uphill", actually).
To answer the previous question: I do have the HW clock set to UTC.
Everything is stock from the initial install of the package.
Did you check dmesg to see which timer is being used (I think it can also be seen somewhere in /proc, but I don't remember where)? If it's hpet, you could try disabling it. That was 'hpet=disable' for i686 and 'nohpet' for x86_64; I don't know how it is with current kernels.
Simon
Forgive me if I've missed a later post but it looked like this thread was stagnant...
You may have something here, Simon. I was thinking about your suggestion that it could be a timer issue. I'm wondering whether the default clocksource or some related timer kernel parameter changed between 2.6.18-194.17.4.el5 (5.5) and 2.6.18-238.5.1.el5 (5.6).
Timer-related issues could very well account for this large, inconsistent NTP drift as well as Florin Andrei's "bizarre" tar, scp, and NTP issues in the "[CentOS] bizarre system slowness" thread. System interrupts are based on the clocksource chosen by (or configured in) the kernel. Any service or facility that uses these interrupts could be experiencing problems.
Can anyone on the list confirm whether or not timer-related kernel parameters have changed in 5.6? I don't have source handy and I'm going out the door in minutes.
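For anyone who still has both kernels installed, one rough way to check this without the source is to diff the timer-related lines of the two config files under /boot. This is only a sketch: the function names are made up for illustration, the grep pattern is my guess at which options matter, and it assumes the configs sit at /boot/config-<version> as they do on a stock install. It only compares build-time options, so a default that changed inside the code itself would not show up.

```shell
#!/bin/sh
# Sketch: compare timer-related build options between two kernel configs.
# The option list in the pattern is an assumption, not a definitive set.

timer_opts() {
    # Print the timer-related CONFIG lines (set or "is not set"), sorted.
    grep -E '^(# )?CONFIG[A-Z0-9_]*(HPET|TSC|PM_TIMER|NO_HZ|HIGH_RES|CLOCKSOURCE|_HZ)' "$1" | sort
}

compare_timer_opts() {
    # $1 = old config file, $2 = new config file
    old_tmp=$(mktemp) && new_tmp=$(mktemp) || return 2
    timer_opts "$1" > "$old_tmp"
    timer_opts "$2" > "$new_tmp"
    # diff exits 0 when the two lists match and 1 when they differ.
    status=0
    diff "$old_tmp" "$new_tmp" || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$status"
}

# Only run when called as a script with two config files as arguments.
if [ "$#" -eq 2 ]; then
    compare_timer_opts "$1" "$2"
fi
```

Saved as, say, timer-diff.sh (a hypothetical name), it would be run as: sh timer-diff.sh /boot/config-2.6.18-194.17.4.el5 /boot/config-2.6.18-238.5.1.el5. No output means the selected options are identical in both builds.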
Reading up on kernel timer options, I came across these articles.
# Discusses mis-detected timer frequency
9.2.4.2.7. Kernel 2.6 Mis-Detecting CPU TSC Frequency
http://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.4.2.7.

# Describes ntpd instability from some time sources
# Includes data and graphs from detailed study
http://www.ep.ph.bham.ac.uk/general/support/adjtimex.html
I checked clock sources on a few systems under my control to see what came up. None are experiencing this problem. The CentOS and FC12 machines are isolated from the Internet, while the FC14 laptop connects. My sample CentOS 5.5 and 5.6 systems are different hardware platforms. The 5.6 box doesn't have the hpet timer available, so it may simply not be susceptible to this problem. Tomorrow I'll be updating the 5.5 sample, which does have hpet available, to 5.6, so I should know more then.
# Used these to get available and current clocksource:
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
# CentOS 5.5:
Available: acpi_pm jiffies hpet tsc pit
Current: tsc

# CentOS 5.6:
Available: acpi_pm jiffies tsc pit
Current: tsc

# Fedora 12:
Available: tsc hpet acpi_pm
Current: tsc

# Fedora 14: Using hpet
Available: hpet acpi_pm
Current: hpet
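
If others on the list want to post the same data for comparison, a small wrapper around those two cat commands keeps the output in the layout used above. It is a sketch only: the function name is mine, and the optional directory argument exists just so it can be pointed at a saved copy of the files rather than the live sysfs path.

```shell
#!/bin/sh
# Sketch: report available and current clocksource, and flag hpet.
# Defaults to the live sysfs path; pass a directory to read saved copies.

clocksource_report() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    if [ ! -r "$dir/current_clocksource" ]; then
        echo "no clocksource entries under $dir" >&2
        return 1
    fi
    avail=$(cat "$dir/available_clocksource")
    echo "# $(uname -r):"
    echo "Available: $avail"
    echo "Current: $(cat "$dir/current_clocksource")"
    # hpet is the timer under suspicion, so say whether it is on offer.
    case " $avail " in
        *" hpet "*) echo "hpet: available" ;;
        *)          echo "hpet: not available" ;;
    esac
}
```

After sourcing the file, running clocksource_report with no arguments on the affected box prints its kernel version and clocksources. On these 2.6.18 kernels, dmesg | grep -i clocksource should also show which clocksource was installed at boot, which is the dmesg check Simon suggested.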