Hi,
I have a dual-core Athlon server that gains 1 day every 2 days when time synchronization is off. Even with ntpd running, the clock is not under control; I have to run 'ntpdate' from a very frequent cron job to keep it in check. This creates big problems, since winbind eventually stops working and my users can't access their data.
Any ideas?
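To put the reported drift in perspective, a quick back-of-the-envelope calculation helps. The sketch below is only illustrative: the function name drift_ppm is made up for this example, and the 500 ppm figure is the maximum rate at which ntpd will slew a clock, which is why ntpd on its own cannot keep up with a drift of this size.

```python
# Rough drift arithmetic for "gaining 1 day every 2 days".
# drift_ppm is a hypothetical helper name used only in this sketch.

NTPD_MAX_SLEW_PPM = 500.0  # ntpd slews the clock by at most 500 ppm
DAY = 86_400               # seconds in a day


def drift_ppm(gained_seconds, elapsed_seconds):
    """Return the clock's frequency error in parts per million."""
    return gained_seconds / elapsed_seconds * 1_000_000


ppm = drift_ppm(gained_seconds=1 * DAY, elapsed_seconds=2 * DAY)
seconds_gained_per_minute = ppm / 1_000_000 * 60

print(f"drift: {ppm:.0f} ppm")
print(f"gain per real minute: {seconds_gained_per_minute:.0f} s")
print(f"times over ntpd's slew limit: {ppm / NTPD_MAX_SLEW_PPM:.0f}")
```

A clock that gains about half a minute every minute is far outside what gradual slewing can correct, which matches the need for frequent 'ntpdate' steps.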
As I remember, this is an APIC/ACPI problem (I constantly confuse the two). Sometimes a BIOS update fixes it. Sometimes you have to add a kernel option in grub, such as noapic or acpi=off. I'm sure someone here will correct me, but that should point you in the right general direction.
-- "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." Benjamin Franklin, 1775
Quoting Ugo Bellavance ugob@camo-route.com:
Hi,
I have a dual core athlon server and it is gaining 1 day every 2 days w/o time sync. Even with ntpd running, the time is not under control. I must put a very frequent cronjob of 'ntpdate' to keep the time under control. This creates big problems since winbind eventually stops working so my users can't access their data.
Running under VMware by any chance? If yes read next paragraph, if not skip to the one after it (but you might still read both).
In current versions of VMware (for example ESX 2.5.x), 2.6 kernels are not yet officially supported, and what you describe is one of the known problems with 2.6 kernels under VMware. Add the "clock=pit" kernel option (in grub.conf or lilo.conf, whichever boot loader you use), don't use NTP to sync time, install vmware-tools in each guest, and enable time synchronization in them (it is off by default). That should keep time in your guests under some control.
The problem arises mostly because 2.6 kernels are much stricter about watching the frequency source selected for the clock, and because they raised the frequency of timer interrupts requested from it from 100 Hz to 1000 Hz (one global + one per CPU, or something like that). This frequency is a compile-time kernel option (it is hard-coded into the kernel and can't be changed once the kernel is compiled). Furthermore, the number of interrupts grows with the number of processor cores: if each of your guests is configured with two virtual CPUs, that is 3000 interrupts per second per 2.6 guest, compared to only 300 per 2.4 guest.
With many guests running on a busy box, VMware might not be able to generate all the virtual interrupts the 2.6 guests need, and you get the clock problems you are having. The 2.6 timekeeping code attempts to correct for missed or skipped interrupts, but under VMware it tends to overcorrect, and your clock starts gaining time fast, like you described. This is a classic problem with current versions of VMware and guests running a 2.6 kernel. It should be corrected in VMware ESX 3.x (which should also have official support for 2.6 kernels).
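The interrupt arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rough "one global + one per CPU" estimate, not a description of what any particular kernel does; the function name and the count of ten guests are made up for the example.

```python
# Rough estimate of timer interrupts a guest requests per second,
# using the "one global + one per CPU" rule of thumb from the post.
# timer_interrupts_per_second is a hypothetical name for this sketch.

def timer_interrupts_per_second(hz, cpus):
    """hz: kernel tick rate (100 on 2.4, 1000 on 2.6); cpus: virtual CPUs."""
    return hz * (1 + cpus)


per_guest_26 = timer_interrupts_per_second(hz=1000, cpus=2)
per_guest_24 = timer_interrupts_per_second(hz=100, cpus=2)

# Assumed number of guests on one host, purely for illustration.
guests = 10

print("2.6 guest:", per_guest_26, "interrupts/s")
print("2.4 guest:", per_guest_24, "interrupts/s")
print("host total for 2.6 guests:", guests * per_guest_26, "interrupts/s")
```

The tenfold gap per guest is what makes a busy host fall behind on delivering virtual timer interrupts.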
If you are not running VMware, you might still experiment with the clock option (it selects the frequency source the kernel uses to keep track of time). The default frequency source obviously doesn't work well for you. Available sources are pit, tsc, cyclone and pmtmr; however, not all of them are available on all motherboards (you'd need to check what kind of timers your motherboard has). If the specified source is not available (your motherboard doesn't have that hardware), the kernel falls back to pit (or whatever the kernel was patched to use by default). You may also try the hpet=disable kernel option (with or without the clock option), which disables the HPET (if present on the motherboard) and falls back to the real PIT.
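When experimenting with different clock sources, the kernel line in grub.conf has to be edited repeatedly. A minimal Python sketch of that edit follows; the helper name set_kernel_option and the sample kernel line (path and root device) are hypothetical, and only the clock= and hpet=disable options come from this thread.

```python
# Sketch: set or replace one boot option on a grub.conf "kernel" line.
# set_kernel_option and the sample line below are hypothetical.

def set_kernel_option(kernel_line, option):
    """Return kernel_line with option added, replacing any same-named option."""
    # Keep the original indentation of the line.
    indent = kernel_line[:len(kernel_line) - len(kernel_line.lstrip())]
    key = option.split("=", 1)[0]
    # Drop any existing option with the same name (e.g. an older clock=...).
    words = [w for w in kernel_line.split() if w.split("=", 1)[0] != key]
    words.append(option)
    return indent + " ".join(words)


# Hypothetical kernel line; a real one will have a different path and root.
line = "\tkernel /vmlinuz ro root=/dev/sda1 clock=pit"

line = set_kernel_option(line, "clock=pmtmr")   # try another clock source
line = set_kernel_option(line, "hpet=disable")  # optionally disable HPET
print(line)
```

A new option only takes effect after a reboot, so it is worth trying one source at a time and watching the drift before moving on to the next.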
Even if not using VMware, you might find this document a good read:
http://www.vmware.com/pdf/vmware_timekeeping.pdf
It describes the timer hardware available in average PCs (PC Timer Hardware chapter) and the various clock=xxx options (Timekeeping in Specific Operating Systems chapter, Linux section).
---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
Aleksandar Milivojevic wrote:
If you are not running VMware, you might still experiment with the clock option (it selects the frequency source the kernel uses to keep track of time). The default frequency source obviously doesn't work well for you. Available sources are pit, tsc, cyclone and pmtmr; however, not all of them are available on all motherboards (you'd need to check what kind of timers your motherboard has). If the specified source is not available (your motherboard doesn't have that hardware), the kernel falls back to pit (or whatever the kernel was patched to use by default). You may also try the hpet=disable kernel option (with or without the clock option), which disables the HPET (if present on the motherboard) and falls back to the real PIT.
pmtmr did it! :)
Thanks!