I was playing with and comparing frequency scaling on AMD and Intel CPUs yesterday, and there seem to be big differences between the two, plus some gotchas. This is all on CentOS 5.2 with the latest Xen kernels (which are supposed to be powersaving-enabled since 5.2).
AMD: Once I get the AMD CPU to use the ondemand governor, it works very well and very efficiently. But this is not set by default; one has to set ondemand explicitly in /etc/sysconfig/cpuspeed and run cpuspeed on bootup. Otherwise it only knows about userspace and performance and defaults to performance.
Intel: On the other hand, ondemand is on for Intel CPUs automatically, but it doesn't seem to work. Whether I run cpuspeed or not, the current frequency is shown as 2000000. This is the scaling_min_freq for both CPUs I checked. The scaling_max_freq is 2.333 GHz on one and 2.5 GHz on the other. One is a Xeon Dual Core, one a Xeon Quad Core.
By "doesn't seem to work" I mean that the Intel CPUs don't scale on demand as they should. I tested by gzipping and gunzipping a 4 GB image file and used top to observe CPU utilization. With a dual core the idle percentage stays around 50% for a while and only goes below this threshold near the end of the operation. I deduce that gzip can use only one core, and that only when it comes to writing to disk or using other external tools can more CPU power be used, because that work is handed to the other core. Same observation with the quad core (at 75%). There is no difference between AMD and Intel in this respect, but the task seems to run a bit more efficiently on the AMD CPU: it manages to bring idle down to 0%, at least for short periods, while this is almost impossible to observe with the Intel CPUs. The timing also shows that the AMD is the fastest one. (The AMD, a very new low-voltage X2, also runs at 2500 max.)
My question: why don't the Intel CPUs scale up on demand? Could there be a bug in the driver such that it measures overall utilization (which is at 50% most of the time) rather than single-core utilization, thus never reaching the threshold for scaling up? up_threshold is at 80 for both CPUs. (cat /sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold)
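For anyone who wants to compare the same values across machines, here is a minimal sketch that prints the governor, the frequencies and the ondemand threshold for every CPU. The function name show_cpufreq is made up for this example, and it assumes the usual cpufreq sysfs file names (scaling_governor, cpuinfo_cur_freq, scaling_cur_freq, scaling_min_freq, scaling_max_freq) plus the ondemand/up_threshold location quoted above; files that are missing or unreadable are shown as "?".

```shell
# show_cpufreq [BASE_DIR]
# Print governor and frequencies (kHz) for every CPU that exposes cpufreq.
# BASE_DIR defaults to the sysfs CPU directory; it is a parameter only so
# the function can be tried against a copy of the tree.
show_cpufreq() {
    base="${1:-/sys/devices/system/cpu}"
    for dir in "$base"/cpu[0-9]*/cpufreq; do
        # Skip CPUs without a cpufreq directory (or an unmatched glob).
        [ -d "$dir" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        gov=$(cat "$dir/scaling_governor" 2>/dev/null)
        hw=$(cat "$dir/cpuinfo_cur_freq" 2>/dev/null)   # usually root-only
        cur=$(cat "$dir/scaling_cur_freq" 2>/dev/null)
        min=$(cat "$dir/scaling_min_freq" 2>/dev/null)
        max=$(cat "$dir/scaling_max_freq" 2>/dev/null)
        thr=$(cat "$dir/ondemand/up_threshold" 2>/dev/null)
        echo "$cpu governor=${gov:-?} cpuinfo_cur=${hw:-?} scaling_cur=${cur:-?} min=${min:-?} max=${max:-?} up_threshold=${thr:-?}"
    done
}
```

Running show_cpufreq once at idle and once under load (as root, so cpuinfo_cur_freq is readable) should show whether the frequency actually moves.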
I have a somewhat related question. That very new AMD CPU mentioned above was not recognized by CentOS 5.2, and the current frequency was shown as 800000 (instead of 2500000), although it was running at full speed. The latest kernel corrected this. The CPU is still shown as unknown, but the frequency is now calculated correctly and thus frequency scaling works now (it didn't work when it was miscalculated as 800000). On the other hand, I have an older low-voltage AMD CPU (probably about 2 years on the market) that is recognized as X2 3800+, but frequency scaling fails because the current speed is miscalculated as 800 MHz as well. Is there anything I can do about that? Where could I check whether this CPU should be fully supported and frequency scaling should work? (I'm not sure, but I think it may have actually worked when it was running in a different motherboard.)
Kai
Kai Schaetzl wrote:
I've been playing and comparing frequency scaling between AMD and Intel CPUs yesterday and there seem to be great differences between AMD and Intel and some gotchas. This is all on CentOS 5.2 with latest Xen kernels (which are supposed to be powersaving-enabled since 5.2).
I was looking at cpu frequency scaling a week or two ago on an Intel Q6600 quad core cpu. Rather than repeat myself, there are some numbers in this thread (post #6 amongst others):
http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15484&forum=3...
Bottom line - the power saving between having frequency scaling enabled or not was surprisingly small (only 2-3W). It would appear that these processors are already fairly efficient at idle and scaling down the frequency adds little to the overall savings that may be obtained.
Ned Slider wrote on Sun, 03 Aug 2008 15:09:39 +0100:
http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15484&forum=3...
Thanks for the URL, see below!
Bottom line - the power saving between having frequency scaling enabled or not was surprisingly small (only 2-3W). It would appear that these processors are already fairly efficient at idle and scaling down the frequency adds little to the overall savings that may be obtained.
I disagree about the reason. I think they are actually not so efficient. At least not if I compare to a low-voltage CPU. 105 W is a lot; the latest AMD quad core low-voltage CPUs are at 50W. Did you check core temperature in the two scaling states? It makes a huge difference for me on the AMD (which is allowed to drop from 2500 to 1000). It drops from an already low value (30 and 22 Celsius) by more than 10 degrees. (The second core always shows the lowest temperature (puzzle?) and it goes down to 6-8 (!) Celsius in idle state at 1000.) I think this will also result in some more substantial savings in power consumption. Even if not, a substantially lower temperature like this is good for a long life of all parts anyway.
I read that thread and am puzzled by acpi-cpufreq being loaded on your machine. If I modprobe it I get an error "device busy". Which makes sense to me as cpufreq_ondemand (which loaded automatically) should have already taken over. I see that behavior on all machines, no matter if Intel or AMD.
From my research yesterday it also looks like the use of acpi-cpufreq is somewhat "older" and should not be necessary at all for newer CPUs. So, it should be cpufreq_ondemand alone that does the scaling on your machine. Can you confirm that? I also wonder if your machine actually scales up. You listed the output in low/idle state. As I wrote, I get the same, just at another level (they probably think Xeons will be active all the time anyway, so they don't allow them to drop as much). Did you check that the frequency actually goes up to 2400 under load?
Kai
Kai Schaetzl wrote:
Ned Slider wrote on Sun, 03 Aug 2008 15:09:39 +0100:
http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15484&forum=3...
Thanks for the URL, see below!
Bottom line - the power saving between having frequency scaling enabled or not was surprisingly small (only 2-3W). It would appear that these processors are already fairly efficient at idle and scaling down the frequency adds little to the overall savings that may be obtained.
I disagree about the reason. I think they are actually not so efficient. At least not if I compare to a low-voltage CPU. 105 W is a lot, latest AMD quad core low-voltage are at 50W. Did you check core temperature in the two scaling states? It makes a huge difference for me on the AMD (which is allowed to drop from 2500 to 1000). It drops from an already low value (30 and 22 Celsius) by more than 10 degrees. The second core always shows the lowest temperature (puzzle?) and it goes down to 6-8 (!) Celsius in idle state with 1000.) I think this will also result on some more substantial savings in Watt consumption. Even, if not, a substantially lower temperature like this is good for a long life of all parts, anyway.
I see no difference on temps reported by coretemp for cpuspeed enabled/disabled. I *do* see a huge drop in temps between load and idle regardless of cpuspeed.
I read that thread and am puzzled by acpi-cpufreq being loaded on your machine. If I modprobe it I get an error "device busy". Which makes sense to me as cpufreq_ondemand (which loaded automatically) should have already taken over. I see that behavior on all machines, no matter if Intel or AMD.
From my research yesterday it also looks like use of acpi-cpufreq is somewhat "older" and should not be necessary at all for newer CPUs. So, it should be cpufreq_ondemand alone that does the scaling on your machine. Can you confirm that?
I'm not sure of the function of acpi-cpufreq. I do know that it doesn't scale back *without* cpufreq_ondemand (cpuspeed). acpi-cpufreq was autoloaded in response to enabling C1E and EIST features in the BIOS (which one is responsible I don't know as I enabled both together).
I also wonder if your machine actually scales up. You listed the output in low/idle state. As I wrote I get the same, just at another level (they probably think Xeon's will be active all the time, anyway, so they allow them to drop not so much). Did you check that the frequency actually goes up to 2400 under load?
Yes, the frequency does scale up under load. I tested by launching a scientific app that loads all 4 cores at 100%. As fast as I could manually start the app and check the freq, it reported at 2.4GHz. I don't know at what point or under what load it will scale back up, and if scaling is done on a core by core basis, but it does scale back up under full load.
Ned Slider wrote on Sun, 03 Aug 2008 16:16:25 +0100:
acpi-cpufreq was autoloaded in response to enabling C1E and EIST features in the BIOS (which one is responsible I don't know as I enabled both together).
Ah, it must have been enabled by C1E. I don't know if I have that or can enable it in the BIOS. But it should have nothing to do with the scaling.
Yes, the frequency does scale up under load. I tested by launching a scientific app that loads all 4 cores at 100%. As fast as I could manually start the app and check the freq, it reported at 2.4GHz. I don't know at what point or under what load it will scale back up, and if scaling is done on a core by core basis, but it does scale back up under full load.
Thanks for the confirmation. I have now tested with two and four gzip processes in parallel. That sends the CPU to 0% idle, but doesn't change the behavior. The quad core Xeon shows as running at 2000, although it should go up to 2500. And the dual core shows 2333 for current frequency and 2000 for current scaling frequency, which can't be right. If I recall correctly, with the original CentOS Xen hypervisor kernel it showed both as 2000 all the time. I had not noticed that the dual core machine had the Xen 3.2 kernel installed but was not booting with it. I just changed that. So, at least part of the behavior is Xen-kernel related.
5 minutes later: oh, yes, it does! Now I got it to 0% idle and the current frequency jumped to 2333000 (although the current scaling frequency was still shown as 2000000; on AMDs both figures rise). Looks like a clear bug in the centrino kernel module to me. It only scales up if the overall threshold is reached.
Kai
Kai Schaetzl wrote on Sun, 03 Aug 2008 17:59:49 +0200:
5 minutes later: oh, yes, it does! Now I got it to 0% idle and current frequency jumped to 2333000 (although current scaling frequency was still shown at 2000000, on AMDs both figures rise). Looks like a clear bug in the centrino kernel module to me. It scales only up if the overall threshold is reached.
Lowering the threshold in sysconfig/cpuspeed from 80 to 50 makes it work, e.g. a single gzip task filling one CPU is then able to scale the frequency up.
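As a rough sketch, the relevant part of /etc/sysconfig/cpuspeed would then look something like the excerpt below. The variable names GOVERNOR and UP_THRESHOLD are assumed here to match the stock file shipped with the cpuspeed package; check the comments in your own copy before relying on them.

```shell
# /etc/sysconfig/cpuspeed (excerpt, variable names assumed from the stock file)

# Force the ondemand governor instead of relying on the default.
GOVERNOR=ondemand

# Scale up once load exceeds 50% instead of 80%.
UP_THRESHOLD=50
```

The cpuspeed service has to be restarted for the change to take effect, and the resulting value can be checked in the up_threshold file under /sys/devices/system/cpu/cpu0/cpufreq/ondemand/.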
Kai
Kai Schaetzl wrote on Sun, 03 Aug 2008 16:57:20 +0200:
I disagree about the reason. I think they are actually not so efficient. At least not if I compare to a low-voltage CPU.
Just checked how much power the machine with that AMD 4850e CPU draws under various conditions. There are *huge* differences. I measured the power consumption of the whole machine. I don't know what "at wall" means. Did you measure the power consumption of the CPU alone, or does "at wall" mean the same as what I did?
Here are the figures; considering this is for the whole machine, I think they're quite good.
idle, 1000 MHz: 76W
idle, 2500 MHz: 98W
1 core under load: 110W
2 cores under load: 120W
So, that's not just the processor, it's the whole machine. It takes into account the power drain from the processor plus (probably) faster fans plus any other drain from memory/chipset that may be higher under load.
Kai
Kai Schaetzl wrote:
Kai Schaetzl wrote on Sun, 03 Aug 2008 16:57:20 +0200:
I disagree about the reason. I think they are actually not so efficient. At least not if I compare to a low-voltage CPU.
Just checked how much that AMD 4850e CPU drains under various conditions. There are *huge* differences. I checked whole power consumption of the machine. I don't know what "at wall" means. Did you measure the power consumption of the cpu alone or does "at wall" mean the same as I did?
Yes, sounds like you did the same as I did. I meant I plugged one of those watt meters into the power outlet at the wall and plugged the machine into that, so you're measuring the current draw "at the wall" or outlet. What this doesn't do is take into account how efficient (or inefficient) your power supply may be - if it's drawing 100W from the wall and is 80% efficient, then your system is only actually pulling 80W, the other 20W is heat dissipated from the PSU.
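To put numbers on the PSU point, here is a small sketch of the arithmetic. The function names are made up for this example, and the efficiency figure is whatever you assume for your own power supply; the 80% above was only an illustration, not a measured value.

```shell
# dc_watts WALL_WATTS EFFICIENCY_PERCENT
# Estimate the power actually delivered to the components from a reading
# taken at the wall (integer arithmetic, rounds down).
dc_watts() {
    echo $(( $1 * $2 / 100 ))
}

# psu_loss_watts WALL_WATTS EFFICIENCY_PERCENT
# The remainder, dissipated as heat in the power supply.
psu_loss_watts() {
    echo $(( $1 - $1 * $2 / 100 ))
}
```

With the example above, dc_watts 100 80 prints 80 and psu_loss_watts 100 80 prints 20.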
Here are the figures, considering this is for the whole machine I think it's quite good.
idle: 1000 MHz: 76W 2500 MHz: 98W
That's a nice little saving! Like I said previously, I only saw a 2-3W saving at idle between full clock rate (2400MHz; 107-108W) and with freq scaling active (1600MHz; 105W), which would maybe imply that my system already has an efficient halt state, and that throttling back (freq scaling) gives little further gain. Obviously that's not the case with your system.
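For comparison, the two sets of idle readings can be reduced to a saving in watts and as a share of the full-speed figure. This is only a sketch of the arithmetic on the whole-machine numbers quoted in this thread; the function names are made up for the example.

```shell
# saving_watts FULL_SPEED_W SCALED_W
# Watts saved at idle by letting the CPU scale down.
saving_watts() {
    echo $(( $1 - $2 ))
}

# saving_percent FULL_SPEED_W SCALED_W
# The same saving as a whole-number percentage of the full-speed reading
# (integer division, rounds down).
saving_percent() {
    echo $(( ($1 - $2) * 100 / $1 ))
}
```

saving_watts 98 76 gives 22 for the AMD box, while saving_watts 107 105 gives 2 for the Q6600 system.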
Were you able to observe any drops in VCore voltage between load, idle (2500MHz) and 1000MHz with lm_sensors?
1 core under load: 110W 2 core under load: 120W
So, that's not just the processor, it's the whole machine. It takes into account the powerdrain from the processor plus (probably) faster fans plus any other drain from memory/chipset that may be higher underload.
Likewise :)
Ned Slider wrote on Mon, 04 Aug 2008 14:51:41 +0100:
Were you able to observe any drops in VCore voltage between load, idle (2500MHz) and 1000MHz with lm_sensors?
I can't get any other sensor data than the core temperatures.
Kai
Kai Schaetzl wrote on Sun, 03 Aug 2008 14:31:19 +0200:
I have a somewhat related question. That very new AMD CPU mentioned above was not recognized by CentOS 5.2 and the current frequency was shown as 800000 (instead of 2500000), although it was running in full speed. The latest kernel corrected this.
Actually, not the latest kernel. The CentOS Xen boot (hypervisor) kernel /xen.gz-2.6.18-92.1.6.el5 (and maybe earlier ones) calculates the frequency correctly; the Xen 3.2 boot kernel (xen.gz-3.2) from the Xen 3.2 package offered at xen.org does not. Might there be a kernel parameter or other measure that could correct this? (I don't want to recompile any kernels.) I wondered if it is also the source of the inability to scale up on demand with the Intel CPUs, and whether the 2000000 reported is actually wrong. But there is no change between the two boot kernels.
Kai
Kai Schaetzl wrote on Sun, 03 Aug 2008 16:31:19 +0200:
Actually, not the latest kernel. The CentOS xen boot (hypervisor) kernel /xen.gz-2.6.18-92.1.6.el5 (and maybe earlier) ones calculates the frequency correct, the Xen 3.2 boot kernel (xen.gz-3.2) from the Xen 3.2 package offered at xen.org does not. Might there be a kernel parameter or other measure that could correct this?
After some more research I have found the correct incantation for this. The CentOS/RH kernels seem to have this enabled by default; the kernels from xen.org have to be "enabled" with a command-line option to the hypervisor kernel (not the CentOS kernel):
kernel /xen.gz-3.2 cpufreq=dom0-kernel
I'm now getting correct readings of the frequencies. And scaling up on demand works. At least in dom0. I'm not so sure if it works for domUs as well. A quick test showed no ondemand scaling in dom0 when running a CPU-intensive task in a domU. Does anyone have more experience with this?
Kai
On Sunday, August 03, 2008 at 8:31 AM, Kai Schaetzl wrote:
...I have an older low-voltage AMD CPU (probably about 2 years on the market) that is recognized as X2 3800+ but frequency scaling fails because it miscalculates the current speed to 800 MHz as well. Is there anything I can do about that? Where could I check whether this CPU should be supported in full and frequency scaling working?
The cpuspeed changelog may be relevant:
[quote] * Thu Mar 06 2008 Jarod Wilson jwilson@redhat.com
- Disable freq scaling by default on AMD rev F and earlier cpus when running xen, due to clock instability (#435321) [/quote]
I didn't look up your cpu, but I think it's a revision F.
Also, thanks for the /etc/sysconfig/cpuspeed "ondemand" tip.
It seemed counterintuitive to explicitly specify the so-called default governor value (i.e., "empty defaults to ondemand"), but doing so did the trick under xen for my revision G AMD processor.
Steve
S.Tindall wrote on Sun, 3 Aug 2008 21:47:06 -0400:
The cpuspeed changelog may be relevant:
[quote]
- Thu Mar 06 2008 Jarod Wilson jwilson@redhat.com
- Disable freq scaling by default on AMD rev F and earlier cpus
when running xen, due to clock instability (#435321) [/quote]
Thanks, it didn't occur to me that cpuspeed may also be relevant to this. However, I don't think it's relevant for the wrong cpu frequency reading on the 3.2 Xen kernels (which in turn is responsible for the missing scalability). Cpuspeed is not part of the kernel and did not change during all my tests. See below for possible explanation.
I didn't look up your cpu, but I think it's a revision F.
Hm, /proc/cpuinfo doesn't show any revision number. A bit of googling tells me that the CPUs, at least the second one, are more likely to be rev. H or above. The older one is a 3800+ EE and the newer one is a 4850e, which I bought right after it became available. Unless rev. G and up are only quad core CPUs, at least the latter 45nm one should be rev. G or up, too. But I can't find a definitive list; shouldn't there be one on the AMD site?
I saw postings about time problems on the xen-devel list, but these seemed to be more general and not restricted to older revisions of the AMD cpus. Now, after enabling frequency scaling on both I see that as well ("Warning Timer ISR/1: Time went backwards:"), but only on the newer CPU. It happens each time the frequency changes. It's possible it doesn't happen on the other (older!) cpu because there wasn't demand for a change yet, it's not doing any cpu intensive tasks.
Also, thanks for the /etc/sysconfig/cpuspeed "ondemand" tip.
It seemed counterintuitive to explicitly specify the so-called default governor value (i.e., "empty defaults to ondemand"), but doing so did the trick under xen for my revision G AMD processor.
I think what happens with cpuspeed is that the "Thu Mar 06 2008" patch mentioned above comes into play here. I assume they simply do not modprobe cpufreq_ondemand, so it is not available and cannot be used as a default. All the other governors are available, no matter if cpuspeed is running or not, and ondemand is the only one that does real scaling. So, they simply disabled this and it only gets loaded once you force it (good that they didn't disable that as well). ondemand is missing on *both* kernels (the 3.2 one from Xen and the stock CentOS kernel) by default, and adding the command line I mentioned doesn't change this. However, it makes a difference on the Xen 3.2 (and Xen 3.2.1) kernel, as it corrects the frequency reading. The time warning only seems to occur on the Xen 3.2 kernels and not on the CentOS Xen kernels. At least I don't remember having seen it earlier. Which Xen kernel are you running? Hm, it occurs to me now that the older CPU where the time warning doesn't appear already runs on Xen 3.2.1, which may already have some patch to avoid this bug. Or it simply doesn't report it anymore :-) (It doesn't seem to do any harm; not even dovecot, which is paranoid about time and has its own time warning routine, barks. But it spoils my monitoring :-()
Kai
Kai Schaetzl wrote:
S.Tindall wrote on Sun, 3 Aug 2008 21:47:06 -0400:
The cpuspeed changelog may be relevant:
[quote]
- Thu Mar 06 2008 Jarod Wilson jwilson@redhat.com
- Disable freq scaling by default on AMD rev F and earlier cpus
when running xen, due to clock instability (#435321) [/quote]
Thanks, it didn't occur to me that cpuspeed may also be relevant to this. However, I don't think it's relevant for the wrong cpu frequency reading on the 3.2 Xen kernels (which in turn is responsible for the missing scalability). Cpuspeed is not part of the kernel and did not change during all my tests. See below for possible explanation.
I didn't look up your cpu, but I think it's a revision F.
Hm, /proc/cpuinfo doesn't show any revision number. A bit googling tells me that the CPUs, at least the second one, are more likely to be rev. H or above. The older one is a 3800+ EE and the newer one is a 4850e which I bought right after it became available. Unless rev. G and up are only quad core CPUs at least the latter 45nm one should be rev G or up, too. But I can't find a definitive list, shouldn't there be one on the AMD site?
Maybe this is relevant to you:
http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15328&forum=4...
[quote]
# Frequency scaling on AMD rev F CPUs under Xen can result in
# timekeeping problems for fully virtualized guests, so we disable
# it by default.
if [ -d /proc/xen ] && [ "$cpu_vendor" == AuthenticAMD ] \
   && [ "$cpu_family" -le 15 ]; then
    default_governor=performance
fi
[/quote]
Ned Slider wrote on Mon, 04 Aug 2008 10:34:57 +0100:
[quote]
# Frequency scaling on AMD rev F CPUs under Xen can result in
# timekeeping problems for fully virtualized guests, so we disable
# it by default.
if [ -d /proc/xen ] && [ "$cpu_vendor" == AuthenticAMD ] \
   && [ "$cpu_family" -le 15 ]; then
    default_governor=performance
fi
[/quote]
That's the patch mentioned by Steve. I hadn't looked in the cpuspeed init file earlier; I did now, and it's in there. As I assumed, it doesn't load ondemand. Both my CPUs are "family 15", so this applies. However, I'm not running fully virtualized guests. Nevertheless, the time warning occurs, not only in domU, but also in dom0. However, I think it's harmless. Time is absolutely stable in dom0 and domU. That may change under more load. Maybe it's less harmless in fully virtualized machines.
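To see whether a given box falls under that check, the test from the quoted init script fragment can be reproduced in isolation. The following is only a sketch: the function names are made up, the "ondemand otherwise" fallback is an assumption based on what was said in this thread rather than on the full init script, and the Xen directory and cpuinfo file are parameters purely so the logic can be tried without a Xen host.

```shell
# pick_default_governor XEN_DIR VENDOR FAMILY
# Mirrors the quoted condition: under Xen, AMD CPUs of family 15 or lower
# get "performance". Anything else falls back to "ondemand" (assumed).
pick_default_governor() {
    xen_dir="$1"; vendor="$2"; family="$3"
    governor=ondemand
    if [ -d "$xen_dir" ] && [ "$vendor" = AuthenticAMD ] \
       && [ "$family" -le 15 ]; then
        governor=performance
    fi
    echo "$governor"
}

# Vendor and family of the first CPU listed in a cpuinfo-style file.
cpu_vendor() {
    awk -F': *' '/^vendor_id/ {print $2; exit}' "${1:-/proc/cpuinfo}"
}
cpu_family() {
    awk -F': *' '/^cpu family/ {print $2; exit}' "${1:-/proc/cpuinfo}"
}
```

On a real machine this would be called as pick_default_governor /proc/xen "$(cpu_vendor)" "$(cpu_family)". If it prints performance, ondemand has to be requested explicitly in /etc/sysconfig/cpuspeed, as described earlier in the thread.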
It's a pity that so little information is available.
Kai
Kai Schaetzl wrote on Mon, 04 Aug 2008 11:08:55 +0200:
Hm, it occurs to me now that the older cpu where the time warning doesn't appear runs already on Xen 3.2.1 which may already have some patch to avoid this bug. Or it simply doesn't report it anymore :-)
The warning is gone after upgrading that machine to 3.2.1 as well, indeed.
Kai
On Monday, August 04, 2008 at 5:08 AM, Kai Schaetzl wrote:
...The older one is a 3800+ EE and the newer one is a 4850e which I bought right after it became available. Unless rev. G and up are only quad core CPUs at least the latter 45nm one should be rev G or up, too. But I can't find a definitive list, shouldn't there be one on the AMD site?
You can look up the processor revision/stepping here:
http://products.amd.com/en-us/DesktopCPUFilter.aspx
You can search by model number, etc. on the left side or more specifically by the cpu OPN (e.g., ADO4600CUBOX) on the right side.
If you select steppings G1/G2 and then pull down the model list, you can see the range of processors in each stepping. I don't think any of them are quad/triple cores.
Searching the "3800+" and knowing it is 65 watts (EE) shows either an F2 or F3. Likewise, the "4850e" is a G2.
Checking the cpuinfo on systems using a 4850e (G2) and a 4600+ EE (F2) both give "cpu family: 15", so they have included the Gs in the excluded group, too.
Steve
S.Tindall wrote on Mon, 4 Aug 2008 13:08:03 -0400:
Thanks, nice tool!
Checking the cpuinfo on systems using a 4850e (G2) and a 4600+ EE (F2) both give "cpu family: 15", so they have included the Gs in the excluded group, too.
Yeah. There's apparently no way to distinguish by stepping, so they used the family. It's a shame that most people will not know it and thus not get the benefits of frequency scaling.
Kai