[CentOS] High load averages with latest kernel and USB drives?

Tue Nov 17 03:56:49 UTC 2009
Benjamin Smith <lists at benjamindsmith.com>

I have a server that reports a high load average while backing up Postgres
database files to an external USB drive. This throws my load balancers out
of kilter and triggers a large volume of network monitor alerts.

I have a 1TB USB drive plugged into a USB2 port that I use to back up the
production drives (which are SCSI). It's working fine, but while the backups
run (hourly) the load average on the server jumps from the normal 0.5-1.5 or
so to somewhere between 10 and 30. Strangely, even though the load is "high",
the server stays completely responsive, and so does the USB drive being
accessed!

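The backup script is really simple, run via cron, and is pretty much just:

#!/bin/sh
# %H gives a zero-padded hour (00-23); %k pads with a space instead,
# which would put a space in the file name for hours 0-9.
hour=$(date +%H)
pg_dump <options> mydatabase > "/media/backups/mydatabase.$hour.pgsql"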

where /media/backups is the mount point for the USB drive. 
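
To tie the load spikes to the backup window, the load averages could be
logged right before and after the dump. This is only a minimal sketch: the
helper name with_loadavg is hypothetical, and it assumes a Linux system with
a readable /proc/loadavg and GNU date.

```shell
#!/bin/sh
# with_loadavg CMD [ARGS...]
# Hypothetical helper: runs CMD and prints the 1/5/15-minute load averages
# to stderr before and after it, so stdout redirection of CMD is unaffected.
with_loadavg() {
    echo "$(date '+%F %T') before: $(cut -d' ' -f1-3 /proc/loadavg)" >&2
    "$@"
    status=$?
    echo "$(date '+%F %T') after:  $(cut -d' ' -f1-3 /proc/loadavg)" >&2
    # Pass the wrapped command's exit status back to the caller.
    return $status
}

# Example: wrap any command, here a one-second sleep.
with_loadavg sleep 1
```

In the cron script, the pg_dump line would simply be prefixed with
with_loadavg, and cron would mail the two stderr lines with each run.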

Using top to diagnose, nothing seems particularly high: iowait looks
reasonable (10-30%), CPU usage is around 0.5%, and idle is 70-90%. Even
accessing the USB partition while the load is "high" is responsive.

I'm guessing that something changed in how load average is counted?
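
For reference, the Linux load average counts tasks in uninterruptible sleep
(state D, typically waiting on disk I/O) in addition to runnable tasks, so
processes blocked on a slow disk can raise it while the CPUs stay idle. One
way to check whether that is what happens during the backup is sketched
below. It assumes the procps version of ps and a readable /proc/loadavg;
count_state is a hypothetical helper name.

```shell
#!/bin/sh
# Print the current 1/5/15-minute load averages as the kernel reports them.
cut -d' ' -f1-3 /proc/loadavg

# count_state LETTER
# Hypothetical helper: prints how many processes have a state that starts
# with LETTER (R = runnable, D = uninterruptible sleep). Always prints a
# number, 0 if nothing matches.
count_state() {
    ps -eo stat= | awk -v s="$1" 'substr($1, 1, 1) == s { n++ } END { print n + 0 }'
}

echo "runnable (R):        $(count_state R)"
echo "uninterruptible (D): $(count_state D)"

# Show the header line plus every process currently in state D, with the
# kernel function it is waiting in (wchan).
ps -eo stat,pid,wchan,comm | awk 'NR == 1 || $1 ~ /^D/'
```

If the D count tracks the load average while the dump is running, the high
number reflects tasks waiting on I/O rather than CPU contention.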

Server Stats: 
	Late model 8-way Xeon, SuperMicro brand. 
	CentOS 4.x, 64-bit (all updates applied, booted after last kernel update)
	Kernel 2.6.9-89.0.16.ELsmp
	4 GB ECC RAM
	300 GB SCSI HDD. 
	Standard Apache/PHP, Postgres 8.4. 

Any idea how to revert to the old load average tracking behavior short of 
using a stale and potentially insecure kernel? 
