Benjamin Smith wrote, On 11/17/2009 01:46 PM:
See comments below...
On Tuesday 17 November 2009 07:52:01 Todd Denniston wrote:
Benjamin Smith wrote, On 11/16/2009 10:56 PM:
I have a 1TB USB drive plugged into a USB2 port that I use to back up the production drives (which are SCSI). It's working fine, but while doing backups (hourly) the load average on the server shoots up from the normal 0.5 - 1.5 or so up to a high between 10 and 30. Strangely, even though the "load is high" the server is completely responsive, even the USB drives being accessed are!
Using top to diagnose, nothing seems to be particularly high! IoWait seems reasonable (10-30%) and CPUs are 0.5%, Idle is 70-90%. Even accessing the USB partition while the load is "high" is responsive!
you might add another field to top while you are watching, Last used cpu (SMP), i.e.:
  start top
  press f
  press j
  press enter
this should let you see if your process is bouncing between processors.
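If you'd rather not drive top interactively, here's a rough sketch of the same check from a script: the PSR column of ps reports the processor a task last ran on. (The sample count and interval here are just illustrative, and it defaults to watching the current shell when no PID is given.)

```shell
#!/bin/sh
# Watch which CPU a process last ran on (the PSR column).
# Usage: ./watch_cpu.sh <pid>    (defaults to this shell's PID)
PID=${1:-$$}
for i in 1 2 3 4 5; do
    # psr = processor the task last executed on
    ps -o pid= -o psr= -o comm= -p "$PID"
    sleep 1
done
```

If the PSR value keeps changing between samples, the process is bouncing between cores.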
As a workaround, perhaps asking the kernel to schedule things in a specific way might help, i.e.:
#1 set the backup on a particular set of processors,
# replace the pg_dump line above with:
taskset -c 3-4 pg_dump <options> mydatabase > \
    /media/backups/mydatabase.$hour.pgsql;
There are 8 cores on the machine, none of which are reporting more than 5% load. That's what has me perplexed. When I run top, I see a max of about 30% user. Everything else is zero. When I run the backup script to a non-USB drive, the load average is completely normal (below 0.50, often below 0.10).
USB chewing up more CPU than normal disks has been my experience all along; this just seems a little extreme.
#2 set the usb-storage on a particular set of processors.
# Note: the USBSTORPID= line was prototyped on a CentOS 5 machine, not 4.
# The [u]sb-storage bracket trick keeps grep from matching itself.
USBSTORPID=`ps aux | grep [u]sb-storage | head -1 | awk '{print $2}'`
taskset -p -c 3-4 $USBSTORPID
# you might even go back and reduce the processor list
# to just 3 or 4 instead of both.
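To confirm the pinning actually took, you can read the affinity list back with taskset -p. This is just a sanity-check sketch: it pins the current shell ($$) rather than usb-storage, so it is safe to try anywhere.

```shell
#!/bin/sh
# Pin the current shell to CPU 0 as a stand-in for usb-storage,
# then read its affinity list back to verify the change stuck.
taskset -p -c 0 $$
# Prints something like: pid 1234's current affinity list: 0
taskset -p -c $$
```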
Could you explain to me what this should accomplish? I'm curious as to why you went this route...
Even though the process is not using much processor time, having it bounce around between processors can:
* thrash the cache of each processor as it goes there
* waste time context switching in the next processor
* bounce other processes around and cascade the same effects as they go along
I know that there has been some scheduler work over time to make these switches less likely, but I have also seen good results from locking certain processes to a processor instead of letting them float. Usually the best processes to pin are ones that use large amounts of memory, like X or Firefox, which are large enough that they thoroughly toss anything else out of a processor's cache.
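One way to get a feel for how often a process is being switched out is to read the kernel's per-process counters from /proc. A rough sketch, using the current shell as the example PID:

```shell
#!/bin/sh
# Print voluntary/nonvoluntary context-switch counts for a PID.
# Usage: ./ctxt.sh <pid>    (defaults to this shell)
PID=${1:-$$}
# voluntary_ctxt_switches   = process gave up the CPU (e.g. blocking I/O)
# nonvoluntary_ctxt_switches = scheduler forced it off the CPU
grep ctxt_switches /proc/$PID/status
```

Sampling these before and after a backup run would show whether the nonvoluntary count climbs while the USB drive is busy.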