On Tue, 2006-06-06 at 14:46 -0400, Sam Drinkard wrote:
William L. Maltby wrote:
<snip>
After running sar, I'm not sure I'm as confused as I was. I tend to think sar might be correct in what it records and sees. Looking at memory usage and swap, I find the system *is* using all of the available memory, especially at 0420, most likely when the updatedb and slocate crons run.
<snip>
I'll suggest this as an interim workaround. Separate the start times for any updatedb, slocate, makewhatis, ... by 10 or 15 minute intervals. With less contention each job should run faster, so they might all be done near the same ending time as before. It may also avoid going into swap (even if it's not a big slowdown, why do it if there's no need?) and may very likely ameliorate any observable adverse effects.
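A rough sketch of the staggering idea, in Python: it spaces a list of jobs a fixed number of minutes apart and prints crontab-style lines. The function name, the 04:00 start and the 15-minute gap are illustrative assumptions, not taken from anyone's actual crontab. On many systems these jobs are launched together from a cron.daily directory instead, in which case the entries would have to be moved out of it first.

```python
def staggered_cron_lines(jobs, start_hour=4, start_minute=0, gap_minutes=15):
    """Return one crontab line per job, each starting gap_minutes after the previous one.

    All defaults are hypothetical; pick values that suit the machine.
    """
    lines = []
    for index, command in enumerate(jobs):
        total = start_hour * 60 + start_minute + index * gap_minutes
        hour = (total // 60) % 24  # wrap around past midnight
        minute = total % 60
        lines.append(f"{minute} {hour} * * * {command}")
    return lines


if __name__ == "__main__":
    # Job names come from the thread; real entries would need full paths and options.
    for line in staggered_cron_lines(["updatedb", "makewhatis"]):
        print(line)
```

Whether 10 or 15 minutes is enough depends on how long each job really takes; sar output from a few nights should show that.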
If that works, add "nice" to them (cautiously, if there are completion "deadlines"), which will reduce the effects of the maintenance jobs even further. If there's no completion deadline of consequence, a BIG nice value can be used.
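For what it's worth, a minimal sketch of the "nice" step, assuming the job is started from a small Python wrapper rather than directly from cron. The helper names are hypothetical; the only external piece is the standard nice(1) command, and 19 is the lowest priority an ordinary user can ask for.

```python
import subprocess


def niced_command(command, niceness=19):
    """Prefix a command (list of strings) with nice(1) at the given niceness."""
    if not 0 <= niceness <= 19:
        raise ValueError("niceness must be between 0 and 19 for this sketch")
    return ["nice", "-n", str(niceness)] + list(command)


def run_niced(command, niceness=19):
    """Run the command at reduced priority and return its exit status."""
    return subprocess.run(niced_command(command, niceness), check=False).returncode


if __name__ == "__main__":
    # Only prints the command line; call run_niced() to actually execute it.
    print(" ".join(niced_command(["updatedb"])))
```

In a crontab the same thing is just the job's command with "nice -n 19" in front of it.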
If you haven't tried removing readahead and readahead_early... I've not investigated, but I'm three days into a run on my machine, which had locked up in the past, and it's still showing only 160 swap use, via free. My thinking is:
1) Linux offers a way to lock pages or processes in memory (UNIX sticky bit) so they won't swap;
2) readahead locks them (WARNING! No investigation here, pure conjecture);
3) that squeezes other things out of memory;
4) regardless, I disabled those startups on my *workstation* and see no change in behavior;
5) if that's the same for you, why run useless tasks?
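To compare swap use with and without those startups, something like the sketch below could be run now and then and the numbers written down. It reads the same SwapTotal and SwapFree counters from /proc/meminfo that free reports. The function names are made up for illustration, and it says nothing about whether readahead actually locks pages; it only measures the symptom.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of name -> integer value."""
    values = {}
    for line in text.splitlines():
        name, separator, rest = line.partition(":")
        fields = rest.split()
        if separator and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_kb(values):
    """Swap in use, in the kB units /proc/meminfo reports."""
    return values["SwapTotal"] - values["SwapFree"]


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        print("swap used (kB):", swap_used_kb(parse_meminfo(handle.read())))
```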
<snip>
To make a long story short, the machine apparently does need the swap at times, and while it's not using much during model run time, the cron jobs do turn it loose!
That's not surprising, given your references to a "weather model"? It's nice to know that the VM does most of what it's supposed to do *right*.
Sar can report *a lot* of things, and I suspect I'll be using it more to see if there is anything I can do to tweak the thing to get a bit more performance out of it. Bad enough that it takes so long for my jobs to run, but the number crunching the model does IS pretty tough on things, I suppose. Guess we've beaten up the swap thing enough. I'll try to learn more about how to use the additional info at my disposal now.
Another thing to look at. On my last contract at IBM (2.2/2.4 days) I found a lot of kernel params that can be set, along the lines of swappiness, that can be used to help meet special needs. I had to use a few of them. Anyway, there may be some left in this kernel. If you get into the kernel docs, they're probably listed, even if not well publicized.
:-} Then you can publish them on this list and let me be lazy in my
(semi-)retirement!
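As a starting point for that hunt, here is a small sketch that lists whatever virtual-memory tunables the running kernel exposes. It assumes they live under /proc/sys/vm, which is where swappiness sits on 2.6 kernels; the function name is hypothetical, and what each value means still has to come from the kernel docs.

```python
import os


def read_tunables(directory="/proc/sys/vm"):
    """Return {name: current value} for every readable file in the directory."""
    tunables = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            with open(path) as handle:
                tunables[name] = handle.read().strip()
        except OSError:
            continue  # some entries are write-only or need root
    return tunables


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(f"{name} = {value}")
```

This only reads the values. Changing one means writing to the same file as root (or using sysctl), and is best tried one parameter at a time.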
Sam
<snip sig stuff>
HTH