Hiya,
Currently running CentOS 4.2 x86_64 on a dual 3GHz Xeon, 2GB RAM, SCSI setup, and everything's been running fine on it for some time. Then at 4am last night something kicked in (I have mrtg running, which shows when) and since then it's been running a load of about 1.5 (normally around 0.4). CPU usage is Cpu(s): 1.1% us, 0.6% sy, 0.0% ni, 97.9% id, 0.2% wa, 0.1% hi, 0.1% si.
Can't see any new processes that would cause the load, so just wondering if there's any way to track down what's actually causing this. It's not excessive load, but I want to add some new services and am wary now; something seems wrong given the sudden increase at 4am (I think that's when some OS housekeeping tasks are normally scheduled, but there are none running that I can see that started today).
Just hoping someone may have some tips on checking what's always waiting or how to isolate what's happening. As said, ps -ef shows no new processes, and CPU usage is very low.
Tia, Ian
On 21/06/06, Ian mu mu.llamas@gmail.com wrote:
Just hoping someone may have some tips on checking what's always waiting or how to isolate what's happening. As said, ps -ef shows no new processes, and CPU usage is very low.
    top
    vmstat 5
... would be two good places to start. 4am sounds like about the time jobs from /etc/cron.daily would kick off.
    [root@willspc ~]# grep cron.daily /etc/crontab
    02 4 * * * root run-parts /etc/cron.daily
Have a look and see what's in cron.daily. On a recently built minimal install I have...
    00-logwatch         0anacron   prelink  slocate.cron  yum.cron
    00-makewhatis.cron  logrotate  rpm      tmpwatch
It could just be updatedb building your slocate database? (Check /etc/updatedb.conf for DAILY_UPDATE= ).
Will.
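To see whether anything from that 04:02 cron.daily run is still around, one option is to list processes by start time and pick out the ones started in the 4am hour. This is only a rough sketch, assuming the procps ps; the helper names started_procs and in_hour are made up for illustration.

```shell
#!/bin/sh
# List every process oldest-first with its full start timestamp.
# started_procs is a hypothetical helper name, not a standard tool.
started_procs() {
    ps -eo pid,ppid,lstart,stat,args --sort=start_time
}

# Keep the header plus any line whose start hour matches $1 (e.g. "04").
# lstart prints like "Wed Jun 21 04:02:01 2006", so with pid and ppid
# in front the clock time is field 6.
in_hour() {
    awk -v h="$1" 'NR == 1 || substr($6, 1, 2) == (h "")'
}
```

Something like "started_procs | in_hour 04" would then show only what started between 04:00 and 04:59, on any day, so the date column still needs a glance.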
Ian mu wrote:
Just hoping someone may have some tips on checking what's always waiting or how to isolate what's happening. As said, ps -ef shows no new processes, and CPU usage is very low.
Have you been up to date with patches? Have you tried running rkhunter and chkrootkit to see if you've been burgled? One of the first things a rootkit does is replace things like ps so its processes become "invisible."
Cheers,
Hiya, thanks for the replies, very useful and has given me some food for thought on a few things.
Used rkhunter, which is fine apart from one app being out of date, which I've now updated. chkrootkit is clear, but chkproc reports a couple of processes not in readdir output; they correspond to apps we are running when I check /proc/pid/cmdline, so I think that side's looking OK (still checking a couple of bits though).
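For anyone repeating that /proc/pid/cmdline check over several PIDs, a small helper saves some typing. A minimal sketch: show_cmdline is a made-up name, and the optional second argument (an alternate proc root) exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Print "PID: command line" for one PID as recorded under /proc.
# /proc/PID/cmdline separates arguments with NUL bytes, so they are
# turned into spaces here. Kernel threads have an empty cmdline.
# show_cmdline is a hypothetical helper, not part of chkrootkit.
show_cmdline() {
    pid=$1
    root=${2:-/proc}
    if [ -r "$root/$pid/cmdline" ]; then
        printf '%s: ' "$pid"
        tr '\000' ' ' < "$root/$pid/cmdline"
        printf '\n'
    else
        printf '%s: no readable cmdline\n' "$pid"
    fi
}
```

Usage would be along the lines of "for p in 1234 5678; do show_cmdline $p; done", with the PIDs taken from the chkproc report. As noted further down the thread, output from a possibly compromised box is only a hint, not proof.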
The strange one was on the vmstat 5 suggestion, the r (waiting for runtime) column is pretty much 0, if the load is > 1 shouldn't that be mostly > 1 also, or am I misunderstanding the load definition?
I.e. currently load is 1.98:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0    624  34652  66608 1059564    0    0     1     9    0     0  3  1 96  0
 0  0    624  34436  66608 1059564    0    0     0    39 1207  2534  1  1 97  0
 0  0    624  34268  66608 1059564    0    0     0    42 1202  2412  1  1 98  0
 0  0    624  34140  66608 1059564    0    0     0    33 1197  2427  1  1 98  0
 0  0    624  34140  66608 1059564    0    0     0     0 1196  2427  1  1 98  0
 0  0    624  34188  66608 1059632    0    0     0    37 1205  2545  2  1 97  0
 1  0    624  34196  66608 1059632    0    0     0     0 1197  2392  1  1 98  0
 0  0    624  34444  66608 1059632    0    0     0    33 1200  2430  1  1 98  0
 0  0    624  34260  66608 1059632    0    0     0     0 1198  2441  1  1 98  0
 0  0    624  34132  66608 1059632    0    0     0    37 1210  2592  1  1 97  0
 0  0    624  34204  66608 1059632    0    0     0    34 1207  2502  1  1 98  0
 0  0    624  34268  66608 1059632    0    0     0    33 1201  2433  1  1 98  0
Cheers, Ian
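On the load definition: on Linux the load average counts tasks in uninterruptible sleep (state D, typically waiting on I/O) as well as running or runnable ones (state R), so the r column alone does not account for it; vmstat's b column covers the D side. Both read near 0 above, which may simply mean the tasks involved are short-lived and fall between samples. One way to try to catch them is to sample ps a number of times and keep only R and D entries. A sketch only; load_tasks and sample_load_tasks are made-up names, and the sample count and one-second interval are arbitrary.

```shell
#!/bin/sh
# Keep only tasks whose state counts toward the Linux load average:
# R (running/runnable) or D (uninterruptible sleep).
# Expects "ps -eo stat,pid,wchan,args" style input on stdin.
load_tasks() {
    awk '$1 ~ /^[RD]/'
}

# Take several snapshots, since a single one can miss short bursts.
# The ps process itself will show up as R in every snapshot.
# sample_load_tasks is a hypothetical helper; $1 is the number of samples.
sample_load_tasks() {
    n=${1:-10}
    while [ "$n" -gt 0 ]; do
        ps -eo stat,pid,wchan,args | load_tasks
        sleep 1
        n=$((n - 1))
    done
}
```

Running "sample_load_tasks 30" and looking for names that keep reappearing, especially in state D, would be a starting point; the wchan column hints at what a D task is waiting on.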
On 6/21/06, Chris Mauritz chrism@imntv.com wrote:
Have you been up to date with patches? Have you tried running rkhunter and chkrootkit to see if you've been burgled? One of the first things a rootkit does is replace things like ps so its processes become "invisible."
Cheers,
On 21/06/06, Ian mu mu.llamas@gmail.com wrote:
The strange one was on the vmstat 5 suggestion, the r (waiting for runtime) column is pretty much 0, if the load is > 1 shouldn't that be mostly > 1 also, or am I misunderstanding the load definition?
I.e. currently load is 1.98:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0    624  34652  66608 1059564    0    0     1     9    0     0  3  1 96  0
 0  0    624  34436  66608 1059564    0    0     0    39 1207  2534  1  1 97  0
 0  0    624  34268  66608 1059564    0    0     0    42 1202  2412  1  1 98  0
That looks like a lot of context switches to me, though I'm not sure if that's by virtue of the number or type of CPUs or the workload you're running.
The most consistently busy boxes I have here rarely see more than 1-300 context switches and sit at a load average of 0.50, 0.67, 0.61.
Will.
On 6/21/06, Ian mu mu.llamas@gmail.com wrote:
Used rkhunter, which is fine apart from one app being out of date, which I've now updated. chkrootkit is clear, but chkproc reports a couple of processes not in readdir output; they correspond to apps we are running when I check /proc/pid/cmdline, so I think that side's looking OK (still checking a couple of bits though).
Keep in mind that tools like this should be run from trusted media and not from the suspected machine. This ensures that there is no kernel-space nastiness intercepting calls and feeding you bad information, as well as the fact that you're working from known good binaries. The centos live cd would be good for this, as well as knoppix or others. It may be traitorous to say this, but there's a knoppix based distro out there for forensic/data-recovery use with rootkit hunting tools on it. I generally keep a copy of it lying around, although the name escapes me at present.
Jim Perrin spake the following on 6/21/2006 6:00 AM:
Keep in mind that tools like this should be run from trusted media and not from the suspected machine. This ensures that there is no kernel-space nastiness intercepting calls and feeding you bad information, as well as the fact that you're working from known good binaries. The centos live cd would be good for this, as well as knoppix or others. It may be traitorous to say this, but there's a knoppix based distro out there for forensic/data-recovery use with rootkit hunting tools on it. I generally keep a copy of it lying around, although the name escapes me at present.
Is it knoppix-std?
On Wed, 2006-06-21 at 15:29 -0400, Jim Perrin wrote:
Is it knoppix-std?
It's the one with the penguin logo :-P
We have a CentOS LiveCD now :) ... it works well for CentOS related recovery issues and for looking for rootkits.
It has chkrootkit and virus checking capabilities. If there are other tools on knoppix that it doesn't have that people routinely use, let me know PLEASE :)
I would like to make the CentOS Live CD as capable as possible. It already looks like testdisk will be added soon.
Thanks, Johnny Hughes
Johnny Hughes wrote:
We have a CentOS LiveCD now :) ... it works well for CentOS related recovery issues and for looking for rootkits.
Wow somehow I completely missed this announcement, thanks for mentioning in this thread. Torrenting now... with all the CentOS boxen I run I'm sure it'll prove invaluable. Thanks for making it!
-te
On Wednesday 21 June 2006 12:17, Ian mu wrote:
Just hoping someone may have some tips on checking what's always waiting or how to isolate what's happening.
hit it with the big sledgehammer, oprofile :-) that will very likely tell you if the kernel is doing something (besides the expected mwait_idle-ish...). rough guide (not for cut-n-paste):
    yum install oprofile kernel-smp-devel    (plus a manual install of kernel-debuginfo)
    opcontrol --setup --vmlinux=...vmlinux from -debuginfo
    opcontrol --reset
    opcontrol --start
    sleep <seconds>
    opcontrol --stop
    opreport -l -p /lib/modules/$(uname -r) | head
/Peter