Hi All
I have a Xeon server with 16 GB RAM and no swap. I am running a Cassandra server on two nodes in a cluster. When there is high load on the server, kswapd0 kicks in, takes 100% CPU, and makes the machine so slow that we have to restart our Cassandra server. I have the latest kernel, 2.6.18-238.9.1.el5. Please let me know how I can fix this issue. It's hurting us badly since this is our production server, so any quick help will be appreciated.
On Saturday, May 07, 2011 09:35:48 PM Ali Ahsan wrote:
Hi All
I have a Xeon server with 16 GB RAM and no swap. I am running a Cassandra server on two nodes in a cluster. When there is high load on the server, kswapd0 kicks in, takes 100% CPU, and makes the machine so slow that we have to restart our Cassandra server. I have the latest kernel, 2.6.18-238.9.1.el5. Please let me know how I can fix this issue. It's hurting us badly since this is our production server, so any quick help will be appreciated.
There is more than one bug that causes this behaviour. A few related memory-management situations (possibly responsible) may actually be avoided if you add some swap (even if it's not used). My suggestion would be to add some swap, set swappiness to 0 and see what happens.
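If you want to try that without repartitioning, a swap file is enough. Roughly something like this (the size and path are only an example, not something from your setup):

  dd if=/dev/zero of=/swapfile bs=1M count=4096   # 4 GB swap file
  chmod 600 /swapfile
  mkswap /swapfile
  swapon /swapfile
  echo "vm.swappiness = 0" >> /etc/sysctl.conf
  sysctl -p
  # to keep it across reboots, also add a line like this to /etc/fstab:
  # /swapfile   swap   swap   defaults   0 0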
/Peter
Thanks for the reply. I don't know whether it has anything to do with the Linux I/O scheduler, but someone suggested looking at it: cat /sys/block/sda/queue/scheduler shows noop [anticipatory] deadline cfq. I changed from the default cfq to anticipatory and the problem is smaller now; kswapd0 takes 10-50% of the CPU in very short bursts of less than a millisecond. I have also changed /proc/sys/vm/pagecache from 100 to 10. Any more suggestions to make the server better would help. From what I have read, the best scheduler for databases is deadline; for MySQL it worked great.
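For reference, this is roughly what I ran (sda here is just my data disk):

  cat /sys/block/sda/queue/scheduler        # shows: noop [anticipatory] deadline cfq
  echo anticipatory > /sys/block/sda/queue/scheduler
  echo 10 > /proc/sys/vm/pagecache          # RHEL/CentOS-specific page cache limit, default 100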
On 05/09/2011 06:01 PM, Peter Kjellström wrote:
On Saturday, May 07, 2011 09:35:48 PM Ali Ahsan wrote:
Hi All
I have a Xeon server with 16 GB RAM and no swap. I am running a Cassandra server on two nodes in a cluster. When there is high load on the server, kswapd0 kicks in, takes 100% CPU, and makes the machine so slow that we have to restart our Cassandra server. I have the latest kernel, 2.6.18-238.9.1.el5. Please let me know how I can fix this issue. It's hurting us badly since this is our production server, so any quick help will be appreciated.
There is more than one bug that causes this behaviour. A few related memory-management situations (possibly responsible) may actually be avoided if you add some swap (even if it's not used). My suggestion would be to add some swap, set swappiness to 0 and see what happens.
/Peter
Please don't top-post it's hard to follow.
On Monday, May 09, 2011 03:07:57 PM Ali Ahsan wrote:
Thanks for the reply. I don't know whether it has anything to do with the Linux I/O scheduler, but someone suggested looking at it: cat /sys/block/sda/queue/scheduler shows noop [anticipatory] deadline cfq. I changed from the default cfq to anticipatory and the problem is smaller now; kswapd0 takes 10-50% of the CPU in very short bursts of less than a millisecond. I have also changed /proc/sys/vm/pagecache from 100 to 10.
This means that you're limiting the amount of RAM that can be used as page cache (basically I/O caching) to 10%. That may not be an issue for you, but it could also severely limit your performance, all depending on workload.
An alternative may be to simply set swappiness to 0.
Any more suggestions to make the server better would help. From what I have read, the best scheduler for databases is deadline; for MySQL it worked great.
You'll really have to try them to know what works for you. The two main variables that will affect the choice are your I/O load and the underlying device (type of RAID, for example). For example, some RAID controllers really prefer noop.
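A crude way to compare them is to switch the scheduler and run your workload against each one; something like the loop below (sdb and the dd read are only stand-ins, substitute your real device and a test that looks like your Cassandra load):

  for s in noop anticipatory deadline cfq; do
      echo $s > /sys/block/sdb/queue/scheduler
      echo 3 > /proc/sys/vm/drop_caches       # start each run with cold caches
      echo "scheduler: $s"
      dd if=/dev/sdb of=/dev/null bs=1M count=1024 2>&1 | tail -1
  done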
Lastly, you didn't tell us if you're on i386-pae or x86_64 which may make a big difference.
/Peter
On 05/09/2011 06:01 PM, Peter Kjellström wrote:
On Saturday, May 07, 2011 09:35:48 PM Ali Ahsan wrote:
Hi All
I have a Xeon server with 16 GB RAM and no swap. I am running a Cassandra server on two nodes in a cluster. When there is high load on the server, kswapd0 kicks in, takes 100% CPU, and makes the machine so slow that we have to restart our Cassandra server. I have the latest kernel, 2.6.18-238.9.1.el5. Please let me know how I can fix this issue. It's hurting us badly since this is our production server, so any quick help will be appreciated.
There is more than one bug that causes this behaviour. A few related memory-management situations (possibly responsible) may actually be avoided if you add some swap (even if it's not used). My suggestion would be to add some swap, set swappiness to 0 and see what happens.
/Peter
On 05/09/2011 08:09 PM, Peter Kjellström wrote:
This means that you're limiting the amount of RAM that can be used as page cache (basically I/O caching) to 10%. That may not be an issue for you, but it could also severely limit your performance, all depending on workload.
An alternative may be to simply set swappiness to 0.
Hmm, nice points. I am using SATA with LVM, two 1 TB HDs.
You'll really have to try them to know what works for you. The two main variables that will affect the choice are your I/O load and the underlying device (type of RAID, for example). For example, some RAID controllers really prefer noop.
Lastly, you didn't tell us if you're on i386-pae or x86_64 which may make a big difference.
System is x86_64
On Monday, May 09, 2011 11:49:26 AM Ali Ahsan wrote:
Hmm, nice points. I am using SATA with LVM, two 1 TB HDs.
What sort of SATA drives are you using? There are some known issues with some SATA drives in certain configurations and on some controllers. It shouldn't cause kswapd to hit high CPU, but it is worth checking out.
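If you're not sure of the exact model, something like this will tell you (hdparm is usually installed by default; smartctl comes from the smartmontools package):

  cat /sys/block/sda/device/model
  hdparm -I /dev/sda | head -n 20
  smartctl -i /dev/sda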
On 05/09/2011 09:45 PM, Lamar Owen wrote:
What sort of SATA drives are you using? There are some known issues with some SATA drives in certain configurations and on some controllers. It shouldn't cause kswapd to hit high CPU, but it is worth checking out.
These modules were loaded when I installed CentOS 5.5:
lsmod | egrep 'ata_piix|libata'
ata_piix               57285  3
libata                208977  1 ata_piix
scsi_mod              199001  5 scsi_dh,sg,usb_storage,libata,sd_mod
On 05/09/2011 10:08 PM, Ali Ahsan wrote:
On 05/09/2011 09:45 PM, Lamar Owen wrote:
What sort of SATA drives are you using? There are some known issues with some SATA drives in certain configurations and on some controllers. It shouldn't cause kswapd to hit high CPU, but it is worth checking out.
These modules were loaded when I installed CentOS 5.5:
lsmod | egrep 'ata_piix|libata'
ata_piix               57285  3
libata                208977  1 ata_piix
scsi_mod              199001  5 scsi_dh,sg,usb_storage,libata,sd_mod
More info
SCSI subsystem initialized
libata version 3.00 loaded.
ata_piix 0000:00:1f.2: version 2.12
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 19 (level, low) -> IRQ 74
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi0 : ata_piix
scsi1 : ata_piix
ata1: SATA max UDMA/133 cmd 0x3158 ctl 0x316c bmdma 0x3130 irq 74
ata2: SATA max UDMA/133 cmd 0x3150 ctl 0x3168 bmdma 0x3138 irq 74
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: WDC WD10EARS-003BB1, 80.00A80, max UDMA/133
ata1.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-8: WDC WD10EARS-003BB1, 80.00A80, max UDMA/133
ata2.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata2.00: configured for UDMA/133
  Vendor: ATA       Model: WDC WD10EARS-003   Rev: 80.0
  Type:   Direct-Access                       ANSI SCSI revision: 05
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 1953525168 512-byte hdwr sectors (1000205 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
  Vendor: ATA       Model: WDC WD10EARS-003   Rev: 80.0
  Type:   Direct-Access                       ANSI SCSI revision: 05
On 05/09/2011 10:11 PM, Ali Ahsan wrote:
What sort of SATA drives are you using? There are some known issues with some SATA drives in certain configurations and on some controllers. It shouldn't cause kswapd to hit high CPU, but it is worth checking out.
Actually I don't have any issue with reads and writes. If you have experience with the Cassandra NoSQL database, you know it is very read- and write-intensive; it will be doing approximately 10 million concurrent reads and writes. As per the Cassandra documentation, "Cassandra can write 50GB of data in 0.12 milliseconds, more than 2,500 times faster than MySQL." http://news.ycombinator.com/vote?for=683885&dir=up&whence=%69%74%65%6d%3f%69%64%3d%36%38%33%38%38%35
Actually I don't have any issue with reads and writes. If you have experience with the Cassandra NoSQL database, you know it is very read- and write-intensive; it will be doing approximately 10 million concurrent reads and writes. As per the Cassandra documentation, "Cassandra can write 50GB of data in 0.12 milliseconds, more than 2,500 times faster than MySQL."
50 GB in 0.12 ms works out to 500 TB in 1.2 seconds, roughly 417 TB/s. A SATA III link tops out at 6 Gbit/s (about 600 MB/s), so you would need a "parallelness" of hundreds of thousands of channels to soak up that data. How many drives must be written to in parallel to sustain that write rate is ... beyond my math skills.
Advertising, or inadvertising, has fudged a few numbers here.
On 05/09/2011 10:46 PM, Brunner, Brian T. wrote:
50 GB in 0.12 ms works out to 500 TB in 1.2 seconds, roughly 417 TB/s. A SATA III link tops out at 6 Gbit/s (about 600 MB/s), so you would need a "parallelness" of hundreds of thousands of channels to soak up that data. How many drives must be written to in parallel to sustain that write rate is ... beyond my math skills.
It's only about Cassandra, not about my system :). So what's your suggestion about the SATA drives?
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Ali Ahsan
Sent: Monday, May 09, 2011 1:52 PM
To: centos@centos.org
Subject: Re: [CentOS] kswapd taking 100% cpu with no swap on system
On 05/09/2011 10:46 PM, Brunner, Brian T. wrote:
50 GB in 0.12 ms works out to 500 TB in 1.2 seconds, roughly 417 TB/s. A SATA III link tops out at 6 Gbit/s (about 600 MB/s), so you would need a "parallelness" of hundreds of thousands of channels to soak up that data. How many drives must be written to in parallel to sustain that write rate is ... beyond my math skills.
It's only about Cassandra, not about my system :). So what's your suggestion about the SATA drives?
My suggestion: that you first check your facts. What substantiates a claim of a 50 GB / 0.12 ms write rate? In what environment? The claim is (AIUI) utterly misleading. That (IMHO) gross overstatement of Cassandra's write capacity makes any study of your current system a pursuit after feral aquatic fowl.
I believe you were after an explanation as to why your kswapd was running at 100%? Let's stick to that and not drag in Cassandra vs. MySQL.
Insert spiffy .sig here: Life is complex: it has both real and imaginary parts.
//me
On Monday, May 09, 2011 01:11:17 PM Ali Ahsan wrote:
sd 0:0:0:0: Attached scsi disk sda
  Vendor: ATA       Model: WDC WD10EARS-003   Rev: 80.0
  Type:   Direct-Access                       ANSI SCSI revision: 05
Are your two drives in a RAID? cat /proc/mdstat
The WD10EARS drives are known to have some issues, having to do with both powersave and TLER (google for WDTLER), as well as the 4K sector size. I have seen these issues, but that was with Fedora, and it showed up as high system load, not as kswapd at a high percentage. Also google for WD10EARS linux performance; the top result is a Western Digital community forum post titled 'WD10EARS slow, slow, slow, slow - Western Digital Community'.
If you install the sysstat package, you can use iostat to see if iowaits are your problem. I've used 'iostat -x 1' (and then set my konsole wider...) and monitored the await column for each device; that's how I pinned down a WD15EADS drive creating iowaits.
Hope that helps.
On 05/09/2011 10:51 PM, Lamar Owen wrote:
On Monday, May 09, 2011 01:11:17 PM Ali Ahsan wrote:
sd 0:0:0:0: Attached scsi disk sda
  Vendor: ATA       Model: WDC WD10EARS-003   Rev: 80.0
  Type:   Direct-Access                       ANSI SCSI revision: 05
Are your two drives in a RAID? cat /proc/mdstat
No, I am doing LVM across two 1 TB HDs.
On Monday, May 09, 2011 01:57:54 PM Ali Ahsan wrote:
On 05/09/2011 10:51 PM, Lamar Owen wrote:
On Monday, May 09, 2011 01:11:17 PM Ali Ahsan wrote:
sd 0:0:0:0: Attached scsi disk sda
  Vendor: ATA       Model: WDC WD10EARS-003   Rev: 80.0
  Type:   Direct-Access                       ANSI SCSI revision: 05
Are your two drives in a RAID? cat /proc/mdstat
No, I am doing LVM across two 1 TB HDs.
Can you give the output of pvdisplay, vgdisplay, and lvdisplay?
Also, did you align the PVs to 4K sectors when you partitioned?
What does iostat -x tell you?
The particular drives (hardware) you are using have known performance issues; there are a number of reports in Western Digital's forums confirming this, for more than just Linux.
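A quick way to check the alignment question is to look at where the partitions start, in sectors (the divisible-by-8 rule below is just the arithmetic for 4 KiB alignment with 512-byte sectors):

  fdisk -lu /dev/sda /dev/sdb    # -u prints partition start/end in 512-byte sectors
  # a start sector divisible by 8 means the partition begins on a 4 KiB boundary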
On 05/09/2011 11:06 PM, Lamar Owen wrote:
Can you give the output of pvdisplay, vgdisplay, and lvdisplay?
Also, did you align the PVs to 4K sectors when you partitioned?
What does iostat -x tell you?
pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               VolGroup00
  PV Size               931.01 GB / not usable 13.34 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              29792
  Free PE               0
  Allocated PE          29792
  PV UUID               eLxFxo-4jHx-RGBA-mAR2-7vnW-YAej-3vPphH

  --- Physical volume ---
  PV Name               /dev/sdb1
  VG Name               VolGroup00
  PV Size               931.51 GB / not usable 11.19 MB
  Allocatable           yes
  PE Size (KByte)       32768
  Total PE              29808
  Free PE               120
  Allocated PE          29688
  PV UUID               pMtvzy-je8m-uJfY-zEpS-M2Cd-ATW9-o9tfuM

vgdisplay
  --- Volume group ---
  VG Name               VolGroup00
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  6
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               1.82 TB
  PE Size               32.00 MB
  Total PE              59600
  Alloc PE / Size       59480 / 1.82 TB
  Free  PE / Size       120 / 3.75 GB
  VG UUID               FNqyvW-LWqe-iuV4-1tRJ-Gsrl-sH1X-KyHLpz
  --- Logical volume ---
  LV Name                /dev/VolGroup00/LogVol01
  VG Name                VolGroup00
  LV UUID                Ekyo0u-UVOe-Sanz-9Eil-N7J5-vebi-KcRUW2
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                30.00 GB
  Current LE             960
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

  --- Logical volume ---
  LV Name                /dev/VolGroup00/LogVol02
  VG Name                VolGroup00
  LV UUID                2I10tF-PgZP-zUSc-oQi4-am6h-18xv-FfLUJw
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                1.79 TB
  Current LE             58512
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

  --- Logical volume ---
  LV Name                /dev/VolGroup00/LogVol03
  VG Name                VolGroup00
  LV UUID                o9SQVk-Ehee-0PS3-e0FC-mb0h-vXwG-wCGvbp
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                256.00 MB
  Current LE             8
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2
[root@cassandra1 ~]# iostat -x
Linux 2.6.18-238.9.1.el5 (cassandra1.xxxxx.com)    05/09/2011

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.33    0.01    1.00    3.88    0.00   89.78

Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.03    4.57   0.02   0.79     5.05    42.88    58.66     0.01    7.61   2.04   0.17
sda1       0.01    0.00   0.00   0.00     0.01     0.00    20.89     0.00    2.89   2.84   0.00
sda2       0.02    4.57   0.02   0.79     5.04    42.88    58.69     0.01    7.61   2.04   0.17
sdb        8.28  192.14  43.04   3.85  2855.18  1567.95    94.34     2.22   47.25   4.74  22.22
sdb1       8.28  192.14  43.04   3.85  2855.18  1567.95    94.34     2.22   47.25   4.74  22.22
dm-0       0.00    0.00   0.55   0.59    13.88     4.76    16.23     0.04   34.44   7.02   0.81
dm-1       0.00    0.00  46.71 197.04  2812.84  1576.32    18.01     5.23   21.44   0.89  21.61
dm-2       0.00    0.00   4.19   3.72    33.50    29.75     8.00     1.10  139.24   1.72   1.36
On Monday, May 09, 2011 02:06:54 PM Lamar Owen wrote:
The particular drives (hardware) you are using have known performance issues; there are a number of reports in Western Digital's forums confirming this, for more than just Linux.
For reference: http://community.wdc.com/t5/Desktop/WD10EARS-slow-slow-slow-slow/td-p/7581 http://www.hv23.net/2010/02/wd10ears-performance-larger-block-size-issues4k/ http://b1mmer.com/linux/wdhdd/
On 05/09/2011 10:51 PM, Lamar Owen wrote:
iostat -x 1
I am a little new to iostat, please guide me on this.
This is the iostat output:
Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb        0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.00    0.00    1.25    7.10    0.00   90.66

Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb        0.00    0.00  26.00   0.00  2056.00     0.00    79.08     2.14   33.23   9.08  23.60
sdb1       0.00    0.00  26.00   0.00  2056.00     0.00    79.08     2.14   33.23   9.08  23.60
dm-0       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1       0.00    0.00  36.00   0.00  2784.00     0.00    77.33     2.14   23.97   6.56  23.60
dm-2       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          34.79    0.00    1.25    6.11    0.00   57.86

Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb       14.00    0.00  49.00   0.00  2768.00     0.00    56.49     2.63   79.80   6.98  34.20
sdb1      14.00    0.00  49.00   0.00  2768.00     0.00    56.49     2.63   79.80   6.98  34.20
dm-0       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1       0.00    0.00  53.00   0.00  2040.00     0.00    38.49     4.63  111.42   6.45  34.20
dm-2       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
On Monday, May 09, 2011 02:02:08 PM Ali Ahsan wrote:
On 05/09/2011 10:51 PM, Lamar Owen wrote:
iostat -x 1
I am a little new to iostat, please guide me on this.
[snip]
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          34.79    0.00    1.25    6.11    0.00   57.86

Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb       14.00    0.00  49.00   0.00  2768.00     0.00    56.49     2.63   79.80   6.98  34.20
sdb1      14.00    0.00  49.00   0.00  2768.00     0.00    56.49     2.63   79.80   6.98  34.20
dm-0       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1       0.00    0.00  53.00   0.00  2040.00     0.00    38.49     4.63  111.42   6.45  34.20
OK, in this particular frame of data you have a 6% iowait (which isn't bad for a busy server), and the awaits for sdb at 79.8 milliseconds and for the device-mapper device (the LVM piece) at 111.42 milliseconds aren't too terrible, although that's slower than some busy servers I've seen.
In my case with the WD15EADS drive (in the same family as the WD10EARS), I had seen awaits in the 27,000 millisecond (27 seconds!) range during intensive I/O operations; an svn update on a copy of the Plone collective, which is a really good stress test if you want to bring a box to its knees, would take ten to fifteen times longer than it should have.
Watch the output, in particular the await column (you'll want to widen your terminal to get it all on single lines), for 'spikes' to see if this is the issue that is affecting you.
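You can also cut down the noise by watching just the busy device, e.g.:

  iostat -x sdb 1      # one-second samples for the data disk only; await is in milliseconds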
And it may not be the problem; but, then again, on my box with the WD15EADS drive, it would run for hours and then slow to an absolute crawl for minutes at a time, and then run smoothly for hours again.
Things I have done to solve this issue:
I set swappiness to 0 and changed the I/O scheduler to anticipatory for better performance. I don't see the kswapd process taking 100% CPU now.
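To make the two changes stick across reboots I put them in config files, roughly like this (sda is just my data disk; I believe the boot-time name for the anticipatory scheduler is 'as', so double-check that before relying on the grub.conf variant):

  # in /etc/sysctl.conf
  vm.swappiness = 0

  # in /etc/rc.d/rc.local (or pass elevator=as on the kernel line in /boot/grub/grub.conf)
  echo anticipatory > /sys/block/sda/queue/scheduler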
I spoke too early; it's not fixed. Does anyone have other options?
On 05/09/2011 11:37 PM, Ali Ahsan wrote:
Things I have done to solve this issue:
I set swappiness to 0 and changed the I/O scheduler to anticipatory for better performance. I don't see the kswapd process taking 100% CPU now.
I spoke too early; it's not fixed. Does anyone have other options?
On 05/09/2011 11:37 PM, Ali Ahsan wrote:
Things I have done to solve this issue:
I set swappiness to 0 and changed the I/O scheduler to anticipatory for better performance. I don't see the kswapd process taking 100% CPU now.
We had a number of issues on a bigger box which was running a 32-bit PAE kernel for historical reasons. The box was then migrated to 64-bit and moved to new hardware (HP DL585 G7, 48 AMD cores and 64 GB RAM). The config which now works very well for us is:
- stock CentOS 5.x config
- 16 GB swap configured (but 0 used 99.9% of the time)
- the following additional settings in /etc/sysctl.conf:

%<
# Controls how aggressively the kernel will swap memory pages
vm.swappiness = 10

# Controls the tendency to reclaim the memory which is used for caching
vm.vfs_cache_pressure = 1
%<
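Both can be applied on the running box without a reboot, e.g.:

  sysctl -w vm.swappiness=10
  sysctl -w vm.vfs_cache_pressure=1
  # or, after adding them to /etc/sysctl.conf:
  sysctl -p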
Just in case you want to try something.
Simon