I have an email server running Exim, Dovecot, SpamAssassin, Clam, etc. on CentOS 4.x 32-bit. On occasion I have disk I/O problems. It's handling several domains and a lot of email. It's currently on a single SATA drive. I am thinking of moving to 3 drives with RAID 1 for redundancy. As I understand it, RAID 1 will help me on reads but do nothing on writes. I am thinking the majority of my I/O is reads though, no? I imagine quota checks and all that being done, and every time a user checks their email every message in the inbox must be read.
I guess I am asking if RAID 1 will help my I/O problem much?
[root@server ~]# w
 12:04:02 up 2:01, 1 user, load average: 7.02, 7.47, 11.84
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
root     pts/0    208.92.169.4.ppp  11:25    0.00s  0.02s  0.00s w
[root@server ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  2      0 1558496 456916 1087224    0    0   198   749  795   537 18  4 27 50
The above is when it's running pretty good.
Matt
Can you paste the output of `iostat -x 5 5` while it's busy? This will show definitively how busy your disks are.
The first sample from vmstat, iostat, etc. only shows the AVERAGE since the system booted. The second and later samples are averages over the interval specified (5 5 means 5 seconds, 5 samples).
Oh, if you don't have iostat, it's part of the sysstat package, so `yum install sysstat`.
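The point about the first sample can be sketched in Python. This is only an illustration and not part of sysstat: the helper name `interval_reports` is made up, and it assumes each report in the `iostat -x` output starts with an `avg-cpu:` line.

```python
def interval_reports(iostat_text):
    """Split `iostat -x <interval> <count>` output into reports and
    drop the first one, which covers the time since boot."""
    reports = []
    current = None
    for line in iostat_text.splitlines():
        if line.startswith("avg-cpu:"):
            # A new report starts here; close the previous one.
            if current is not None:
                reports.append(current)
            current = [line]
        elif current is not None and line.strip():
            # Non-blank line belonging to the report in progress.
            current.append(line)
    if current is not None:
        reports.append(current)
    # reports[0] is the since-boot average, so skip it.
    return ["\n".join(r) for r in reports[1:]]
```

Feed it the captured text of an `iostat -x 5 5` run; with five samples requested it should hand back the four interval reports.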
Right now it's running pretty good but here it is.
[root@server ~]# w
 13:11:02 up 3:08, 2 users, load average: 4.03, 5.71, 5.51

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.80    0.00    1.60   58.10   37.50

Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await    svctm    %util
sda          0.60   142.80    67.20   170.20   678.40  2292.80   339.20  1146.40    12.52   118.53   615.66     4.21    99.92
sda1         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sda2         0.60   142.80    67.20   170.20   678.40  2292.80   339.20  1146.40    12.52   118.53   615.66     4.21    99.92
sdb          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdb1         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-0         0.00     0.00    67.40   286.20   678.40  2289.60   339.20  1144.80     8.39   163.02   582.40     2.83    99.94
dm-1         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sar also.
10:02:56 AM LINUX RESTART
10:10:04 AM  CPU   %user   %nice %system %iowait   %idle
10:20:01 AM  all   13.87    0.00    2.45   59.20   24.49
10:30:03 AM  all   22.26    0.00    3.68   53.51   20.54
10:40:01 AM  all   20.58    0.00    3.78   55.40   20.24
10:50:04 AM  all   22.20    0.00    5.23   52.67   19.91
11:00:05 AM  all   21.58    0.00    4.72   51.81   21.89
11:10:01 AM  all   18.14    0.00    4.52   56.91   20.43
11:20:03 AM  all   21.42    0.00    4.59   47.20   26.79
11:30:02 AM  all   19.22    0.00    4.48   53.86   22.44
11:40:04 AM  all   17.59    0.00    4.82   51.61   25.98
11:50:02 AM  all   15.88    0.00    4.67   45.74   33.71
12:00:01 PM  all   13.32    0.00    2.73   25.72   58.23
12:10:02 PM  all   16.98    0.00    4.54   53.14   25.35
12:20:01 PM  all   17.31    0.00    3.45   47.80   31.44
12:30:01 PM  all   19.45    0.00    4.08   36.47   40.00
12:40:01 PM  all   13.79    0.00    4.39   44.83   36.99
12:50:01 PM  all   12.18    0.00    3.93   30.16   53.73
01:00:01 PM  all   11.53    0.00    2.38   20.96   65.12
Average:     all   17.49    0.00    4.03   46.29   32.19
A while after the reboot it straightened itself out. Yesterday "w" was indicating a load average of 120 or more at times. Today after the reboot all is good.
Matt
Matt wrote:
Right now it's running pretty good but here it is.
Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await    svctm    %util
sda          0.60   142.80    67.20   170.20   678.40  2292.80   339.20  1146.40    12.52   118.53   615.66     4.21    99.92
You have about 3 times more writing going on than reading. RAID1 isn't going to do much for you. You might want to try four disks in a RAID10 instead. And put your mail folders and spool on partitions mounted using 'noatime'.
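The "about 3 times" figure can be checked against the sda line quoted above. A small Python sketch, assuming the column order of the iostat header Matt pasted; the function names are made up for illustration.

```python
# Column names in the order shown in the pasted iostat -x header.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "rkB/s", "wkB/s", "avgrq-sz", "avgqu-sz", "await",
           "svctm", "%util"]

def parse_device_line(line):
    """Turn one iostat -x device line into (name, {column: value})."""
    fields = line.split()
    values = [float(v) for v in fields[1:]]
    return fields[0], dict(zip(COLUMNS, values))

def write_read_ratio(stats):
    """Return (ratio by requests/s, ratio by kB/s), writes over reads."""
    return stats["w/s"] / stats["r/s"], stats["wkB/s"] / stats["rkB/s"]

# The sda line from the iostat sample in this thread.
sda = ("sda 0.60 142.80 67.20 170.20 678.40 2292.80 339.20 1146.40 "
       "12.52 118.53 615.66 4.21 99.92")
name, stats = parse_device_line(sda)
by_ops, by_kb = write_read_ratio(stats)
print(f"{name}: {by_ops:.1f}x writes by requests, {by_kb:.1f}x by kB")
```

For this sample that works out to roughly 2.5 times as many write requests as read requests and about 3.4 times as many kilobytes written as read.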
Matt wrote:
Right now it's running pretty good but here it is.
10:10:04 AM  CPU   %user   %nice %system %iowait   %idle
10:20:01 AM  all   13.87    0.00    2.45   59.20   24.49
10:30:03 AM  all   22.26    0.00    3.68   53.51   20.54
10:40:01 AM  all   20.58    0.00    3.78   55.40   20.24
[ Stuff Deleted ... ]
01:00:01 PM  all   11.53    0.00    2.38   20.96   65.12
Average:     all   17.49    0.00    4.03   46.29   32.19
A while after the reboot it straightened itself out. Yesterday "w" was indicating a load average of 120 or more at times. Today after the reboot all is good.
Matt
Hi Matt,
Your %iowait seems high.
I had %iowait comparable to yours with a single 200 GB 7200 RPM IDE drive.
Now we've upgraded this server: Opteron 2216 with 4 GB RAM, CentOS 5.2 64-bit, and an Adaptec 3405 plus 4 x 73 GB Seagate 15K RPM drives (RAID 10). No more %iowait! I benched it at about 140 MB/s, and random reads and writes are very good on these drives.
We have 55 employees running Outlook connecting to CommuniGate with MAPI. The dataset is about 100 GB. CommuniGate uses mbox files for storage.
Guy Boisvert, ing. IngTegration inc.
Yeah. Without dangerous write-back caching, as far as performance (as opposed to space) goes, expensive SAS disks (or even the expensive 15K SATA disks) will get you a lot more than any amount of RAID on cheap 7200 RPM SATA disks.
Really, once you have the 15K disks, I'm sure a mirror would be fine, if 73G is enough.
The new (and really expensive) solid-state flash drives are also really good for reducing your iowait.
The other thing to check is vmstat: if you have lots of si and so, your disk is slow because you are swapping a lot. Add more RAM and the problem goes away.
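A quick way to apply that check to a vmstat sample, sketched in Python. The function name `swap_activity` is hypothetical, and it assumes the column header layout shown in Matt's original vmstat output.

```python
def swap_activity(header_line, value_line):
    """Map vmstat column names to values and return (si, so, wa)."""
    names = header_line.split()
    values = [int(v) for v in value_line.split()]
    sample = dict(zip(names, values))
    return sample["si"], sample["so"], sample["wa"]

# Column header and values from the vmstat output earlier in the thread.
header = "r b swpd free buff cache si so bi bo in cs us sy id wa"
values = "0 2 0 1558496 456916 1087224 0 0 198 749 795 537 18 4 27 50"
si, so, wa = swap_activity(header, values)
if si == 0 and so == 0:
    print(f"no swapping in this sample; iowait is {wa}%")
else:
    print(f"swapping: si={si} so={so}")
```

In Matt's sample si, so, and swpd are all 0, so swapping does not look like the cause here, though a single vmstat line is only the since-boot average.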
How many concurrent users? If you stay with SATA, you will probably have to increase the number of disks in your storage array (depending on the number of users). A SATA disk only does 80 I/Os per second or so, and ClamAV plus SpamAssassin alone will consume a lot of these. Try putting their working directories on a swap-backed tmpfs.
cheers, Rainer
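A back-of-the-envelope spindle estimate along those lines, sketched in Python. The assumptions are the rough 80 I/Os per second per SATA disk mentioned above, a mirrored layout (RAID 1 or RAID 10) where every logical write lands on two disks, and the r/s and w/s values from Matt's iostat sample. The function name is made up, and real arrays will behave differently depending on caching, request merging, and access pattern.

```python
import math

def disks_needed(reads_per_s, writes_per_s, iops_per_disk=80,
                 write_penalty=2):
    """Estimate disks needed for a mirrored array (RAID 1/10).

    write_penalty=2 reflects that each logical write goes to both
    halves of a mirror. iops_per_disk is a rule-of-thumb assumption.
    """
    backend_iops = reads_per_s + write_penalty * writes_per_s
    return math.ceil(backend_iops / iops_per_disk)

# r/s and w/s taken from the sda line of the iostat sample.
print(disks_needed(67.20, 170.20))
```

With those inputs the estimate is 6 disks. The sample was taken on a disk that was already saturated, so treat the result as an order-of-magnitude guide rather than a sizing answer.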
Matt wrote:
I have an email server running Exim, Dovecot, Spamassassin, Clam, etc.
Firstly, I'd drop Dovecot completely; cyrus-imapd has, for me, been a lot faster and better optimised for situations where you have more than a handful of users.
Secondly, the Exim configs out of the box on CentOS are *really* not optimized for performance at all. If you handle more than a few thousand emails an hour, I wouldn't be surprised if Exim is to blame for jamming your I/O pipe just trying to work out which email to attempt delivery for next.
This is a DirectAdmin-type server, so I really have no option to change services. I am running Exim version 4.60 and I wonder if updating to 4.69 might help.
Matt