At 13:49 -0400 2/10/07, Ross S. W. Walker wrote:
Sounds like the issue is more of a CPU issue than a disk issue, so just upgrading the hardware and OS should make a big difference in itself,
Yeah, that was the plan :-) Basically, we worked out what we needed to do (alleviate peak load CPU bottleneck by upgrading hardware), sought what we imagined would be suitable (dual faster CPU, hardware RAID 1, lots of RAM), and then ran into a brick wall with disk performance while testing - something that's never been an issue to date on the existing webservers which have a single IDE disk each.
but I would profile the SQL queries to make sure they are not trying to bite off more than they need to.
Fair point - we've done a lot of database tuning in the 5 years this app's been under development, so that's pretty well covered. With the existing hardware (the back-end dbserver's a 1GB 1.6GHz P4 with mdadm RAID 1), the dbserver load barely reaches 1 even under peak traffic - we're not SQL- or IO-bound, we're CPU-bound on the front end.
Well, when you created the file system, the write cache wasn't installed yet, right?
True, but there have been many wipes and installs since the BBUs became available, and the long pauses while the inodes are created, which initially drew my attention, are still apparent. They are much more noticeable with CentOS 4.5 than with 5, but then the default nr_requests is 128 in 5 rather than 8192 in 4.5.
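For anyone wanting to compare the queue depth between installs, here is a minimal sketch that reads nr_requests from sysfs. The device name "sda" is an assumption (the 3ware array may show up under a different name), and the helper name is purely illustrative.

```python
from pathlib import Path


def read_nr_requests(device, sysfs_root="/sys/block"):
    """Return nr_requests for a block device, or None if it cannot be read.

    `device` (e.g. "sda") is an assumed name; substitute whatever the
    RAID unit appears as on the machine in question.
    """
    path = Path(sysfs_root) / device / "queue" / "nr_requests"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing device, unreadable file, or unexpected contents.
        return None


if __name__ == "__main__":
    print(read_nr_requests("sda"))
```

Running this on both the 4.5 and 5 installs would at least confirm whether the two really differ in queue depth before blaming it for the pauses.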
And it may be that when you were creating the file system it was right after you created the RAID1 array and the controller may have been still sync'ing up the disks, which will slow things down tremendously.
I noted that from the documentation at the outset and did an initial verify of the RAID 1 through the 3ware BIOS before doing the original install. A previous life as a technical author makes me a bit of a RTFM freak :-)
I agree that it is the edge cases that can come back and bite you; just be sure you don't over-scope those edge cases for situations that will never arise.
That's why I'm now building the machine as if there wasn't an issue, so I can hammer it with apachebench and see if I'm tilting at windmills or not.
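A rough sketch of how such an apachebench run might be scripted, assuming ab is on the PATH. The URL, request count and concurrency below are placeholder values, not the ones used on this machine, and both helper names are hypothetical.

```python
import re
import subprocess


def build_ab_command(url, requests=1000, concurrency=50):
    """Build an ab command line: -n is total requests, -c is concurrency."""
    return ["ab", "-n", str(requests), "-c", str(concurrency), url]


def parse_requests_per_second(ab_output):
    """Extract the 'Requests per second' figure from ab's report, if present."""
    match = re.search(r"Requests per second:\s+([\d.]+)", ab_output)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    # Placeholder target; point this at the server under test.
    cmd = build_ab_command("http://localhost/", requests=1000, concurrency=50)
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(parse_requests_per_second(result.stdout))
```

Repeating the run at a few concurrency levels while watching disk activity should show whether the pauses matter under a realistic web load.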
S.