Hi folks.
I'm posting this to both the Fedora and CentOS lists in hopes that somewhere, someone can help me figure out what's going on. I have a dual Xeon 3GHz server that's performing rather slowly when it comes to disk activities.
The machine is configured with a single 160 GB OS drive (with CentOS 5.0) and 4x500 GB drives set up in a RAID-5 configuration. All of the drives run at a 3.0 Gbps SATA link speed, which the motherboard also supports. Looking in dmesg when the system comes up, I see that reflected as well:
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA/133, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
ata3.00: configured for UDMA/133
scsi3 : ahci
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/133
scsi4 : ahci
ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
ata5.00: configured for UDMA/133
scsi5 : ahci
Now, I don't know what the performance numbers *should* be, but a 1.8 GiB copy on the RAID (cp from one location to another on the RAID) finishes in just under 50 seconds. If I then delete the folder (rm -rf FOLDER), it takes a few seconds; however, if I delete just the CONTENTS of the folder, it finishes within a fraction of a second (but then 'sync' takes a few seconds to catch up).
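For reference, one way to time that kind of copy so the number includes the flush to disk (rather than just filling the page cache) is a rough sketch like this; the paths are hypothetical, and the dd run is only a crude sequential-write check:

cd /raid/somewhere                                      # hypothetical RAID mount point
time sh -c 'cp -a testfolder testfolder.copy && sync'   # copy plus flush

# crude sequential-write check: write 1 GiB of zeroes, then flush
time sh -c 'dd if=/dev/zero of=ddtest bs=1M count=1024 && sync'
rm -rf testfolder.copy ddtest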
That same folder that I'm copying contains 452 JPEG files, ranging from 2.5 to 6.2 MiB. Doing some image processing on them is where it takes a long time. At the moment I'm doing a simple thumbnail creation with the ImageMagick suite (convert FILE -thumbnail "200x200>" `basename FILE .jpg`.th.jpg) and it takes upwards of 8 minutes to complete. The whole time it's running, 'top' reports the load average as 1.06, 1.00, 0.81, and the CPU usage is around 9%.
Interestingly, if I run the same command and have it create .png instead, it takes longer, but I won't go there just yet. My question is, is this the expected performance on something like this, or should I be able to get better results? Is there something I should or could do to speed up disk based processes?
Or is this something where it's more memory intensive and I need to look at adding more (right now it has 2 GiB of memory.)
This problem is causing one of our web sites to time out because it's trying to process hundreds of image files and generate thumbnails, and it's taking forever to do that. So I'm starting at the bottom of the pile here, hardware. If it turns out the hardware is fine, and there's nothing else that can be done to speed it up, then I'll move forward to other possible culprits, such as the routines within the site scripts themselves...
On Mon, April 30, 2007 1:58 pm, Ashley M. Kirchner wrote:
Hi folks. I'm posting this to both the Fedora as well as the CentOS lists in
hopes that somewhere, someone can help me figure out what's going on. I have a dual Xeon 3GHz server that's performing rather slow when it comes to disk activities.
The machine is configured with a single 160 GiB OS drive (with
CentOS 5.0) and 4x500 GiB drives setup in a RAID-5 configuration. All drives are setup for 3.0 GiB SATA link, and the motherboard also supports that. Looking in dmesg when the system comes up, I see that reflected as well:
Now, I don't know what performance numbers *should* be, but on a 1.8
GiB copy on the RAID (cp from one location to another on the RAID), it gets done in just under 50 seconds. If I try to delete the folder afterwards (rm -rf FOLDER) it takes a few seconds to do so, however if I delete the CONTENTS of the folder, it does so within a fraction of a second (but then 'sync' takes a few seconds to catch up.)
That same folder that I'm copying contains 452 jpeg files in it,
ranging from 2.5 to 6.2 MiB. Doing some image processing on them is where it takes a long time. At the moment I'm doing a simple thumbnail creation with the ImageMagick suite (convert FILE -thumbnail "200x200>' `basename FILE .jpg`.th.jpg) and it takes upwards of 8 minutes to complete. The whole time it's running, 'top' reports the server load as follows: load average: 1.06, 1.00, 0.81 And the CPU usage is around 9%.
Interestingly, if I run the same command and have it create .png
instead, it takes longer, but I won't go there just yet. My question is, is this the expected performance on something like this, or should I be able to get better results? Is there something I should or could do to speed up disk based processes?
Or is this something where it's more memory intensive and I need to
look at adding more (right now it has 2 GiB of memory.)
This problem is causing one of our web sites to time out because
it's trying to process hundreds of image files and generate thumbnails, and it's taking forever to do that. So I'm starting at the bottom of the pile here, hardware. If it turns out the hardware is fine, and there's nothing else that can be done to speed it up, then I'll move forward to other possible culprits, such as the routines within the site scripts themselves...
Try running "hdparm -Tt /dev/sda" replace /dev/sda with the device name for your hard disks. After running that post back here with the results.
You can also install sysstat and use the sar command to see if your system is indeed spending a lot of time waiting on disk reads and writes.
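For example, something along these lines (the device names are assumptions based on the drives described above; sar comes with the sysstat package):

# quick cached/buffered read benchmark on each disk
for dev in /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
    hdparm -Tt $dev
done

# once sysstat is installed: CPU utilization, including %iowait,
# sampled every 5 seconds, 12 samples
sar -u 5 12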
Try running "hdparm -Tt /dev/sda" replace /dev/sda with the device name for your hard disks. After running that post back here with the results.
You can also install sysstat and use the sar command to see if your system is indeed spending a lot of time waiting on disk reads and writes.
and for bonus points, from that same sysstat package...
iostat -x 5
during a highly disk-intensive session. Ignore the first sample of output - that's the average since boot; the 5 means sample every 5 seconds. This prints a wealth of useful information on the state of disk I/O on a per-drive basis.
BTW, a rather useful performance tool for Linux systems -> http://www-128.ibm.com/developerworks/aix/library/au-analyze_aix/index.html
(originally for AIX, it's been ported to Linux... catch-22: its output is best processed by an Excel spreadsheet to produce graphs... but you can use it on its own.)
On 4/30/07, Ashley M. Kirchner ashley@pcraft.com wrote:
The machine is configured with a single 160 GB OS drive (with CentOS 5.0) and 4x500 GB drives set up in a RAID-5 configuration. [...]
What raid controller are you using? Have you read the disk tuning guide on the wiki (http://wiki.centos.org/HowTos/Disk_Optimization)? Also note that raid 5 is not the fastest option around. If you're really looking for speed, you should look at using raid 10.
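For what it's worth, if those four drives are Linux software RAID (md), rebuilding the data array as RAID 10 would look roughly like the sketch below. The device and partition names are assumptions, and recreating the array destroys its contents, so treat this as an illustration only:

# DESTRUCTIVE illustration only: recreate a 4-disk md array as RAID 10
mdadm --stop /dev/md6                                  # assumed array device
mdadm --create /dev/md6 --level=10 --raid-devices=4 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1          # assumed member partitions
mkfs.ext3 /dev/md6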
Ashley M. Kirchner wrote:
Hi folks.
I'm posting this to both the Fedora and CentOS lists in hopes that somewhere, someone can help me figure out what's going on. I have a dual Xeon 3GHz server that's performing rather slowly when it comes to disk activities. [...]
First, I would suggest against cross-posting; the two OSes are not exactly the same. Also, you should specify your exact OS.
Regarding the 2 CPUs, do you have the service irqbalance running? Also if not running CentOS 5.0, have you installed the SMP kernel?
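A quick way to check both of those on a CentOS box (sketch only):

service irqbalance status            # is irqbalance running right now?
chkconfig --list irqbalance          # is it enabled at boot?
uname -a                             # an SMP kernel reports "SMP" here
grep -c '^processor' /proc/cpuinfo   # number of logical CPUs the kernel sees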
Regarding parallel processing, I do not know much about GNU/Linux parallelization; however, I assume a given script runs on only one of the two CPUs.
Obviously there are others here with more insight about this than me.
Ioannis Vranos wrote:
First, I would suggest against cross-posting; the two OSes are not exactly the same. Also, you should specify your exact OS.
I did in my original message: CentOS 5.0
Regarding the 2 CPUs, do you have the service irqbalance running? Also if not running CentOS 5.0, have you installed the SMP kernel?
irqbalance is running, and so is an SMP kernel:
Linux bigbertha 2.6.18-8.1.1.el5 #1 SMP Mon Apr 9 09:46:54 EDT 2007 i686 i686 i386 GNU/Linux
The dual Xeon processors come up as 2 physical and 8 logical.
-- A
On Apr 30, 2007, at 12:48 PM, Ashley M. Kirchner wrote:
The dual Xeon processors come up as 2 physical and 8 logical.
I think your 9% CPU usage is misleading. It probably means you're using one out of eight processors for 72% of your eight minutes. Add in the iowait time (which should be similar to the time the copy took - 11% of your eight minutes) and you're not far off from explaining the whole time.
In top on Linux, you can hit 'I' to make the CPU percentages go up to 800% instead of 100%. Or '1' to show each processor individually.
You can make it go much faster by running more than one convert at a time. Each convert task is fully independent, so if this were 100% CPU usage, you could go almost eight times faster.
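As a sketch of what "more than one convert at a time" could look like straight from the shell, using the same FILE -> FILE.th.jpg naming as the command earlier in the thread (the batch size of 8 matches the 8 logical processors):

#!/bin/sh
# Run the converts 8 at a time: start 8 in the background, wait for the
# batch to finish, then start the next 8. Crude, but it spreads the work
# across the CPUs.
NPROC=8
i=0
for f in *.jpg; do
    base=`basename "$f" .jpg`
    convert "$f" -thumbnail "200x200>" "$base.th.jpg" &
    i=`expr $i + 1`
    [ `expr $i % $NPROC` -eq 0 ] && wait
done
wait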
Scott Lamb wrote:
I think your 9% CPU usage is misleading. It probably means you're using one out of eight processors for 72% of your eight minutes. Add in the iowait time (which should be similar to the time the copy took
- 11% of your eight minutes) and you're not far off from explaining
the whole time.
In top on Linux, you can hit 'I' to make the CPU percentages go up to 800% instead of 100%. Or '1' to show each processor individually.
Well, that would make sense...
Cpu0  : 20.2%us,  7.1%sy,  0.0%ni, 72.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  : 52.5%us, 13.9%sy,  0.0%ni, 28.7%id,  5.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2074932k total,   745108k used,  1329824k free,    26328k buffers
Swap:  2096440k total,        0k used,  2096440k free,   547884k cached
And iostat -x 5 shows:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.95    0.00    2.90    0.87    0.00   87.28

Device:  rrqm/s  wrqm/s    r/s   w/s   rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00    4.39   0.00  0.40     0.00   38.32    96.00     0.00   0.00   0.00   0.00
sdb      214.77    3.39  22.55  1.40  1898.60   44.71    81.13     0.05   2.03   1.71   4.09
sdc      207.58    6.19  18.96  2.00  1812.38   71.86    89.90     0.04   1.83   1.47   3.07
sdd      209.78    0.80  21.16  1.00  1847.50   20.76    84.32     0.05   2.31   1.70   3.77
sde      214.57    2.00  19.36  1.20  1871.46   31.94    92.58     0.03   1.60   1.52   3.13
md0        0.00    0.00   0.00  2.79     0.00   22.36     8.00     0.00   0.00   0.00   0.00
md9        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
md8        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
md7        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
md6        0.00    0.00  73.05  4.59  7356.49   36.73    95.22     0.00   0.00   0.00   0.00
md5        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
md4        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
md3        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
md2        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
md1        0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
dm-0       0.00    0.00  73.05  7.39  7356.49   59.08    92.19     0.21   2.59   0.99   7.98
dm-1       0.00    0.00   0.00  0.00     0.00    0.00     0.00     0.00   0.00   0.00   0.00
And when the process is done, some 8 and a half minutes later, iostat says:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.02    0.00    0.05    0.00    0.00   99.93

Device:  rrqm/s  wrqm/s    r/s   w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00    4.60   0.00  0.40    0.00   40.00   100.00     0.00   4.50   4.50   0.18
sdb        0.00    0.00   0.20  1.00    1.60   14.40    13.33     0.03  24.33  19.17   2.30
sdc        0.00    1.80   0.00  1.40    0.00   32.00    22.86     0.03  13.57  16.86   2.36
sdd        0.00    0.00   0.20  1.00    1.60   14.40    13.33     0.02  24.00  19.00   2.28
sde        0.00    1.80   0.00  1.40    0.00   33.60    24.00     0.04  17.71  21.00   2.94
md0        0.00    0.00   0.00  2.60    0.00   20.80     8.00     0.00   0.00   0.00   0.00
md9        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md8        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md7        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md6        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md5        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md4        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md3        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md2        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
md1        0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
dm-0       0.00    0.00   0.00  2.60    0.00   20.80     8.00     0.03   6.23   7.31   1.90
dm-1       0.00    0.00   0.00  0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
So, anyone going to make sense out of that please?
On Apr 30, 2007, at 2:33 PM, Ashley M. Kirchner wrote:
So, anyone going to make sense out of that please?
Sure. If you're doing this sequentially:
for file in *.jpg; do
    convert $file -thumbnail 200x200 > `basename $file .jpg`.th.jpg
done
it's never going to be using more than one processor. Some of the time it's using processor 0, sometimes processor 5. (Keep in mind that those percentages are averaged over some unit of time. As one of my coworkers is fond of saying, at a given instant there's no such thing as a 50% busy CPU - it's either 100% or 0% busy.) The processor 0 and 5 user and system numbers add up to a bit less than 100%.
You could do something like this instead:
$ cat > GNUmakefile <<'EOF'
SOURCES = $(wildcard *.jpg)
THUMBNAILS = $(SOURCES:%.jpg=%.th.jpg)

.PHONY: thumbnails
thumbnails: $(THUMBNAILS)

%.th.jpg: %.jpg
	convert $< -thumbnail 200x200 > $@ || (rm $@; false)

.PHONY: clean
clean:
	rm -f *.th.jpg
EOF
(note that those indentations have to be tabs, not spaces)
$ make clean
$ time make -j1
$ make clean
$ time make -j8
I'd expect the first timed make to take about eight minutes and the second timed make to take somewhere between one and two minutes.
Now, going back to your first post:
This problem is causing one of our web sites to time out because it's trying to process hundreds of image files and generate thumbnails, and it's taking forever to do that. So I'm starting at the bottom of the pile here, hardware. If it turns out the hardware is fine, and there's nothing else that can be done to speed it up, then I'll move forward to other possible culprits, such as the routines within the site scripts themselves...
You're not trying to generate thumbnails on every hit, are you? Even with all the might of eight processors and this RAID array, you won't get this task to complete in the .1 sec that usability experts say is the maximum acceptable page load time. You're going to have to precompute them. Or if you're doing it on upload and that's timing out, then redirect the browser to a progress bar page before converting or something.
Scott Lamb wrote:
I'd expect the first timed make to take about eight minutes and the second timed make to take somewhere between one and two minutes.
I'll try that in a minute here...
You're not trying to generate thumbnails on every hit, are you? Even with all the might of eight processors and this RAID array, you won't get this task to complete in the .1 sec that usability experts say is the maximum acceptable page load time. You're going to have to precompute them. Or if you're doing it on upload and that's timing out, then redirect the browser to a progress bar page before converting or something.
Actually, the way the vendor programmed it, we upload anywhere between 4 and 800 images, and then when someone goes to view that page, that's when it creates the thumbs on the fly. Now, we've requested this process be changed so that the thumbnails are created in the background after uploading, so that when someone goes to view them, they're already created and just pulled from cache.
Funny how the vendor seemed to think that generating thumbnails on the fly, all 800 of them, would take, according to them, 'a fraction of a second per image' - sure, when you have 3 images, but not when you have 800 images.
On Apr 30, 2007, at 2:56 PM, Ashley M. Kirchner wrote:
Actually, the way the vendor programmed it, we upload anywhere between 4 and 800 images, and then when someone goes to view that page, that's when it creates the thumbs on the fly. Now, we've requested this process be changed so that the thumbnails are created in the background after uploading, so that when someone goes to view them, they're already created and just pulled from cache.
Funny how the vendor seemed to think that generating thumbnails on the fly, all 800 of them, would take, according to them, 'a fraction of a second per image' - sure, when you have 3 images, but not when you have 800 images.
Ugh, yeah, that's...true if they generate the thumbnail on the actual hit for the thumbnail and they live in fantasy-land where only one person is browsing the site and that person's web browser asks for one image at a time. In reality-land, there will be multiple users and multiple simultaneous requests/user, so more outstanding requests than processors, so requests can get at best delayed an additional queue_size*per_image_time/n_processors, and more likely slowed even more than that due to thrashing. I could easily see hitting some timeout for individual GET requests as well as one for the total page.
In your position after hearing that crap I'd tell them to keep the new image set hidden from users until the thumbnails are generated, to provide a progress bar to the uploader, maybe some indication of pending images in the interface for other admins, and probably to provide pagination to limit the images/page to a more reasonable number. And if I'm feeling particularly unforgiving, to launch n_processors low-priority processes/threads to do the work so that it completes quickly if the server is idle yet doesn't penalize user requests going on at the same time.
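Following on from that last point, a sketch of running the whole batch at the lowest CPU priority, reusing the GNUmakefile from earlier in the thread (the directory is hypothetical):

cd /path/to/uploaded/images       # hypothetical gallery directory
nice -n 19 make -j8 thumbnails    # use all CPUs, but yield to web requests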
(I'm not nice to vendors who say stupid things.)
Scott Lamb wrote:
(note that those indentations have to be tabs, not spaces)
Um, what shell are you working in? I can't hit TAB in mine because it tries to complete a file name - it's bash's tab completion...
Never mind, Scott, I figured it out. I ended up doing it slightly differently, with the same result. And I had to "fix" the convert line, but it's running now, so we'll see what happens.
Scott Lamb wrote:
$ make clean
$ time make -j1
$ make clean
$ time make -j8
So, as you guessed, -j1 took just under 10 minutes whereas -j8 took under 2, which is what I would expect. Okay, thanks for the help. It's time I go beat up our vendor now. As far as I understand, their explanation is that they simply call the thumbnail creation routine sequentially, one image at a time, which in turn goes to PHP, which then goes to Apache. And as far as the OS is concerned, it received a single call from Apache to do something, as opposed to multiple calls at the same time.
Would that be a reasonable explanation (of why their process takes so long)? Because they're not spawning multiple calls ... ?
On Apr 30, 2007, at 3:31 PM, Ashley M. Kirchner wrote:
As far as I understand, their explanation is that they simply call the thumbnail creation routine sequentially, one image at a time, which in turn goes to PHP, which then goes to Apache. And as far as the OS is concerned, it received a single call from Apache to do something, as opposed to multiple calls at the same time.
Would that be a reasonable explanation (of why their process takes so long)? Because they're not spawning multiple calls ... ?
Oh, I see. So they're actually generating all the thumbnails as they generate the HTML that links to them? Yeah, that won't do. Not taking advantage of all the processors is the least of their problems; they also:
- have a single GET request which takes way, way, way too long to respond - n_unprocessed_thumbnails*time_per_thumbnail >>> 0.1 sec for n_unprocessed_thumbnails=800, even if there's no contention for the processor
- are forking from Apache, which is a bad idea, especially if it's multithreaded (are you using mpm_worker)? Big processes don't fork quickly. Much better to use a library to do this.
Scott Lamb wrote:
- have a single GET request which takes way, way, way too long to
respond - n_unprocessed_thumbnails*time_per_thumbnail >>> 0.1 sec for n_unprocessed_thumbnails=800, even if there's no contention for the processor
- are forking from Apache, which is a bad idea, especially if it's
multithreaded (are you using mpm_worker)? Big processes don't fork quickly. Much better to use a library to do this.
I think at this point, I may have to write my own routine and have them tie into it, since having them re-code their interface for multithreading might not work so well.
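A minimal sketch of what such a standalone routine might look like, assuming the same naming convention as before; the script name, paths, and lock-file convention are all made up, and the site would simply start it in the background after an upload:

#!/bin/sh
# make-thumbs.sh (hypothetical): generate missing thumbnails for one
# gallery directory. Intended to be started in the background, e.g.
#   nohup /usr/local/bin/make-thumbs.sh /path/to/gallery &
DIR="$1"
LOCK="$DIR/.thumbs.lock"

[ -d "$DIR" ] || exit 1
[ -e "$LOCK" ] && exit 0              # another run is already working on this dir
touch "$LOCK"

cd "$DIR"
for f in *.jpg; do
    [ -e "$f" ] || continue           # empty directory: the glob didn't match
    base=`basename "$f" .jpg`
    [ -e "$base.th.jpg" ] && continue # thumbnail already exists
    nice -n 19 convert "$f" -thumbnail "200x200>" "$base.th.jpg"
done

rm -f "$LOCK"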
And I'm running Apache 1.3 only because PHP doesn't take advantage of Apache 2.0 anyway - it'll always run at 1.3...
Hey Scott,
I did a test running convert, single processor and got the following timing:
ImageMagick 'convert', 1 process, JPEG (command line)
real    8m35.515s
user    6m18.674s
sys     1m46.344s
Then I wrote a small PHP script that does the same thing (read image, resize with constrain) and ran it through the command line, and got this:
PHP, 1 process, JPEG (through PHP CLI)
real    36m17.260s
user    35m38.972s
sys     0m30.651s
Why oh why is it so much slower?! Is there something inherently slow within PHP that causes it to be so much slower (which in turn causes the Apache process to also take an incredible amount of time to finish the same task)?
On Apr 30, 2007, at 5:07 PM, Ashley M. Kirchner wrote:
I did a test running convert, single processor and got the following timing:
ImageMagick 'convert', 1 process, JPEG (command line)
real    8m35.515s
user    6m18.674s
sys     1m46.344s
Then I wrote a small PHP script that does the same thing (read image, resize with constrain) and ran it through the command line, and got this:
PHP, 1 process, JPEG (through PHP CLI)
real    36m17.260s
user    35m38.972s
sys     0m30.651s
Why oh why is it so much slower?! Is there something inherently slow within PHP that causes it to be so much slower (which in turn causes the Apache process to also take an incredible amount of time to finish the same task)?
Interesting. I'm not sure, as I don't use PHP. You might have better luck getting an answer if you post your test scripts to a PHP list.
A few things I might do if I were trying to diagnose this:
- throw in a few getrusage()s to see which section of the script is using the most time - maybe I'd get lucky and see one obviously wrong
- ltrace shows all calls to shared libraries - maybe it could uncover something weird
- use whatever profiling tool is available for profiling PHP scripts (in Python, I'm fond of hotshot)
- find out what underlying library they're using and try it from a different language (Python, C, whatever) to see if the library or PHP is the problem
- compile the PHP interpreter itself with profiling and run it through gprof to see where the time went
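For the ltrace and timing suggestions above, the invocations might look roughly like this (resize.php and test.jpg are placeholder names):

# summarize time spent in shared-library calls for one resize
ltrace -c -o ltrace-summary.txt php resize.php test.jpg

# compare wall-clock and CPU time of the PHP path against plain convert
time php resize.php test.jpg
time convert test.jpg -thumbnail "200x200>" test.th.jpg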
On Mon, Apr 30, 2007 at 01:48:58PM -0600, Ashley M. Kirchner wrote:
Linux bigbertha 2.6.18-8.1.1.el5 #1 SMP Mon Apr 9 09:46:54 EDT 2007 i686 i686 i386 GNU/Linux
The dual Xeon processors come up as 2 physical and 8 logical.
Try disabling the logical processors (Hyperthreading). There is a VERY good chance you will notice improved performance.
Tip of the day: only use HT on workstations, or not even there.
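A quick way to confirm whether Hyperthreading is what's inflating the logical CPU count, before rebooting into the BIOS to turn it off (sketch only):

grep 'physical id' /proc/cpuinfo | sort -u | wc -l   # number of physical packages
# if "siblings" is larger than "cpu cores", Hyperthreading is active
grep -E 'siblings|cpu cores' /proc/cpuinfo | sort -u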
[]s
--
Rodrigo Barbosa
"Quid quid Latine dictum sit, altum viditur"
"Be excellent to each other ..." - Bill & Ted (Wyld Stallyns)