Thanks, using directories as file names is a great idea. Still, I'm not sure it would solve my performance issue, since the bottleneck is the disk and not MySQL.
The situation you described initially suffers from only one issue: too many files in a single directory. You are not the first to fight this; see qmail's Maildir, Squid, etc. The remedy is always the same: split the files into a directory tree. For a sample implementation, check out Squid, BackupPC, etc.
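To make the idea concrete, here is a minimal sketch in Python of one way to spread files over a directory tree. It is not how Squid or BackupPC actually do it; the function names, the choice of MD5, and the two-level, two-hex-digit layout are all assumptions for illustration. It also assumes plain file names without path separators.

```python
import hashlib
from pathlib import Path


def sharded_path(root, name, levels=2, width=2):
    """Map a file name to root/ab/cd/name, where ab and cd are the
    leading hex digits of the MD5 digest of the name (illustrative layout)."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Cut the first levels * width hex digits into one chunk per directory level.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, name)


def store(root, name, data):
    """Write data (bytes) under the sharded path, creating directories as needed."""
    path = sharded_path(root, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

With two levels of 256 directories each there are 65,536 leaf directories, so 15M files would average roughly 230 files per directory if the hash spreads them evenly.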
I just implemented directory names based on the hash of the file, and performance is a bit slower than before. This is the output of atop (15-second average):
PRC | sys 0.53s | user 5.43s | #proc 112 | #zombie 0 | #exit 0 |
CPU | sys 4% | user 54% | irq 2% | idle 208% | wait 131% |
cpu | sys 1% | user 24% | irq 1% | idle 54% | cpu001 w 20% |
cpu | sys 2% | user 15% | irq 1% | idle 31% | cpu002 w 52% |
cpu | sys 1% | user 8% | irq 0% | idle 52% | cpu003 w 38% |
cpu | sys 1% | user 7% | irq 0% | idle 71% | cpu000 w 21% |
CPL | avg1 10.58 | avg5 6.92 | avg15 4.66 | csw 19112 | intr 19135 |
MEM | tot 2.0G | free 49.8M | cache 157.4M | buff 116.8M | slab 122.7M |
SWP | tot 1.9G | free 1.2G | | vmcom 2.2G | vmlim 2.9G |
I am under the impression that you are swapping. Out of 2GB of RAM, you have just 157MB of cache and 116MB of buffers. What is eating the RAM? Why do you have 0.8GB of swap in use? You need more memory for the file system cache.
PAG | scan 1536 | stall 0 | | swin 9 | swout 0 |
DSK | sdb | busy 91% | read 884 | write 524 | avio 6 ms |
DSK | sda | busy 12% | read 201 | write 340 | avio 2 ms |
NET | transport | tcpi 8551 | tcpo 8204 | udpi 702 | udpo 718 |
NET | network | ipi 9264 | ipo 8946 | ipfrw 0 | deliv 9264 |
NET | eth0 5% | pcki 6859 | pcko 6541 | si 5526 Kbps | so 466 Kbps |
NET | lo ---- | pcki 2405 | pcko 2405 | si 397 Kbps | so 397 Kbps |
The cache is on sdb; everything else, including the MySQL database files, is on sda. Note that there are a lot of disk reads on sdb, yet I'm really only getting one file from disk for every 10 written, so my guess is that all the other reads are directory listings. Since I'm using the hash for directory names, (I think) this makes the Linux cache slower, because the files are spread more evenly and randomly across the directories.
I think the Linux file system cache is smart enough for this type of load. How many files per directory do you have?
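If you don't have that number handy, something along these lines would give it. This is only a rough sketch; the function names are made up for illustration, and keep in mind that walking the whole tree generates extra directory reads on a disk that is already busy, so you may want to run it off-peak.

```python
import os
from collections import Counter


def files_per_directory(root):
    """Return a Counter mapping every directory under root (root included)
    to the number of non-directory entries it holds directly."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        counts[dirpath] = len(filenames)
    return counts


def summarize(counts):
    """Reduce the per-directory counts to a few headline numbers."""
    values = sorted(counts.values())
    if not values:
        return None
    return {
        "dirs": len(values),
        "min": values[0],
        "max": values[-1],
        "mean": sum(values) / len(values),
    }
```

A large "max" compared with "mean" would suggest that a few directories are still overloaded.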
The app is running a bit slower than when it used the file name as the directory name, although I expect (not really sure) that it will do better as the number of files on disk grows (currently there are only 600k of the 15M files). My current performance is around 50 file I/O operations per second.
Something is wrong. Got to figure this out. Where did this RAM go?
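As a starting point, a quick breakdown of /proc/meminfo shows how much memory is page cache, how much swap is in use, and how much is held by everything else. The sketch below assumes the usual Linux field names (MemTotal, MemFree, Buffers, Cached, SwapTotal, SwapFree, all reported in kB); the function names are hypothetical, and "other_kb" is only a crude remainder that lumps together process memory, slab and the rest.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Field: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Split memory into page cache, swap in use, and everything else."""
    page_cache = info["Cached"] + info["Buffers"]
    return {
        "page_cache_kb": page_cache,
        "swap_used_kb": info["SwapTotal"] - info["SwapFree"],
        # Remainder: process memory, slab, kernel structures, and so on.
        "other_kb": info["MemTotal"] - info["MemFree"] - page_cache,
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and summarize the live values (Linux only)."""
    with open(path) as handle:
        return memory_summary(parse_meminfo(handle.read()))
```

Calling read_meminfo() on the box gives the summary; if "other_kb" accounts for most of the 2GB, the next step is to look at per-process resident memory, for example in atop's memory view.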