2008/10/17 Mag Gam <magawake at gmail.com>:
> Hi John:
>
> Well, we run a lot of statistical analysis and our code loads a lot of
> data into a vector for fast calculations. I am not sure how else to do
> these calculations fast without loading it into memory. Thats why we
> have to do it this way.

About 15 years ago I changed an application on SGI IRIX from reading text
files of floating-point numbers into memory with scanf(3) to mapping
binary files into memory with mmap(2). Processing time was cut by over
95%, and the application did much more in the remaining 5% (e.g. it
allowed interactive real-time viewing of different "frames" of data).

Using mmap'ed files means the system knows that these pages are backed
by blocks on the file system. They therefore won't take up so much
"buffer" space that has to be written out to the swap partition whenever
the memory is needed for something else; they only take disk cache
space, which can simply be freed if the pages were only read. You also
benefit if multiple processes access the same file: they'll share the
pages in memory too.

It's not a silver bullet. Access that is too random can still cause the
system to thrash, but at least it won't take up so much swappable
memory, and it will save a lot of copying (file->kernel->user when
reading, and the other way around when writing), system calls, etc.

If you can process the data in sequential order, possibly with the help
of madvise(2), you can probably squeeze even more out of this option.

--Amos
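
A minimal sketch of the approach described above, assuming the data is
stored as a raw array of native doubles with no header. The function
names (write_doubles, sum_doubles_mmap) are made up for illustration and
are not from the original application. The file is mapped read-only,
madvise(2) is used to hint at sequential access, and the values are read
straight out of the mapping without a separate read buffer.

```c
#define _DEFAULT_SOURCE /* exposes madvise() under strict -std= modes on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write n doubles to path in native binary form.
 * Returns 0 on success, -1 on failure. */
int write_doubles(const char *path, const double *values, size_t n)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t written = fwrite(values, sizeof(double), n, fp);
    int close_failed = (fclose(fp) != 0);
    return (written == n && !close_failed) ? 0 : -1;
}

/* Hypothetical helper: sum every double in a raw binary file by mapping
 * the file instead of read()ing it into a private buffer.
 * Returns 0 on success, -1 on failure. */
int sum_doubles_mmap(const char *path, double *out_sum, size_t *out_count)
{
    assert(path != NULL && out_sum != NULL && out_count != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    /* Read-only shared mapping: the pages are backed by the file itself,
     * so the kernel can drop them without writing anything to swap. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    /* Advisory only: tell the kernel we will walk the file front to back.
     * A failure here does not affect correctness, so it is ignored. */
    (void)madvise(map, len, MADV_SEQUENTIAL);

    const double *data = map;
    size_t n = len / sizeof(double);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];

    munmap(map, len);
    *out_sum = sum;
    *out_count = n;
    return 0;
}
```

To try it, write a few values with write_doubles() and pass the same
path to sum_doubles_mmap(). The binary format here is whatever the
machine uses natively, so such a file is not portable between
architectures with different endianness or floating-point layout.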