William L. Maltby wrote:
On Thu, 2006-06-08 at 11:33 -0400, Sam Drinkard wrote:
Jim Perrin wrote:
<snip>
Knowledge = power, so pursue that. In the meantime, here are some things that *used* to be good "Rules of Thumb" (if not sitting on it) that you might be alert for as you investigate. Unfortunately, some would demand that you have a test bed to be sure you can 1) recover if necessary and 2) see if it really works without killing the end-user attitude and (potentially) your future (although anyone from the 8088 days... ;-)
Definitely..
- Many DBMS claim (and rightfully so) big performance gains if they are
on a "raw partition" rather than residing in a file system. If it's a whole disk, you won't even need to partition it; Linux and at least one other *IX support operations without such bothersome things.
- If data is read predominantly sequentially and you are in a FS, specify
a huge block size when you make the file system. This has more effect if the number crunching is input-data-intensive. Concomitantly, HDs with large cache will contribute substantially to reduced wait time. As you might surmise, all forms of cache are a total waste if the data (whether key or base) is totally wrongly sequenced.
Block size might be an option for sure. The machine is at the defaults right now, but it would not be difficult to back stuff up, let the disk get scribbled over with an increased block size, then restore. Unfortunately, this is nothing like a database, at least from what I know about how it works, so I doubt DBMS-type stuff would apply.
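Before backing up and re-making the file system, it may be worth confirming what block size the default actually gave. A minimal sketch in Python, assuming a Unix-like system where `os.statvfs` is available; the helper name `fs_block_sizes` is made up for illustration:

```python
import os

def fs_block_sizes(path):
    """Return (preferred I/O block size, fundamental block size) in bytes
    for the file system that holds `path`, as reported by statvfs."""
    st = os.statvfs(path)
    return st.f_bsize, st.f_frsize

# Example: inspect the file system holding the current directory.
preferred, fundamental = fs_block_sizes(".")
print("preferred block size:", preferred, "bytes")
print("fundamental block size:", fundamental, "bytes")
```

Point it at the mount point that holds the model's input data rather than ".". What the file system will accept as a larger block size is a separate question this sketch does not answer.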
- Ditto if reads are random but tend to be heavily grouped in consecutive
key ranges. In order for this to be effective, data should occur in most frequently accessed order and, optimally, the index for that sequence should be smack in the middle of the data blocks (i.e. first on disk is approx. 50% of the data, then the index with some slack for growth, and then the rest of the data). Better is the index on another disk on another IDE (or whatever) channel. Can everybody say "Bus Mastering"? It's hard to keep things organized this way on a single disk, but since you're doing a batch operation, maybe you can make a copy as needed (or is there an HD backup frequently made?) and operate on that.
There again, it would take more knowledge about the model than I have available to know what goes on.
As an anecdote demonstrating how much it matters that the data ordering matches the application's access pattern: in 1988(?) a n00b admin on Oracle couldn't understand why my app took 1:15 minutes to gen a screen. I *knew* it wasn't my program; I'm a performance enthusiast. We talked a bit, and I told him to reload his DB in a certain sequence.
Result: full screen in about 7 seconds.
To maintain that performance does require periodic reload/reorg.
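Why reloading in access order helps can be shown with a toy model. This is only a sketch under stated assumptions: records are packed ten to a block, the application walks the keys in ascending order, and every change of block stands in for a disk access. The function name `block_jumps` and all the numbers are made up; it says nothing about how Oracle actually lays out data.

```python
import random

def block_jumps(physical_order, access_order, records_per_block=10):
    """Count how often consecutive accesses land in a different block.

    physical_order: keys in the order they sit on disk.
    access_order:   keys in the order the application reads them.
    """
    position = {key: i for i, key in enumerate(physical_order)}
    jumps = 0
    last_block = None
    for key in access_order:
        block = position[key] // records_per_block
        if block != last_block:
            jumps += 1
        last_block = block
    return jumps

keys = list(range(1000))

# Data loaded in arbitrary order (fixed seed so the run is repeatable).
shuffled = keys[:]
random.Random(0).shuffle(shuffled)

scattered = block_jumps(shuffled, keys)         # layout ignores access order
reloaded = block_jumps(sorted(shuffled), keys)  # layout matches access order

print("block changes, scattered layout:", scattered)
print("block changes, reloaded layout: ", reloaded)  # 100: one per block
```

In the reloaded layout each block is touched once. In the scattered layout nearly every record read lands in a different block, which is the effect the reload removes.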
- Defrag the appropriate parts occasionally. Whether a file system or
raw partition, stuff must occasionally go into "overflow areas". Unless your DBMS reorgs/defrags on-the-fly (bad for user response times, hence complaints), the easiest/fastest/cheapest way is cross-HD copies/reloads. After each cross-HD copy, scripts direct the apps to the right disk (on start up, check a timestamp and file/part name generated by the reorg process). This only works if you have a quiescent time.
Not a lot of free time on here. As said before, about 4 hours of idle time per 24 hours.
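The "check a timestamp on start up" step from the cross-HD suggestion could look roughly like this. It is a sketch, not anybody's actual script: it assumes the reorg process drops a marker file into each copy when it finishes, and the marker name `reorg.stamp` and the function `pick_active_copy` are hypothetical.

```python
import os

def pick_active_copy(candidates, marker="reorg.stamp"):
    """Return the candidate directory whose marker file is newest.

    Directories without a marker are skipped, on the assumption that the
    reorg never completed there. Returns None if no candidate qualifies.
    """
    best, best_mtime = None, None
    for directory in candidates:
        stamp = os.path.join(directory, marker)
        try:
            mtime = os.stat(stamp).st_mtime
        except OSError:
            continue  # marker missing or unreadable: ignore this copy
        if best_mtime is None or mtime > best_mtime:
            best, best_mtime = directory, mtime
    return best
```

A start-up script would call it with the mount points of the two disks and direct the app at whichever path comes back, refusing to start if it gets None.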
- As Jim (IIRC) mentioned, avoiding access time updates can provide
gains. If you are fortunate enough that no writes need to be done to the whole partition while you are crunching the chunks (if the partition has only the DB; if not, consider making it so, Mr. Spock), remount the partition read-only (mount -o remount,ro <partition-or-label-or-even-mount-point>) for the duration of the run. This also helps in some VM/kernel-internal ways, to a very small degree.
Have implemented the noatime, and should know something more about whether it helps in about 2 hours when this run completes.
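To confirm that an option such as noatime (or ro after a remount) really took effect, the mount table can be checked. A minimal sketch, assuming the usual Linux `/proc/mounts` layout of device, mount point, type, and comma-separated options; the function name and the sample text are made up for illustration:

```python
def mount_options(mounts_text, mount_point):
    """Return the set of mount options for `mount_point`, or None.

    `mounts_text` is text in /proc/mounts format. If the mount point is
    listed more than once, the last entry wins, since that is the one
    mounted on top.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))
    return found

# Made-up sample in /proc/mounts format; on a real box, read the file itself.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw 0 0
/dev/sdb1 /data ext3 rw,noatime 0 0
"""

opts = mount_options(SAMPLE_MOUNTS, "/data")
print("noatime active on /data:", "noatime" in opts)
```

On the live machine, pass the contents of `/proc/mounts` and the real mount point instead of the sample.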
There's more, but some may apply to only specific scenarios.
Yep.. maybe some of the folks at the WRF will respond to the tuning stuff too, or give me some more in-depth info about what takes place.
Sam