On Mon, 2012-02-13 at 17:06 -0500, Boris Epstein wrote:
On Mon, Feb 13, 2012 at 4:58 PM, Adam Tauno Williams <awilliam@whitemice.org
wrote: On Mon, 2012-02-13 at 16:45 -0500, Boris Epstein wrote:
Hello listmates, I got a rather strange situation that I can't quite make sense of. OK,
I've
got a very large data file to sort (hundreds of millions of lines) and I decided to use MySQL for the purpose. I inserted the stuff into a table easily enough. Then I decided to sort it and got stuck as it turned out that MySQL, unless specifically configured to do otherwise, puts
temporary
files in /tmp which simply was not sufficiently large. Then I changed
that
directory to a partition that had more space (let's call the new temp driectory /home/big-temp) and now as I am running the query aimed at sorting the data it seems like space, according to the df, is no longer being used up under / (which was there /tmp was) but is now being used up in the right partition ( /home, the large one). Yet /home/big-temp is
still
empty! So how is that possible?
Easy. It is using temporary files the *correct* way.
- Open file
- Unlink file
- Use file
- Close file
This means (a) even if the process abends the resources allocated to the file are released and (b) an external process can't see [or modify] the temporary file. When a file is unlinked it remains 'active' until all references to the file are released - but the daemon is still holding a reference [because it is using the file]. There is a file there, but nobody, not even root, can see it. Actually you can; if you look in /proc/{pid#}/fd ...
I haven't thought of it this way. In fact it matches with this description: http://dev.mysql.com/doc/refman/5.1/en/temporary-files.html I can kind of see the advantages; the disadvantages, of course, are that a major transaction can not be resumed in case the mysqld process is stopped.
If the process is stopped the transaction cannot be resumed for a myriad reasons; loss of the temporary file is a trivial concern.