hi all
I have something like 3000 log files in a directory (from data collection). At 4am, cron runs a script of mine that essentially trims every file in that directory back to a certain size. Basically it does a "tail -c XXXX filename > tmp_filename".
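Roughly, the loop is just this (the directory and MAX_BYTES are placeholders for my real values):

    # keep only the last MAX_BYTES bytes of each log file
    for f in /var/log/collected/*.log; do
        tail -c "$MAX_BYTES" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done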
I have noticed that only in the 4:00-4:15am window, while the trim is happening, my other process logs connection attempts but my process (which forks and opens databases) does not respond in time to give data back to the connecting process. Not critically important, as it just tries again and all is OK...
My question is: how do I tell the script to run at a lower priority, perhaps??? Or is there another way to do this I am not aware of?
The trimming of the log files is just so they don't take up a bunch of space on the HD. It doesn't have to be done super fast or anything.
Any ideas I might try? Thanks,
Jerry
Hello Jerry,
On Mon, 2011-06-20 at 11:05 -0400, Jerry Geis wrote:
My question is: how do I tell the script to run at a lower priority, perhaps???
A similar issue came up just a few days ago.
$ man ionice
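For example, the cron entry could start the trim in the idle I/O scheduling class (a sketch; the script path is made up, and adding nice for CPU priority is optional):

    # run the trim at idle I/O priority and low CPU priority
    ionice -c 3 nice -n 19 /usr/local/bin/trim_logs.sh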
Regards, Leonard.
Jerry Geis wrote:
hi all
I have something like 3000 log files in a directory (from data collection). At 4am, cron runs a script of mine that essentially trims every file in that directory back to a certain size. Basically it does a "tail -c XXXX filename > tmp_filename".
I have noticed that only in the 4:00-4:15am window, while the trim is happening, my other process logs connection attempts but my process (which forks and opens databases) does not respond in time to give data back to the connecting process. Not critically important, as it just tries again and all is OK...
<snip> Are you trimming the logfiles sequentially, or in parallel? And unless the files are many megs each, 3k files isn't a huge number these days, so I'd wonder what else is running around that time.
mark
On 6/20/2011 10:31 AM, Kai Schaetzl wrote:
Jerry Geis wrote on Mon, 20 Jun 2011 11:05:34 -0400:
The trimming of the log files is just so they don't take up a bunch of space on the HD.
use logrotate and zip them.
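Something like this in /etc/logrotate.d/ would keep each file small and gzip the old copies (the path and size limit here are just examples):

    /var/log/collected/*.log {
        size 1M
        rotate 2
        compress
        missingok
        notifempty
    }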
Not sure how that helps with the issue of queuing up a whole lot of disk activity at once, with a lot of locking operations (directory and free-space updates). Can you split the files into some reasonable number of subdirectories and process one subdirectory at a time, with a few seconds between them to let the disk queue flush?
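Roughly like this (a sketch; the directory layout, size limit, and pause length are assumptions):

    # trim one subdirectory at a time, pausing so the disk queue can drain
    for dir in /var/log/collected/*/; do
        for f in "$dir"*.log; do
            tail -c "$MAX_BYTES" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
        done
        sleep 5    # let pending writes flush before the next batch
    done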
-- Les Mikesell lesmikesell@gmail.com
On 6/20/11, Jerry Geis geisj@pagestation.com wrote:
I have noticed that only in the 4:00-4:15am window, while the trim is happening, my other process logs connection attempts but my process (which forks and opens databases) does not respond in time to give data back to the connecting process. Not critically important, as it just tries again and all is OK...
I was having similar problems with a script crawling through thousands of small files. As suggested by others here, using ionice appeared to make a world of difference.
On top of that, since it's your own script and speed isn't critical, you could insert a short pause after every x files to give other processes time to run.
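For example (a sketch; the batch size and sleep length are arbitrary):

    # pause briefly after every 100 files trimmed
    count=0
    for f in /var/log/collected/*.log; do
        tail -c "$MAX_BYTES" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
        count=$((count + 1))
        [ $((count % 100)) -eq 0 ] && sleep 2
    done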