Hi
I have an Apache log file of about 7.6G covering half a year of records.
Which program/command (perl, vi, or sed) is best for extracting the data by date, so that I can eventually remove the big file but still keep the records?
When I open it with vi, it uses up the server's memory.
Thank you for your help
On Fri, 2010-01-01 at 08:45 -0800, ann kok wrote:
Hi
I have an Apache log file of about 7.6G covering half a year of records.
Which program/command (perl, vi, or sed) is best for extracting the data by date?
Your question is a little too general. You want to extract what portion of the data, into what?
If you want a complete record, you apparently have one already. If you want to keep everything, just run it through bzip2 before storing it.
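As a minimal sketch of the bzip2 route (access_log is an assumed filename — substitute your actual path):

```shell
# Compress the log for long-term storage; bzip2 typically shrinks
# text logs to a small fraction of their size. -k keeps the original
# so you can verify the archive before deleting the big file.
bzip2 -k access_log

# Read the archive later without uncompressing it to disk:
bzcat access_log.bz2 | less
```

Once you have checked that access_log.bz2 reads back correctly, the original can be removed.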
If you just want reports, look at something like webalizer.
If you want something "custom", you will have to decide exactly what you want. Perl or sed or grep could probably extract it, or you could write a program in python or C or whatever.
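For the "extract by date" part specifically, a sketch: Apache's default access-log timestamp looks like [01/Jan/2010:08:45:00 -0800], so matching on the date portion is enough (access_log is an assumed filename):

```shell
# One day's entries with grep (-F = fixed-string match, no regex needed):
grep -F '01/Jan/2010' access_log > access_log.2010-01-01

# A whole month with sed, printing only matching lines. Both tools
# stream the file, so a 7.6G log never has to fit in memory -
# unlike opening it in vi.
sed -n '/Jan\/2010/p' access_log > access_log.2010-01
```
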
But the first task is to define exactly what you want.
ann kok wrote:
Hi
I have an Apache log file of about 7.6G covering half a year of records.
Which program/command (perl, vi, or sed) is best for extracting the data by date, so that I can eventually remove the big file but still keep the records?
When I open it with vi, it uses up the server's memory.
Thank you for your help
If the extraction is simple regex matches you could use sed, but I'd recommend perl because it has additional features that you might need if the program becomes more complex, and it is easier to write in the first place.
But first I'd check whether any of the available log-processing programs will already do what you want. If you are looking for summaries with counts by page/time interval/client IP, etc., they may do the job. I like analog because it is very fast, can deal with multiple files even if the times overlap, and can uncompress them on the fly (http://www.analog.cx/, or download an rpm from http://www.iddl.vt.edu/~jackie/analog/). Or you might like webalizer if you just have one server/file.
I have written similar programs in Perl (you could use PHP as well): read the file line by line and use a regex to select the records you want to keep.
Happy New Year :-)
On Fri, Jan 1, 2010 at 9:14 AM, Les Mikesell lesmikesell@gmail.com wrote:
ann kok wrote:
Hi
I have an Apache log file of about 7.6G covering half a year of records.
Which program/command (perl, vi, or sed) is best for extracting the data by date, so that I can eventually remove the big file but still keep the records?
When I open it with vi, it uses up the server's memory.
Thank you for your help
If the extraction is simple regex matches you could use sed, but I'd recommend perl because it has additional features that you might need if the program becomes more complex, and it is easier to write in the first place.
But first I'd check whether any of the available log-processing programs will already do what you want. If you are looking for summaries with counts by page/time interval/client IP, etc., they may do the job. I like analog because it is very fast, can deal with multiple files even if the times overlap, and can uncompress them on the fly (http://www.analog.cx/, or download an rpm from http://www.iddl.vt.edu/~jackie/analog/). Or you might like webalizer if you just have one server/file.
--
Les Mikesell
lesmikesell@gmail.com
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos