[CentOS] Optimizing grep, sort, uniq for speed

Sean Carolan

scarolan at gmail.com
Thu Jun 28 18:30:33 UTC 2012


This snippet builds an array of unique hostnames from some log files.
It has to parse around 3 GB of logs, so I'm keen on making it as
efficient as possible.  Can you think of any way to optimize this to
run faster?

HOSTS=()
# Extract every .com hostname, de-duplicate, and append each one to HOSTS.
for host in $(grep -h -o "[-.0-9a-z][-.0-9a-z]*\.com" "${TMPDIR}"/* |
sort | uniq); do
    HOSTS+=("$host")
done
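
For comparison, here is a rough sketch of one way the same pipeline might
be tightened. It rests on a few assumptions: the logs are plain ASCII (so
forcing the C locale does not change what matches), a literal dot before
"com" was intended (hence the escaped \.), and byte-order sorting is
acceptable. The function name collect_hosts is made up for illustration.
The C locale often speeds up both grep and sort, and sort -u saves the
separate uniq process, but the actual gain on 3 GB of logs would need to
be measured.

```shell
# Hypothetical helper: print the unique .com hostnames found in all
# files directly under the directory given as $1, one per line.
collect_hosts() {
    local log_dir=$1
    # LC_ALL=C: bytewise matching and comparison, no multibyte handling.
    # sort -u: sort and drop duplicates in a single process.
    LC_ALL=C grep -h -o '[-.0-9a-z][-.0-9a-z]*\.com' "$log_dir"/* |
        LC_ALL=C sort -u
}
```

On bash 4 or later the array could then be filled without a loop:

    mapfile -t HOSTS < <(collect_hosts "$TMPDIR")

On older bash versions the original for loop around collect_hosts
should still work.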


