[CentOS] Optimizing grep, sort, uniq for speed

Thu Jun 28 18:30:33 UTC 2012

This snippet of code pulls an array of hostnames from some log files.
It has to parse around 3GB of log files, so I'm keen on making it as
efficient as possible.  Can you think of any way to optimize this to
run faster?

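HOSTS=()
# Extract every ".com" hostname, de-duplicate, and collect into an array.
# The dot before "com" is escaped so it only matches a literal dot.
for host in $(grep -h -o "[-.0-9a-z][-.0-9a-z]*\.com" "${TMPDIR}"/* |
sort | uniq); do
    HOSTS+=("$host")
done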
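
One possible direction, offered as a sketch rather than a measured
result: forcing the C locale for grep and sort avoids locale-aware
matching and collation, which is often a large part of the run time in
UTF-8 locales, and "sort -u" drops the separate uniq process.  The
array can also be filled directly, without the for loop.  The function
name collect_hosts is made up for illustration, and the direct array
assignment assumes the matches never contain whitespace or glob
characters, which the pattern guarantees.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill the global HOSTS array with the unique
# ".com" hostnames found in the files directly under directory $1.
collect_hosts() {
    local dir=$1
    # LC_ALL=C makes grep and sort work on raw bytes instead of using
    # locale-aware rules; sort -u replaces the "sort | uniq" pair.
    # Word splitting of the output is safe here because the pattern
    # only matches the characters - . 0-9 a-z.
    HOSTS=( $(LC_ALL=C grep -h -o '[-.0-9a-z][-.0-9a-z]*\.com' "$dir"/* |
              LC_ALL=C sort -u) )
}
```

It would be called as: collect_hosts "$TMPDIR".  Whether this is
actually faster on 3GB of logs would need to be timed on the real data.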


Sean Carolan