On 09/25/11 12:18 PM, Dotan Cohen wrote:
> On Sun, Sep 25, 2011 at 22:06, John R Pierce <pierce at hogranch.com> wrote:
>>> Is there a way to get the most common (unique) lines of the file?
>>
>> sort -k 3 | uniq -f 2
>>
>> which will sort starting at field 3, and then print lines that are
>> unique, skipping the first 2 fields, where fields by default are blank
>> separated.
>>
> Thanks, John. This looks to me like it will sort alphabetically, not
> by commonness. For instance:
>
> ERROR b
> ERROR a
> ERROR b
>
> Since "ERROR b" was reported more often than "ERROR a", I would prefer
> that the output be:
>
> ERROR b
> ERROR a
>
> I'm sorry for not making that so clear! Is there a good word for "most
> common" or "used most often" that would be concise in this context?

uniq can count occurrences, but that will require two sorts: one to get
all similar errors adjacent, the other to order them by count. Instead
of using field selects, let's just clip the timestamps off up front...

cut -c 17- | sort | uniq -c | sort -rn

(17- means from char 17 on... I may have miscounted)

--
john r pierce                            N 37, W 122
santa cruz ca                             mid-left coast
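As a quick sketch of how that pipeline behaves, using a hypothetical log
file (errors.log here) and assuming syslog-style timestamps that are 16
characters wide, so cut -c 17- drops them:

    $ cat errors.log     # hypothetical input; timestamp width assumed
    Sep 25 12:18:01 ERROR b
    Sep 25 12:18:02 ERROR a
    Sep 25 12:18:03 ERROR b

    $ cut -c 17- errors.log | sort | uniq -c | sort -rn
          2 ERROR b
          1 ERROR a

The first sort puts identical messages next to each other so uniq -c can
count them; the final sort -rn then orders the lines from most to least
frequent, which is the output Dotan asked for.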