m.roth@5-cent.us wrote:
Need some scripting help to sort a list and print all the duplicate lines.
My data looks something like this:
host6:dev406mum.dd.mum.test.com:22:11:11:no
host7:dev258mum.dd.mum.test.com:36:17:19:no
host7:dev258mum.dd.mum.test.com:36:17:19:no
host17:dev258mum.dd.mum.test.com:31:17:19:no
host12:dev258mum.dd.mum.test.com:41:17:19:no
host2:dev258mum.dd.mum.test.com:36:17:19:no
host4:dev258mum.dd.mum.test.com:41:17:19:no
host4:dev258mum.dd.mum.test.com:45:17:19:no
host4:dev258mum.dd.mum.test.com:36:17:19:no
I need to sort this list and print all the lines where column 3 has a duplicate entry. That is, the whole line should be printed whenever the value in column 3 occurs more than once.
I tried using a combination of "sort" and "uniq" but was not successful.
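(uniq only compares whole lines, and GNU uniq's -f option skips blank-delimited fields, not colon-delimited ones, which is probably why sort | uniq alone didn't work. One sketch that does, assuming the data sits in a file called hostlist and every record has at least one field after column 3:

    cut -d: -f3 hostlist | sort | uniq -d |
    while read k; do
        grep "^[^:]*:[^:]*:$k:" hostlist
    done

uniq -d lists each duplicated third field once, and the grep then pulls back every whole line whose third field matches, already grouped by key.)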
list.awk:

BEGIN { FS = ":"; }
{
    if ($3 == last) {
        print $0;
    }
    last = $3;
}

sort <file> | awk -f list.awk
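Note that for the duplicates to land next to each other, sort has to key on the third field; a plain sort compares whole lines, so records that share field 3 but differ in field 1 won't end up adjacent. With the standard -t and -k options:

    sort -t: -k3,3 <file> | awk -f list.awk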
mark "*how* long an awk script would you like?"
This doesn't print the first line of each run of duplicates. Also, the question wasn't clear as to whether every line with a matching 3rd field should be printed, or only lines where the other fields matched as well (though the sort options could control that).
Oh, sorry:

BEGIN { FS = ":"; }
{
    if ($3 == last) {
        # same key as the previous line: emit the held-back
        # first line of the run once, then this line
        if (first == 0) {
            print saved;
            first++;
        }
        print $0;
    } else {
        # new key: remember this line in case it turns out
        # to be the first of a run of duplicates
        first = 0;
        last = $3;
        saved = $0;
    }
}
mark "did I mention that I've written 100 -200 line awk scripts?"