> m.roth at 5-cent.us wrote:
>>> Need scripting help to sort a list and print all the duplicate
>>> lines.
>>>
>>> My data looks something like this:
>>>
>>> host6:dev406mum.dd.mum.test.com:22:11:11:no
>>> host7:dev258mum.dd.mum.test.com:36:17:19:no
>>> host7:dev258mum.dd.mum.test.com:36:17:19:no
>>> host17:dev258mum.dd.mum.test.com:31:17:19:no
>>> host12:dev258mum.dd.mum.test.com:41:17:19:no
>>> host2:dev258mum.dd.mum.test.com:36:17:19:no
>>> host4:dev258mum.dd.mum.test.com:41:17:19:no
>>> host4:dev258mum.dd.mum.test.com:45:17:19:no
>>> host4:dev258mum.dd.mum.test.com:36:17:19:no
>>>
>>> I need to sort this list and print all the lines where column 3 has a
>>> duplicate entry.
>>>
>>> I need to print the whole line if a duplicate entry exists in column 3.
>>>
>>> I tried using a combination of "sort" and "uniq" but was not
>>> successful.
>>
>> list.awk:
>>
>> BEGIN {
>>     FS = ":";
>> }
>> {
>>     if ( $3 == last ) {
>>         print $0;
>>     }
>>     last = $3;
>> }
>>
>> sort -t: -k3,3 <file> | awk -f list.awk
>>
>> mark "*how* long an awk script would you like?"
>
> This doesn't print the first of the duplicates. Also, the question
> wasn't clear as to whether every line with a matching 3rd field should
> be printed, or only those where the other fields matched as well
> (though the sort options could control that).

Oh, sorry:

BEGIN {
    FS = ":";
}
{
    if ( $3 == last ) {
        if ( first == 0 ) {
            print saved;
            first++;
        }
        print $0;
    } else {
        first = 0;
        last = $3;
        saved = $0;
    }
}

mark "did I mention that I've written 100-200 line awk scripts?"
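For completeness, a sortless alternative, a minimal sketch not from the thread itself: reading the data twice lets awk print every line whose third field occurs more than once, first occurrence included ("file" below is a placeholder for the data file).

    # Pass 1 (NR==FNR): count how often each column-3 value appears.
    # Pass 2: print every line whose column-3 value appeared more than once.
    awk -F: 'NR==FNR { count[$3]++; next } count[$3] > 1' file file

This preserves the input order and does not depend on duplicates being adjacent, so no sort is needed at all; by contrast, a plain whole-line sort would leave lines with the same column 3 but different hostnames non-adjacent, which is why the pipeline above sorts on the third field specifically.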