[CentOS] Scripting help please....

Wed Oct 28 17:41:22 UTC 2009



>
>From: Truejack <truejack at gmail.com>
>To: centos at centos.org
>Sent: Wed, October 28, 2009 6:09:41 PM
>Subject: [CentOS] Scripting help please....
>
>Need some scripting help to sort a list and print all the duplicate lines.
>
>My data looks something like this:
>
>host6:dev406mum.dd.mum.test.com:22:11:11:no
>host7:dev258mum.dd.mum.test.com:36:17:19:no
>host7:dev258mum.dd.mum.test.com:36:17:19:no
>host17:dev258mum.dd.mum.test.com:31:17:19:no
>host12:dev258mum.dd.mum.test.com:41:17:19:no
>host2:dev258mum.dd.mum.test.com:36:17:19:no
>host4:dev258mum.dd.mum.test.com:41:17:19:no
>host4:dev258mum.dd.mum.test.com:45:17:19:no
>host4:dev258mum.dd.mum.test.com:36:17:19:no
>
>I need to sort this list and print all the lines where column 3 has a duplicate entry.
>
>I need to print the whole line, if a duplicate entry exists in column 3.
>
>I tried using a combination of "sort" and "uniq" but was not successful.
>
>

A quick and dirty example (it prints only the repeated lines, i.e. the second and later occurrences of each column 3 value, not the first occurrence):
awk -F: '{ v[$3]++; if (v[$3] > 1) print $0 }' datafile
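If the first occurrence should be printed as well, one possible approach is to read the file twice: count the column 3 values on the first pass, then print every line whose value was counted more than once on the second. This is only a sketch; the temporary file and the variable names (datafile, dups, count) are illustrative assumptions, and the sample rows are the ones from the question with the mail quote markers stripped.

```shell
# Sample data from the question (quote markers removed), written to a temp file.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
host6:dev406mum.dd.mum.test.com:22:11:11:no
host7:dev258mum.dd.mum.test.com:36:17:19:no
host7:dev258mum.dd.mum.test.com:36:17:19:no
host17:dev258mum.dd.mum.test.com:31:17:19:no
host12:dev258mum.dd.mum.test.com:41:17:19:no
host2:dev258mum.dd.mum.test.com:36:17:19:no
host4:dev258mum.dd.mum.test.com:41:17:19:no
host4:dev258mum.dd.mum.test.com:45:17:19:no
host4:dev258mum.dd.mum.test.com:36:17:19:no
EOF

# Pass 1 (NR==FNR is true only while reading the first file argument):
#   count how many times each column 3 value occurs.
# Pass 2 (same file given again):
#   print every line whose column 3 value occurred more than once.
dups=$(awk -F: 'NR==FNR { count[$3]++; next } count[$3] > 1' "$datafile" "$datafile")

# Sort numerically on column 3 so lines sharing a value end up next to each other.
printf '%s\n' "$dups" | sort -t: -k3,3n

rm -f "$datafile"
```

The file is named twice on the awk command line on purpose; that is what gives the two passes. For data arriving on a pipe, the lines would have to be buffered in an array instead.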

JD