I think this can be optimized, but if the programming language doesn't matter:

#!/usr/bin/python
# Print every line of test.txt whose third colon-separated field
# appears on more than one line.

filename = "test.txt"
with open(filename, 'r') as fl:
    toParse = fl.readlines()

duplicates = []   # column 3 values seen more than once
firstOne = []     # column 3 values seen at least once

# First pass: collect the column 3 values that repeat.
for ln in toParse:
    lnMap = ln.strip().split(':')
    target = lnMap[2]
    if target in firstOne:
        if target not in duplicates:
            duplicates.append(target)
    else:
        firstOne.append(target)

# Second pass: print each line whose column 3 value is a duplicate.
for ln in toParse:
    ln = ln.strip()
    target = ln.split(':')[2]
    if target in duplicates:
        print(ln)

On Wed, Oct 28, 2009 at 7:09 PM, Truejack <truejack at gmail.com> wrote:
> Need a scripting help to sort out a list and list all the duplicate lines.
>
> My data looks something like this
>
> host6:dev406mum.dd.mum.test.com:22:11:11:no
> host7:dev258mum.dd.mum.test.com:36:17:19:no
> host7:dev258mum.dd.mum.test.com:36:17:19:no
> host17:dev258mum.dd.mum.test.com:31:17:19:no
> host12:dev258mum.dd.mum.test.com:41:17:19:no
> host2:dev258mum.dd.mum.test.com:36:17:19:no
> host4:dev258mum.dd.mum.test.com:41:17:19:no
> host4:dev258mum.dd.mum.test.com:45:17:19:no
> host4:dev258mum.dd.mum.test.com:36:17:19:no
>
> I need to sort this list and print all the lines where column 3 has a
> duplicate entry.
>
> I need to print the whole line, if a duplicate entry exists in column 3.
>
> I tried using a combination of "sort" and "uniq" but was not successful.
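The script above does not sort its output, and its list lookups get slower as the file grows. As one possible optimization, here is a minimal sketch that counts the column 3 values with collections.Counter and then sorts the matching lines. The function name duplicate_lines, its parameters, and the SAMPLE list are my own for illustration; it also assumes column 3 is always an integer and that every non-blank line has at least three fields.

```python
from collections import Counter

# Sample rows copied from the original post (illustration only).
SAMPLE = [
    "host6:dev406mum.dd.mum.test.com:22:11:11:no",
    "host7:dev258mum.dd.mum.test.com:36:17:19:no",
    "host7:dev258mum.dd.mum.test.com:36:17:19:no",
    "host17:dev258mum.dd.mum.test.com:31:17:19:no",
    "host12:dev258mum.dd.mum.test.com:41:17:19:no",
    "host2:dev258mum.dd.mum.test.com:36:17:19:no",
    "host4:dev258mum.dd.mum.test.com:41:17:19:no",
    "host4:dev258mum.dd.mum.test.com:45:17:19:no",
    "host4:dev258mum.dd.mum.test.com:36:17:19:no",
]


def duplicate_lines(lines, column=2, sep=":"):
    """Return lines whose `column` value occurs more than once, sorted by it.

    Assumes the field is numeric and present on every non-blank line.
    """
    # Drop surrounding whitespace and skip blank lines.
    rows = [ln.strip() for ln in lines if ln.strip()]
    # Count how often each value appears in the chosen column.
    counts = Counter(row.split(sep)[column] for row in rows)
    # Keep only rows whose value was seen at least twice.
    dupes = [row for row in rows if counts[row.split(sep)[column]] > 1]
    # sorted() is stable, so rows with equal keys keep their input order.
    return sorted(dupes, key=lambda row: int(row.split(sep)[column]))


if __name__ == "__main__":
    for line in duplicate_lines(SAMPLE):
        print(line)
```

To run it against a file instead of the sample, pass the open file object, e.g. duplicate_lines(open("test.txt")). If column 3 can hold non-numeric values, drop the int() call and sort on the string instead.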