I think it can be optimized, and if the programming language doesn't matter:

#!/usr/bin/python
fname = "test.txt"
fl = open(fname, 'r')
toParse = fl.readlines()
fl.close()

duplicates = []
seen = []
# First pass: collect the column-3 values that occur more than once.
for ln in toParse:
    target = ln.strip().split(':')[2]
    if target in seen:
        if target not in duplicates:
            duplicates.append(target)
    else:
        seen.append(target)

# Second pass: print every line whose column-3 value is duplicated.
for ln in toParse:
    ln = ln.strip()
    if ln.split(':')[2] in duplicates:
        print(ln)
On Wed, Oct 28, 2009 at 7:09 PM, Truejack truejack@gmail.com wrote:
Need some scripting help to sort a list and print all the duplicate lines.
My data looks something like this:
host6:dev406mum.dd.mum.test.com:22:11:11:no
host7:dev258mum.dd.mum.test.com:36:17:19:no
host7:dev258mum.dd.mum.test.com:36:17:19:no
host17:dev258mum.dd.mum.test.com:31:17:19:no
host12:dev258mum.dd.mum.test.com:41:17:19:no
host2:dev258mum.dd.mum.test.com:36:17:19:no
host4:dev258mum.dd.mum.test.com:41:17:19:no
host4:dev258mum.dd.mum.test.com:45:17:19:no
host4:dev258mum.dd.mum.test.com:36:17:19:no
I need to sort this list and print all the lines where column 3 has a duplicate entry.
I need to print the whole line if a duplicate entry exists in column 3.
I tried using a combination of "sort" and "uniq" but was not successful.