[CentOS] Scripting help please....

Wed Oct 28 18:05:16 UTC 2009

If the programming language doesn't matter, here is a Python version; I think it can still be optimized:
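#!/usr/bin/python

# Print every line of test.txt whose third colon-separated field
# occurs more than once in the file.
filename = "test.txt"
with open(filename, 'r') as fl:
    toParse = fl.readlines()

duplicates = set()   # third-field values seen at least twice
firstOne = set()     # third-field values seen at least once
for ln in toParse:
    lnMap = ln.strip().split(':')
    if len(lnMap) < 3:
        continue     # skip blank or malformed lines
    target = lnMap[2]
    if target in firstOne:
        duplicates.add(target)
    else:
        firstOne.add(target)

# Second pass: print the lines whose third field is duplicated.
for ln in toParse:
    ln = ln.strip()
    lnMap = ln.split(':')
    if len(lnMap) >= 3 and lnMap[2] in duplicates:
        print(ln)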
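
The script above prints matching lines in file order and does not sort them, although the request also asked for sorting. A minimal sketch of one way to do both, assuming Python 3 and that a plain text sort on column 3 is good enough (the function name duplicate_lines and its parameters are my own, not part of the script above):

```python
from collections import Counter


def duplicate_lines(lines, field=2, sep=":"):
    """Return lines whose column `field` (0-based) occurs more than once,
    ordered by that column's text value."""
    rows = []
    for raw in lines:
        line = raw.strip()
        parts = line.split(sep)
        if len(parts) > field:          # skip blank or short lines
            rows.append((parts[field], line))

    # Count how often each column value appears.
    counts = Counter(key for key, _ in rows)
    dupes = [(key, line) for key, line in rows if counts[key] > 1]

    # list.sort() is stable, so lines sharing a value keep their input order.
    # Note this compares strings, not numbers.
    dupes.sort(key=lambda pair: pair[0])
    return [line for _, line in dupes]


if __name__ == "__main__":
    sample = [
        "host6:dev406mum.dd.mum.test.com:22:11:11:no",
        "host7:dev258mum.dd.mum.test.com:36:17:19:no",
        "host12:dev258mum.dd.mum.test.com:41:17:19:no",
        "host2:dev258mum.dd.mum.test.com:36:17:19:no",
        "host4:dev258mum.dd.mum.test.com:41:17:19:no",
    ]
    for line in duplicate_lines(sample):
        print(line)
```

Counting with a Counter keeps the lookup cheap on large files. If column 3 has to sort numerically, the sort key would need an int() conversion, which assumes every value in that column is a number.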

On Wed, Oct 28, 2009 at 7:09 PM, Truejack <truejack at gmail.com> wrote:

> I need scripting help to sort a list and print all the duplicate lines.
>
> My data looks something like this:
>
> host6:dev406mum.dd.mum.test.com:22:11:11:no
> host7:dev258mum.dd.mum.test.com:36:17:19:no
> host7:dev258mum.dd.mum.test.com:36:17:19:no
> host17:dev258mum.dd.mum.test.com:31:17:19:no
> host12:dev258mum.dd.mum.test.com:41:17:19:no
> host2:dev258mum.dd.mum.test.com:36:17:19:no
> host4:dev258mum.dd.mum.test.com:41:17:19:no
> host4:dev258mum.dd.mum.test.com:45:17:19:no
> host4:dev258mum.dd.mum.test.com:36:17:19:no
>
> I need to sort this list and print all the lines where column 3 has a
> duplicate entry.
>
> I need to print the whole line, if a duplicate entry exists in column 3.
>
> I tried using a combination of "sort" and "uniq" but was not successful.