On Wed, Oct 28, 2009 at 10:39:41PM +0530, Truejack wrote:
>
> Need a scripting help to sort out a list and list all the duplicate lines.
>
> My data looks somethings like this
>
> host6:dev406mum.dd.mum.test.com:22:11:11:no
> host7:dev258mum.dd.mum.test.com:36:17:19:no

A key to your answer is the --all-repeated option for uniq on a sorted
file.  I call this "find-duplicates" -- this post makes it GPL.

#! /bin/bash
#SIZER=' -size +10240k'
SIZER=' -size +0'
#SIZER=""
DIRLIST=". "

# Checksum every matching file; sorting puts identical checksums
# on adjacent lines so uniq can spot them.
find $DIRLIST -type f $SIZER -print0 | xargs -0 md5sum |\
    sort > /tmp/looking4duplicates

tput bel; sleep 2
# An md5sum is 32 hex characters, so compare only those columns:
# any repeated checksum means duplicate file contents.
cat /tmp/looking4duplicates |
    uniq --check-chars=32 --all-repeated=prepend | less
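Note the script above hunts duplicate *files* by checksum; for the original
question (duplicate *lines* in a data file) the same sort-then-uniq idea
applies directly. A minimal sketch, with sample.txt standing in for the
real data file (the filename is just for illustration):

```shell
# Build a small sample like the data in the question, with one
# deliberately repeated line.
cat > sample.txt <<'EOF'
host6:dev406mum.dd.mum.test.com:22:11:11:no
host7:dev258mum.dd.mum.test.com:36:17:19:no
host6:dev406mum.dd.mum.test.com:22:11:11:no
EOF

# Sort so identical lines become adjacent, then print every line
# that occurs more than once, with a blank line before each group.
sort sample.txt | uniq --all-repeated=prepend
```

Only the repeated host6 line is printed (twice, as a group); the
unique host7 line is suppressed.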