[CentOS] Need help in writing a shell/bash script

Fri Dec 30 16:04:49 UTC 2011

On 12/30/2011 09:00 PM, ankush grover wrote:
> Hi Friends,
>
> I am trying to write a shell script which can merge the 2 columns into
> 3rd one on Centos 5. The file is very long around 31200 rows having
> around 1370 unique groups and around 12000 unique user-names.
> The 1st column is the groupname and then 2nd column is the user-name.
>
> 1st Column (Groupname)            2nd Column (username)
>                  admin                      ankush
>                  admin                       amit
>                  powerusers               dinesh
>                  powerusers               jitendra
>
>
>
>
> The desired output should be like this
>
> admin:   ankush, amit
> powerusers:  dinesh, jitendra
>
>
> There are commands available but not able to use it properly to get
> the desired output. Please help me

Hi Ankush,

This will do what you want, but please read the comments in the code.
As a side note, this sort of thing is far more natural in a database 
such as Postgres, and that will become more apparent as the file 
grows. In particular, appending tens of thousands of names to a single 
line in a file is a little crazy: most text editors start to choke when 
displaying a line that long without a \n somewhere to break it up.

#######BEGIN collator.sh
#! /bin/bash
#
# collator.sh
#
# Invocation:
#   If executable and in $PATH (~/bin is a good idea):
#       collator.sh input-filename output-filename
#   If not executable, not in $PATH, but in present working directory:
#       bash ./collator.sh input-filename output-filename
#
# WARNING: There is NO serious attempt at error checking implemented.
#  This means you should check the contents of OUTFILE before
#  using it for anything important.

INFILE=${1:?"Input filename missing, please read script comments."}
OUTFILE=${2:?"Output filename missing, please read script comments."}

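# One "group: " stub per group; sort -u because uniq only drops adjacent duplicates.
awk '{print $1 ": "}' "$INFILE" | sort -u > "$OUTFILE"
for GROUP in $(cut -d ':' -f 1 "$OUTFILE")
     # Compare the group column exactly; a plain grep would also match substrings.
     do for NAME in $(awk -v g="$GROUP" '$1 == g {print $2}' "$INFILE")
         # Append the name to the end of its group's line, keeping input order.
         do sed -i "s/^$GROUP: .*/&$NAME, /" "$OUTFILE"
     done
done
# Strip the trailing ", " left after the last name on each line.
sed -i 's/, $//' "$OUTFILE"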
#######END collator.sh
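
For what it's worth, the loop above rereads the input once per group and 
rewrites the output once per name, which may get slow at 31200 rows. A 
rough sketch of a single-pass alternative is below. It assumes 
whitespace-separated columns with no spaces inside group or user names, 
and the function name collate is just a placeholder of mine, not part of 
collator.sh.

```shell
#!/bin/bash
# collate FILE: print one "group: name1, name2" line per group.
# Hypothetical helper for illustration; assumes column 1 is the group
# and column 2 is the user-name, separated by whitespace.
collate() {
    awk '
        NF >= 2 {
            if (!($1 in members)) {
                order[++n] = $1          # remember groups in first-seen order
                members[$1] = $2         # first name for this group
            } else {
                members[$1] = members[$1] ", " $2   # append later names
            }
        }
        END {
            # Emit the groups in the order they first appeared in the input.
            for (i = 1; i <= n; i++)
                print order[i] ": " members[order[i]]
        }
    ' "$1"
}
```

It would be used as: collate input-filename > output-filename. Because 
the names are collected in an awk array, the groups do not need to be 
sorted or adjacent in the input, and there is no trailing comma to clean 
up afterwards.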