On Wed, Oct 25, 2017 at 10:47:12AM -0600, Warren Young wrote:
On Oct 25, 2017, at 10:02 AM, Mark Haney mark.haney@neonova.net wrote:
I have a file with two columns 'email' and 'total' like this:
me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30
I need to get the total number of messages for each email address.
This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)
That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.
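If you are not sure which Bash a given box has, a guard along these lines may help. This is only a sketch: the function name have_assoc_arrays is made up for illustration, and it assumes the script is run by bash itself, since BASH_VERSINFO is a Bash-only variable.

```shell
#!/bin/bash
# Sketch: report whether the running Bash is new enough for "declare -A".
# BASH_VERSINFO[0] is the major version of the running shell;
# have_assoc_arrays is a hypothetical helper name, not part of Bash.
have_assoc_arrays() {
    [ "${BASH_VERSINFO[0]:-0}" -ge 4 ]
}

if have_assoc_arrays; then
    echo "bash ${BASH_VERSION}: declare -A is available"
else
    echo "bash 4 or later is needed for declare -A" >&2
fi
```

A real script could simply put "have_assoc_arrays || exit 1" near the top.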
#!/bin/bash
declare -A totals

while read -r line
do
    # $'\t ' is a real tab plus a space; "\t " would be a literal backslash and t
    IFS=$'\t ' read -r -a elems <<< "$line"
    email=${elems[0]}
    subtotal=${elems[1]}

    declare -i n=${totals[$email]}
    n=n+$subtotal
    totals[$email]=$n
done < stats

for k in "${!totals[@]}"
do
    printf "%6d %s\n" "${totals[$k]}" "$k"
done
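To try the idea against the sample data from the question without creating a stats file, something like the following should do. It is a sketch rather than a drop-in replacement: sum_by_email is a hypothetical function name, input is read from stdin, and the output is piped through sort only because the order of keys in an associative array is unspecified.

```shell
#!/bin/bash
# Sketch: the same associative-array technique wrapped in a function.
# sum_by_email is a hypothetical name; it reads "email count" pairs from stdin.
sum_by_email() {
    local -A totals=()
    local email subtotal k
    while read -r email subtotal; do
        [ -n "$email" ] || continue      # skip blank lines
        totals[$email]=$(( ${totals[$email]:-0} + subtotal ))
    done
    for k in "${!totals[@]}"; do
        printf '%6d %s\n' "${totals[$k]}" "$k"
    done | sort -k2                      # key order is unspecified, so sort by address
}

# Sample data from the original question
sum_by_email <<'EOF'
me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30
EOF
```

With that input it prints:

    60 me@example.com
   130 you@domain.com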
Here is a slightly different approach, written for ksh, that also seems to work with bash 4.
typeset -A arr

while read -r addr cnt
do
    arr[$addr]=$(( ${arr[$addr]:-0} + cnt ))
done < "$1"

for a in "${!arr[@]}"
do
    printf "%6d %s\n" "${arr[$a]}" "$a"
done
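The ${arr[$addr]:-0} part is what lets the first sighting of an address start from zero. A small sketch of just that expansion, using a sample address from the question and counts chosen for illustration:

```shell
#!/bin/bash
# Sketch of the ${name:-default} expansion used in the loop above.
typeset -A arr                  # associative array (bash 4+; typeset -A is also ksh93 syntax)
key=me@example.com              # sample address from the question

first=${arr[$key]:-0}           # key not set yet, so the default 0 is used
arr[$key]=$(( ${arr[$key]:-0} + 20 ))
arr[$key]=$(( ${arr[$key]:-0} + 40 ))

echo "before: $first, after: ${arr[$key]}"
```

This prints "before: 0, after: 60".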
Jon