On Oct 25, 2017, at 10:02 AM, Mark Haney mark.haney@neonova.net wrote:
I have a file with two columns 'email' and 'total' like this:
me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30
I need to get the total number of messages for each email address.
This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)
Associative arrays need Bash 4.0 or later, which limits you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as it ships Bash 3, which lacks this feature.
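If you are unsure which Bash a given box has, a quick check along these lines can tell you before the script fails on declare -A. This is only a sketch; has_assoc_arrays is a made-up helper name, and it assumes the script is run by Bash rather than plain sh.

```shell
#!/bin/bash
# Sketch: detect whether the running Bash supports associative arrays.
# has_assoc_arrays is a hypothetical helper name, not a Bash builtin.
has_assoc_arrays() {
    # BASH_VERSINFO[0] holds the major version number of the running shell
    (( BASH_VERSINFO[0] >= 4 ))
}

if has_assoc_arrays; then
    echo "Bash $BASH_VERSION: associative arrays available"
else
    echo "Bash $BASH_VERSION: too old for declare -A" >&2
fi
```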
#!/bin/bash
declare -A totals

while read -r line
do
    # $'\t ' is a real tab plus a space; a plain "\t " would split on backslash and 't'
    IFS=$'\t ' read -r -a elems <<< "$line"
    email=${elems[0]}
    subtotal=${elems[1]}

    declare -i n=${totals[$email]}
    n=n+$subtotal
    totals[$email]=$n
done < stats

for k in "${!totals[@]}"
do
    printf "%6d %s\n" "${totals[$k]}" "$k"
done
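For what it's worth, the same idea can be written a little more compactly by letting read split the two fields itself. The following is a sketch rather than a drop-in replacement: sum_totals is a hypothetical function name, it reads from stdin instead of a file called stats, and it assumes every non-blank line has exactly an address and an integer.

```shell
#!/bin/bash
# Sketch: sum the second column per address in the first column.
# sum_totals is a hypothetical helper name, not from the original script.
sum_totals() {
    local -A totals=()
    local email subtotal k
    # read splits each line on whitespace straight into the two fields
    while read -r email subtotal; do
        [[ -n $email ]] || continue      # skip blank lines
        totals[$email]=$(( ${totals[$email]:-0} + subtotal ))
    done
    for k in "${!totals[@]}"; do
        printf '%6d %s\n' "${totals[$k]}" "$k"
    done
}

# Sample data from the question. The order in which Bash walks an
# associative array is unspecified, so sort the result by address.
sum_totals <<'EOF' | sort -k2
me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30
EOF
```

With the sample data this should report 60 for me@example.com and 130 for you@domain.com. To run it against a real file, something like "sum_totals < stats" would do.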
You’re making things hard on yourself by insisting on Bash, by the way. This solution is better expressed in Perl, Python, Ruby, Lua, JavaScript…probably dozens of languages.
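As one illustration of that point, here is a sketch using awk, which is not on the list above but is normally installed wherever Bash is. The wrapper name sum_totals_awk is made up for this example, and it assumes whitespace-separated input on stdin.

```shell
#!/bin/sh
# Sketch: the same per-address sum expressed in awk.
# sum_totals_awk is a hypothetical helper name; it reads from stdin.
sum_totals_awk() {
    # awk arrays are associative by default, so no declaration is needed.
    # NF >= 2 skips blank or short lines.
    awk 'NF >= 2 { totals[$1] += $2 }
         END { for (k in totals) printf "%6d %s\n", totals[k], k }'
}
```

Usage would be along the lines of "sum_totals_awk < stats". As with the Bash version, the output order is unspecified, so pipe it through sort if the order matters.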