I know this is for CentOS stuff, but I'm at a loss on how to build a script that does what I need it to do. It's probably really logically simple, I'm just not seeing it. Hopefully someone will take pity on me and at least give me a big hint.
I have a file with two columns 'email' and 'total' like this:
    me@example.com 20
    me@example.com 40
    you@domain.com 100
    you@domain.com 30
I need to get the total number of messages for each email address. This type of code has always been the hardest for me for whatever reason, and honestly, I don't write many scripts these days. I'm struggling to get pseudocode that works, much less a working script. I know this is off topic, and if it gets modded out, that's fine. I just can't wrap my brain around it.
On Wed, Oct 25, 2017 at 9:02 AM, Mark Haney mark.haney@neonova.net wrote:
<snip>
here is a python solution
    #!/usr/bin/python
    # python 2 (did not check if it works)
    f = open('yourfilename')
    D = {}
    for line in f:
        email, num = line.split()
        if email in D:
            D[email] = D[email] + num
        else:
            D[email] = num
    f.close()
    for key in D:
        print key, D[key]
On 10/25/2017 12:33 PM, Robert Arkiletian wrote:
<snip>
That gets me closer, I think. It's concatenating the number of messages, but it's a start. Thanks.
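The concatenation comes from `line.split()` returning strings, so `+` joins '20' and '40' into '2040' instead of adding them. A minimal sketch of the same loop with the counts converted to integers, written for Python 3 rather than the Python 2 of the script in this thread. The function name `sum_by_email` and the inline sample list are made up for illustration; the data is the sample from the original question.

```python
def sum_by_email(lines):
    """Return a dict mapping each address to the sum of its counts."""
    totals = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        email, num = fields
        if email in totals:
            # int() makes + add the numbers instead of joining strings
            totals[email] = totals[email] + int(num)
        else:
            totals[email] = int(num)
    return totals


sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

totals = sum_by_email(sample)
for email in totals:
    print(email, totals[email])
```

An open file object can be passed in place of the list, since iterating over a file also yields one line at a time.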
On 10/25/2017 12:41 PM, Mark Haney wrote:
<snip>
I do this kind of thing on a fairly regular basis with a Perl one-liner:
perl -ne '($email, $num) = split; $tot{$email} += $num; END { for $email (keys %tot) { print "$email $tot{$email}\n" } }' < yourfile
On Wed, Oct 25, 2017 at 9:41 AM, Mark Haney mark.haney@neonova.net wrote:
On 10/25/2017 12:33 PM, Robert Arkiletian wrote:
here is a python solution

    #!/usr/bin/python
    # python 2 (did not check if it works)
    f = open('yourfilename')
    D = {}
    for line in f:
        email, num = line.split()
        if email in D:
            D[email] = D[email] + int(num)
        else:
            D[email] = int(num)
    f.close()
    for key in D:
        print key, D[key]

That gets me closer, I think. It's concatenating the number of messages, but it's a start. Thanks.
--
Mark Haney Network Engineer at NeoNova 919-460-3330 option 1 mark.haney@neonova.net www.neonova.net
CentOS mailing list CentOS@centos.org https://lists.centos.org/mailman/listinfo/centos
On Wed, Oct 25, 2017 at 9:59 AM, Robert Arkiletian robark@gmail.com wrote:
<snip>
not to be outdone, python can sort them based on the totals
    for k in sorted(D, key=d.get, reverse=True):
        print k, D[k]
On Wed, Oct 25, 2017 at 12:08 PM, Robert Arkiletian robark@gmail.com wrote:
<snip>
not to be outdone, python can sort them based on the totals
for k in sorted(D, key=d.get, reverse=True):
oops. that should be a capital D in D.get:

    for k in sorted(D, key=D.get, reverse=True):
        print k, D[k]
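For anyone following along in Python 3, here is a small sketch of the same sort. `sorted()` walks the dictionary's keys, and `key=totals.get` ranks each key by its stored total. The dictionary name `totals` and its contents are assumptions, taken from summing the sample data in the original question by hand.

```python
# Totals for the sample data: 20 + 40 and 100 + 30.
totals = {"me@example.com": 60, "you@domain.com": 130}

# Iterating a dict yields its keys; totals.get looks up the value
# for each key, so the keys come back ordered by total, largest first.
ranked = sorted(totals, key=totals.get, reverse=True)

for email in ranked:
    print(email, totals[email])
```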
On Wed, 2017-10-25 at 12:02 -0400, Mark Haney wrote:
<snip>
Not bash but perl:
#####
#!/usr/bin/perl
my %dd;
while (<>) {
    my @f = split;
    $dd{$f[0]}{COUNT} += $f[1];
}
print "\nSums:\n";
for (keys %dd) {
    print "$_\t $dd{$_}{COUNT}\n";
}
####
It takes the data on stdin, sums it into an associative array and prints out the result
Results:
######
$ ./ppp
me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

Sums:
you@domain.com   130
me@example.com   60
######
I'm sure some perl monk can come up with a single line command to do the same thing.
P.
On Oct 25, 2017, at 10:02 AM, Mark Haney mark.haney@neonova.net wrote:
<snip>
This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)
That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.
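    #!/bin/bash
    declare -A totals                    # associative array, needs Bash 4+

    while read -r line
    do
        # $'\t ' is a real tab plus a space; "\t " would also split on a literal "t"
        IFS=$'\t ' read -r -a elems <<< "$line"
        email=${elems[0]}
        subtotal=${elems[1]}

        declare -i n=${totals[$email]}   # an unset entry evaluates to 0
        n=n+$subtotal
        totals[$email]=$n
    done < stats

    for k in "${!totals[@]}"
    do
        printf "%6d %s\n" "${totals[$k]}" "$k"
    done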
You’re making things hard on yourself by insisting on Bash, by the way. This solution is better expressed in Perl, Python, Ruby, Lua, JavaScript…probably dozens of languages.
Although "not my question", thanks, I learned a lot about array processing from your example.
----- Original Message ----- From: "warren" warren@etr-usa.com To: "centos" centos@centos.org Sent: Wednesday, October 25, 2017 11:47:12 AM Subject: Re: [CentOS] [OT] Bash help
<snip>
On Oct 25, 2017, at 11:00 AM, Leroy Tennison leroy@datavoiceint.com wrote:
Although "not my question", thanks, I learned a lot about array processing from your example.
Yeah, it’s amazing how many obscure corners of the Bash language must be tapped to solve such a simple problem. I count 7 features in that script that I almost never use, because I’d have just written this one in Perl if not required to write it in Bash by the OP.
I expect that’s why the features are obscure to you, too: once you need to step beyond POSIX 1988 shell levels, most people just switch to some more powerful language, owing to the dark days when even a POSIX shell was sometimes tricky to find, much less a post-POSIX shell. (Can you say /usr/xpg4/bin/sh ? Yyyeahh…)
That situation threw a long shadow over the shell scripting landscape, where relatively few dare to tread, even today.
Warren Young wrote:
<snip> Let me say this: among the many reasons I like *Nix: in any other o/s, it's "how do I create this report?", and it takes from 2 days to 2 weeks. In *Nix, it's "of all the ways I can create this report, how would I *prefer* to do it...."
No kidding, but in that "other OS" the answer to the question "how can I create that report" is usually "You can't unless you spend money for a third-party application".
----- Original Message ----- From: "m roth" m.roth@5-cent.us To: "centos" centos@centos.org Sent: Wednesday, October 25, 2017 12:27:28 PM Subject: Re: [CentOS] [OT] Bash help
<snip>
Leroy Tennison wrote:
No kidding, but in that "other OS" the answer to the question "how can I create that report" is usually "You can't unless you spend money for a third-party application".
"Other", singular? Did you mean WinDoze, or on an IBM mainframe, or...?
mark "been around the block"
Not enough experience with the mainframe: I meant WinDoze.
----- Original Message ----- From: "m roth" m.roth@5-cent.us To: "centos" centos@centos.org Sent: Wednesday, October 25, 2017 1:02:54 PM Subject: Re: [CentOS] [OT] Bash help
<snip>
Warren Young wrote:
<snip> Associative arrays?
Awk! Awk! (No, I am not a seagull...)
sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'
mark "associative arrays, how do I love thee? Let me tot the arrays..."
On 10/25/2017 01:24 PM, m.roth@5-cent.us wrote:
<snip>
Okay, I'm impressed with this one. I use awk for simple stuff when sed starts getting weird, but this is absolutely elegant. No offense to the other examples, they are all awesome, but I had no idea awk could do this with such little effort. Well, I know what I'm studying up on this weekend.
Mark Haney wrote:
<snip>
The perl script was about the same. It's just, well, I learned awk when I first got into *nix, in '91. Had a project where We were going to be the center and Tell All Agencies The Format of the data they would give us, and we'd load a d/b.... I wrote the d/b loader in C..and then they all said, "sorry, no budget for that, here's the format we've got it in, ya want it or not?"
Before that project finished, I had 30 awk scripts, ranging in length from 100-200 lines (yes, really), to reformat, and validate the data before feeding it to the loader I'd written. The other thing - there may be more succinct ways to write it (my manager, these days, uses regular expressions to the point I have to look what it's doing up), while more than half my career was as a programmer, and I write code such that if I get hit by a car, or take another job, or get called at 16:30 on a Friday, or 02:00, I want to fix the problem without spending hours trying to remember how clever I'd been last year... so I make it easily readable and comprehensible.
awk is just fun.
mark
hrm.. seems like you were missing a }
sort file | awk '{array[$1] += $2;} END { for (i in array) {print i "\t" array[i];}}'
regards,
Jason
Jason Welsh wrote:
hrm.. seems like you were missing a }
sort file | awk '{array[$1] += $2;} END { for (i in array) {print i "\t" array[i];}}'
Oops. Well, it's not vi, it's webmail, so I couldn't check... <g> Thanks.
mark
In article b5215baacd93a6e85efc59947f9b8ed9.squirrel@host290.hostmonster.com, m.roth@5-cent.us wrote:
<snip>
sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'
Why the sort? It doesn't matter in what order the lines are read. Wouldn't this give you the same?
awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}}' <file
Cheers Tony
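The reason the sort is unnecessary is that addition does not care about order: each line only adds to its own key's running total, so any ordering of the input gives the same sums. What can differ is the order in which the results are listed, since awk leaves the iteration order of `for (i in array)` unspecified. A quick sketch of that check in Python 3, using the sample data from the original question; the helper name `sum_by_key` is made up for illustration.

```python
import random


def sum_by_key(lines):
    """Sum the second column per distinct value of the first column."""
    totals = {}
    for line in lines:
        key, value = line.split()
        totals[key] = totals.get(key, 0) + int(value)
    return totals


sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

shuffled = sample[:]
random.shuffle(shuffled)

# Dict equality ignores insertion order, so this compares the sums only.
print(sum_by_key(sorted(sample)) == sum_by_key(shuffled))  # prints True
```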
Tony Mountifield wrote:
<snip>
You're right, not really necessary in this case. I was working with a couple of awk scripts here at work, and it was needed in the middle....
mark
This thread started as "I'm not sure if this is offtopic" and it ended as such a great and fun to read discussion. Thank you all for these great script examples. I really enjoyed reading it.
On 2017-Oct-25 17:10, m.roth@5-cent.us wrote:
<snip>
On 10/25/2017 12:47 PM, Warren Young wrote:
You’re making things hard on yourself by insisting on Bash, by the way. This solution is better expressed in Perl, Python, Ruby, Lua, JavaScript…probably dozens of languages.
Yeah, you're right, I am. An associative array was the first thing I thought of, then realized BASH doesn't do those. I honestly expected there to be a fairly straightforward way to do it in BASH, but I was sadly mistaken. In my defense, I gave virtually no thought to the logic of what I was trying to do until after I'd committed significant time to a BASH script. (Well, maybe that's not a defense, but an indictment.)
As I said, I don't do much scripting anymore as the majority of my time is spent DB tuning and Ansible automation. Not really an excuse, and I appreciate your indulgence(s) in giving me a hand. As embarrassed as I am, I'll just go sit in the corner the rest of the day.
Thanks again.
On Oct 25, 2017, at 11:28 AM, Mark Haney mark.haney@neonova.net wrote:
An associative array was the first thing I thought of, then realized BASH doesn't do those.
But it does: in Bash 4, only.
If you mean you must still use Bash 3 in places, then yeah, you’ve got a problem… one probably best solved by switching to some other language once the program grows beyond Bash 3’s natural scope.
I was trying to think of which languages I know well which require even more difficult solutions than the Bash 4 one. It’s a pretty short list: assembly, C, and MS-DOS batch files. By “C” I’m including anything of its era and outlook: Pascal, Fortran…
I think even Tcl beats Bash 4 on this score, and it’s notoriously minimal in its feature set.
Here’s a brain-bender: You could probably do it with sqlite3 with fewer lines of code than my Bash 4 offering. :)
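To make the sqlite3 remark concrete, here is a rough sketch of the idea: load the two columns into a table and let `GROUP BY` do the summing. It drives SQLite through Python 3's standard `sqlite3` module rather than the `sqlite3` command-line shell, which is a substitution made here for illustration. The table name `stats` and the column names `email` and `total` are assumptions borrowed from the names used earlier in the thread.

```python
import sqlite3

# Sample rows from the original question.
rows = [
    ("me@example.com", 20),
    ("me@example.com", 40),
    ("you@domain.com", 100),
    ("you@domain.com", 30),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE stats (email TEXT, total INTEGER)")
conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)

# One SQL statement replaces the whole accumulate-and-print loop.
result = conn.execute(
    "SELECT email, SUM(total) FROM stats GROUP BY email ORDER BY email"
).fetchall()
conn.close()

for email, total in result:
    print(email, total)
```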
I honestly expected there to be a fairly straightforward way to do it in BASH, but I was sadly mistaken.
Oh, I don’t know, there must be a way to do it without associative arrays, but you’d only get points for the masochism value in doing without.
On 10/25/2017 3:34 PM, Warren Young wrote:
<snip>
Array N holds the names and array T holds the totals. For each line in the file, you iterate through N to find the name and then add the number to the same index in T (or create a new entry in both arrays if you don't find it). Then you just have to iterate through both arrays and print off the names from N and the totals from T. It's a pain, but it's doable.
Sorry, I'm too lazy to write code for this... :)
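For the curious, the two-array approach described above might look roughly like this. It is sketched in Python 3 with plain lists standing in for the shell arrays, so it shows the logic rather than Bash 3 syntax. The list names follow the N and T of the description; the function name is made up.

```python
def sum_with_parallel_lists(lines):
    """Sum counts per name using two index-aligned lists, no dict."""
    names = []   # "array N": one entry per distinct name
    totals = []  # "array T": total for the name at the same index
    for line in lines:
        name, count = line.split()
        # Linear search through N for the name.
        for i in range(len(names)):
            if names[i] == name:
                totals[i] += int(count)
                break
        else:
            # Loop finished without break: name not seen yet,
            # so create a new entry in both lists.
            names.append(name)
            totals.append(int(count))
    return names, totals


sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

names, totals = sum_with_parallel_lists(sample)
for name, total in zip(names, totals):
    print(name, total)
```

The pain shows up in the inner loop: every input line costs a scan of all the names seen so far, which an associative array avoids.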
Once upon a time, Warren Young warren@etr-usa.com said:
I was trying to think of which languages I know well which require even more difficult solutions than the Bash 4 one. It’s a pretty short list: assembly, C, and MS-DOS batch files. By “C” I’m including anything of its era and outlook: Pascal, Fortran…
Heh, even C on SVR4 and newer (including POSIX from 2001) have pretty straight-forward hash routines: hcreate(), hsearch(), and hdestroy().
On Wed, Oct 25, 2017 at 10:47:12AM -0600, Warren Young wrote:
<snip>
A slightly different approach written for ksh but seems to also work with bash 4.
    typeset -A arr

    while read addr cnt
    do
        arr[$addr]=$(( ${arr[$addr]:-0} + cnt))
    done < ${1}

    for a in ${!arr[*]}
    do
        printf "%6d %s\n" ${arr[$a]} $a
    done

Jon
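The `${arr[$addr]:-0}` expansion in the script above supplies 0 when the key has no entry yet, which is what removes the need for an explicit "have I seen this address?" test. A sketch of the same default-value idiom in Python 3, where `dict.get(key, 0)` plays that role; the variable names mirror the ksh script and the sample data is from the original question.

```python
sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

arr = {}
for line in sample:
    addr, cnt = line.split()
    # get(addr, 0) returns 0 for an address not seen before,
    # like ${arr[$addr]:-0} in the shell version.
    arr[addr] = arr.get(addr, 0) + int(cnt)

for addr in arr:
    print("%6d %s" % (arr[addr], addr))
```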
You’re making things hard on yourself by insisting on Bash, by the way.
I'd always assumed that shell scripting was a kind of sado masochistic medium allowing people who don't get out much to inflict horrible torture on each other. It certainly causes me great pain every time I try and read a bash script with more than a couple of clauses.
I'm just taking over a bunch of bash CI plumbing that seems to have been written by a committee of Manson family members.
On Wed, Oct 25, 2017 at 9:47 AM, Warren Young warren@etr-usa.com wrote:
This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)
That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.
Nonsense. Every POSIX shell has an associative array called "the filesystem."
(hash=$(mktemp -d); while read addr msgs; do echo $msgs >> "$hash/$addr"; done; cd "$hash"; for x in *; do echo "$x $(paste -s -d+ < $x | bc)"; done;) < msg-counts
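Spelled out, the trick is: one file per key in a scratch directory, append each count to the file named after its address, then sum each file's lines. Here is a sketch of the same idea in Python 3, using only the standard library. The function name and sample data are for illustration, and unlike the one-liner this version removes the scratch directory when done. As with the shell version, a key containing "/" would not work as a file name.

```python
import os
import tempfile


def sum_via_filesystem(lines):
    """Use a temp directory as the hash: file name = key, lines = values."""
    totals = {}
    with tempfile.TemporaryDirectory() as hash_dir:
        for line in lines:
            addr, msgs = line.split()
            # Appending to "<dir>/<addr>" is the "insert into hash" step.
            with open(os.path.join(hash_dir, addr), "a") as fh:
                fh.write(msgs + "\n")
        # Listing the directory enumerates the keys.
        for addr in sorted(os.listdir(hash_dir)):
            with open(os.path.join(hash_dir, addr)) as fh:
                totals[addr] = sum(int(n) for n in fh)
    return totals


sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

result = sum_via_filesystem(sample)
for addr in result:
    print(addr, result[addr])
```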
On Oct 26, 2017, at 10:37 AM, Gordon Messmer gordon.messmer@gmail.com wrote:
On Wed, Oct 25, 2017 at 9:47 AM, Warren Young warren@etr-usa.com wrote:
CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.
Nonsense. Every POSIX shell has an associative array called "the filesystem.”
Ah, *there’s* our masochist. I knew we had at least one around here somewhere. :)
Takes one to know one, I suppose: not long ago, I proposed using the filesystem to implement a DAG in shell. ;)