[OT] Bash help


Mark Haney

25 Oct 2017

4:02 p.m.

I know this is for CentOS stuff, but I'm at a loss on how to build a script that does what I need it to do. It's probably really logically simple, I'm just not seeing it. Hopefully someone will take pity on me and at least give me a big hint.

I have a file with two columns 'email' and 'total' like this:

me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

I need to get the total number of messages for each email address. This type of code has always been the hardest for me for whatever reason, and honestly, I don't write many scripts these days. I'm struggling to get pseudocode that works, much less a working script. I know this is off topic, and if it gets modded out, that's fine. I just can't wrap my brain around it.

-- Mark Haney Network Engineer at NeoNova 919-460-3330 option 1 mark.haney@neonova.net www.neonova.net


Robert Arkiletian

25 Oct

4:33 p.m.

On Wed, Oct 25, 2017 at 9:02 AM, Mark Haney mark.haney@neonova.net wrote:

...

I know this is for CentOS stuff, but I'm at a loss on how to build a script that does what I need it to do. It's probably really logically simple, I'm just not seeing it. Hopefully someone will take pity on me and at least give me a big hint.

I have a file with two columns 'email' and 'total' like this:

me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

I need to get the total number of messages for each email address. This type of code has always been the hardest for me for whatever reason, and honestly, I don't write many scripts these days. I'm struggling to get pseudocode that works, much less a working script. I know this is off topic, and if it gets modded out, that's fine. I just can't wrap my brain around it.

here is a python solution

#!/usr/bin/python
#python 2 (did not check if it works)
f=open('yourfilename')
D={}
for line in f:
    email,num = line.split()
    if email in D:
        D[email] = D[email] + num
    else:
        D[email] = num
f.close()
for key in D:
    print key, D[key]

Mark Haney

4:41 p.m.

On 10/25/2017 12:33 PM, Robert Arkiletian wrote:

...


That gets me closer, I think. It's concatenating the number of messages, but it's a start. Thanks.

-- Mark Haney Network Engineer at NeoNova 919-460-3330 option 1 mark.haney@neonova.net www.neonova.net

Bowie Bailey

4:45 p.m.

On 10/25/2017 12:41 PM, Mark Haney wrote:

...


That gets me closer, I think. It's concatenating the number of messages, but it's a start. Thanks.

I do this kind of thing on a fairly regular basis with a Perl one-liner:

perl -ne '($email, $num) = split; $tot{$email} += $num; END { for $email (keys %tot) { print "$email $tot{$email}\n" } }' < yourfile

-- Bowie

Robert Arkiletian

4:59 p.m.

On Wed, Oct 25, 2017 at 9:41 AM, Mark Haney mark.haney@neonova.net wrote:

...

On 10/25/2017 12:33 PM, Robert Arkiletian wrote:

...
here is a python solution

#!/usr/bin/python
#python 2 (did not check if it works)
f=open('yourfilename')
D={}
for line in f:
    email,num = line.split()
    if email in D:
        D[email] = D[email] + int(num)
    else:
        D[email] = int(num)
f.close()
for key in D:
    print key, D[key]

That gets me closer, I think. It's concatenating the number of messages, but it's a start. Thanks.

--

Mark Haney Network Engineer at NeoNova 919-460-3330 option 1 mark.haney@neonova.net www.neonova.net

CentOS mailing list CentOS@centos.org https://lists.centos.org/mailman/listinfo/centos
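The concatenation Mark saw comes from `line.split()` returning strings: `+` on two strings joins them, while `+` on two integers adds them, which is why the `int(num)` change helps. Below is a minimal Python 3 sketch of the corrected logic (the thread's script is Python 2). The function name `sum_by_email` and the inline sample list are illustrative assumptions, not part of the original post.

```python
def sum_by_email(lines):
    """Return {email: total}, converting each count to int before adding."""
    totals = {}
    for line in lines:
        email, num = line.split()
        if email in totals:
            totals[email] = totals[email] + int(num)
        else:
            totals[email] = int(num)
    return totals


# Sample rows taken from the original question.
sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

print("20" + "40")            # strings concatenate: 2040
print(int("20") + int("40"))  # integers add: 60

for email, total in sum_by_email(sample).items():
    print(email, total)
```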

Robert Arkiletian

7:08 p.m.

On Wed, Oct 25, 2017 at 9:59 AM, Robert Arkiletian robark@gmail.com wrote:


not to be outdone, python can sort them based on the totals

for k in sorted(D, key=d.get, reverse=True):
    print k, D[k]

Robert Arkiletian

7:13 p.m.

On Wed, Oct 25, 2017 at 12:08 PM, Robert Arkiletian robark@gmail.com wrote:

not to be outdone, python can sort them based on the totals

for k in sorted(D, key=d.get, reverse=True):

oops. that's a capital D.get

for k in sorted(D, key=D.get, reverse=True):
    print k, D[k]
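For readers on Python 3, the same sort-by-total idea might look like the sketch below. The helper name `by_total_desc` and the hard-coded totals are assumptions made for illustration; `sorted` and `dict.get` behave as in the corrected line above.

```python
def by_total_desc(totals):
    """Return (email, total) pairs ordered from largest total to smallest."""
    return [(email, totals[email])
            for email in sorted(totals, key=totals.get, reverse=True)]


# Totals for the sample data in the original question.
totals = {"me@example.com": 60, "you@domain.com": 130}

for email, total in by_total_desc(totals):
    print(email, total)
# you@domain.com 130
# me@example.com 60
```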

Pete Biggs

4:44 p.m.

On Wed, 2017-10-25 at 12:02 -0400, Mark Haney wrote:

...

I know this is for CentOS stuff, but I'm at a loss on how to build a script that does what I need it to do. It's probably really logically simple, I'm just not seeing it. Hopefully someone will take pity on me and at least give me a big hint.

I have a file with two columns 'email' and 'total' like this:

me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

I need to get the total number of messages for each email address. This type of code has always been the hardest for me for whatever reason, and honestly, I don't write many scripts these days. I'm struggling to get pseudocode that works, much less a working script. I know this is off topic, and if it gets modded out, that's fine. I just can't wrap my brain around it.

Not bash but perl:

#####
#!/usr/bin/perl
my %dd;
while (<>) {
    my @f=split;
    $dd{$f[0]}{COUNT}+=$f[1];
}
print "\nSums:\n";
for (keys %dd) {
    print "$_\t $dd{$_}{COUNT}\n";
};
####

It takes the data on stdin, sums it into an associative array and prints out the result

Results:
######
$ ./ppp
me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

Sums:
you@domain.com 130
me@example.com 60
######

I'm sure some perl monk can come up with a single line command to do the same thing.

Warren Young

4:47 p.m.

On Oct 25, 2017, at 10:02 AM, Mark Haney mark.haney@neonova.net wrote:

...

I have a file with two columns 'email' and 'total' like this:

me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

I need to get the total number of messages for each email address.

This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)

That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

#!/bin/bash
declare -A totals

while read -r line
do
    # $'\t ' is a real tab plus a space; a plain "\t " would split on backslash and "t"
    IFS=$'\t ' read -r -a elems <<< "$line"
    email=${elems[0]}
    subtotal=${elems[1]}

    declare -i n=${totals[$email]}
    n=n+$subtotal
    totals[$email]=$n
done < stats

for k in "${!totals[@]}"
do
    printf "%6d %s\n" "${totals[$k]}" "$k"
done

You’re making things hard on yourself by insisting on Bash, by the way. This solution is better expressed in Perl, Python, Ruby, Lua, JavaScript…probably dozens of languages.
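As one illustration of the "better expressed in Python" remark, here is a rough Python 3 sketch of the same associative-array idea using `collections.defaultdict`. The function name `totals_from`, the in-memory `stats` stream, and the choice to skip lines that do not have exactly two fields are all assumptions; a real script would read the actual file instead.

```python
import io
from collections import defaultdict


def totals_from(stream):
    """Sum the second column per address from any iterable of text lines."""
    totals = defaultdict(int)  # missing keys start at 0, like an unset bash element
    for line in stream:
        fields = line.split()
        if len(fields) != 2:
            continue  # assumption: silently skip blank or malformed lines
        email, subtotal = fields
        totals[email] += int(subtotal)
    return dict(totals)


# Stand-in for the "stats" file used by the bash script.
stats = io.StringIO(
    "me@example.com 20\n"
    "me@example.com 40\n"
    "you@domain.com 100\n"
    "you@domain.com 30\n"
)

for email, total in totals_from(stats).items():
    print(f"{total:6d} {email}")
```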

Leroy Tennison

5 p.m.

Although "not my question", thanks, I learned a lot about array processing from your example.


Warren Young

5:12 p.m.

On Oct 25, 2017, at 11:00 AM, Leroy Tennison leroy@datavoiceint.com wrote:

...

Although "not my question", thanks, I learned a lot about array processing from your example.

Yeah, it’s amazing how many obscure corners of the Bash language must be tapped to solve such a simple problem. I count 7 features in that script that I almost never use, because I’d have just written this one in Perl if not required to write it in Bash by the OP.

I expect that’s why the features are obscure to you, too: once you need to step beyond POSIX 1988 shell levels, most people just switch to some more powerful language, owing to the dark days when even a POSIX shell was sometimes tricky to find, much less a post-POSIX shell. (Can you say /usr/xpg4/bin/sh ? Yyyeahh…)

That situation threw a long shadow over the shell scripting landscape, where relatively few dare to tread, even today.

m.roth＠5-cent.us

5:27 p.m.

Warren Young wrote:

...

On Oct 25, 2017, at 11:00 AM, Leroy Tennison leroy@datavoiceint.com wrote:

...
Although "not my question", thanks, I learned a lot about array processing from your example.

Yeah, it’s amazing how many obscure corners of the Bash language must be tapped to solve such a simple problem. I count 7 features in that script that I almost never use, because I’d have just written this one in Perl if not required to write it in Bash by the OP.

<snip> Let me say this: among the many reasons I like *Nix: in any other o/s, it's "how do I create this report", and it takes from 2 days to 2 weeks. In *Nix, it's "of all the ways I can create this report, how would I *prefer* to do it...."

Leroy Tennison

5:37 p.m.

No kidding, but in that "other OS" the answer to the question "how can I create that report" is usually "You can't unless you spend money for a third-party application".


m.roth＠5-cent.us

6:02 p.m.

Leroy Tennison wrote:

...

No kidding, but in that "other OS" the answer to the question "how can I create that report" is usually "You can't unless you spend money for a third-party application".

"Other", singular? Did you mean WinDoze, or on an IBM mainframe, or...?

mark "been around the block"


Leroy Tennison

6:27 p.m.

Not enough experience with the mainframe: I meant WinDoze.


m.roth＠5-cent.us

5:24 p.m.

Warren Young wrote:

...

On Oct 25, 2017, at 10:02 AM, Mark Haney mark.haney@neonova.net wrote:

...
I have a file with two columns 'email' and 'total' like this:

me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

I need to get the total number of messages for each email address.

This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)

That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

<snip> Associative arrays?

Awk! Awk! (No, I am not a seagull...)

sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'

mark "associative arrays, how do I love thee? Let me tot the arrays..."

Mark Haney

5:48 p.m.

On 10/25/2017 01:24 PM, m.roth@5-cent.us wrote:

...

...
This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)

That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

<snip> Associative arrays?

Awk! Awk! (No, I am not a seagull...)

sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'
   mark "associative arrays, how do I love thee? Let me tot the arrays..."

Okay, I'm impressed with this one. I use awk for simple stuff when sed starts getting weird, but this is absolutely elegant. No offense to the other examples, they are all awesome, but I had no idea awk could do this with such little effort. Well, I know what I'm studying up on this weekend.

-- Mark Haney Network Engineer at NeoNova 919-460-3330 option 1 mark.haney@neonova.net www.neonova.net

m.roth＠5-cent.us

6:01 p.m.

Mark Haney wrote:

...

On 10/25/2017 01:24 PM, m.roth@5-cent.us wrote:

...
...
This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)

That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

<snip> Associative arrays?

Awk! Awk! (No, I am not a seagull...)

sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'
   mark "associative arrays, how do I love thee? Let me tot the arrays..."
Okay, I'm impressed with this one. I use awk for simple stuff when sed starts getting weird, but this is absolutely elegant. No offense to the other examples, they are all awesome, but I had no idea awk could do this with such little effort. Well, I know what I'm studying up on this weekend.

The perl script was about the same. It's just, well, I learned awk when I first got into *nix, in '91. Had a project where We were going to be the center and Tell All Agencies The Format of the data they would give us, and we'd load a d/b.... I wrote the d/b loader in C..and then they all said, "sorry, no budget for that, here's the format we've got it in, ya want it or not?"

Before that project finished, I had 30 awk scripts, ranging in length from 100-200 lines (yes, really), to reformat, and validate the data before feeding it to the loader I'd written. The other thing - there may be more succinct ways to write it (my manager, these days, uses regular expressions to the point I have to look what it's doing up), while more than half my career was as a programmer, and I write code such that if I get hit by a car, or take another job, or get called at 16:30 on a Friday, or 02:00, I want to fix the problem without spending hours trying to remember how clever I'd been last year... so I make it easily readable and comprehensible.

awk is just fun.

mark

Jason Welsh

6:49 p.m.

hrm.. seems like you were missing a }

sort file | awk '{array[$1] += $2;} END { for (i in array) {print i "\t" array[i];}}'

regards,

Jason


m.roth＠5-cent.us

7:02 p.m.

Jason Welsh wrote:

...

hrm.. seems like you were missing a }

sort file | awk '{array[$1] += $2;} END { for (i in array) {print i "\t" array[i];}}'

Oops. Well, it's not vi, it's webmail, so I couldn't check... <g> Thanks.

mark


tony＠softins.co.uk

7:24 p.m.

In article b5215baacd93a6e85efc59947f9b8ed9.squirrel@host290.hostmonster.com, m.roth@5-cent.us wrote:

...

Warren Young wrote:

...
On Oct 25, 2017, at 10:02 AM, Mark Haney mark.haney@neonova.net wrote:

...
I have a file with two columns 'email' and 'total' like this:

me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

I need to get the total number of messages for each email address.

This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)

That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

<snip> Associative arrays?

Awk! Awk! (No, I am not a seagull...)

sort file | awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}'

Why the sort? It doesn't matter in what order the lines are read. Wouldn't this give you the same?

awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}}' <file

Cheers Tony

-- Tony Mountifield Work: tony@softins.co.uk - http://www.softins.co.uk Play: tony@mountifield.org - http://tony.mountifield.org
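Tony's point, that per-key sums do not depend on the order in which lines are read, can be checked with a small Python 3 sketch. The function name `totals` and the shuffled copy are illustrative assumptions; the rows are the sample data from the original question.

```python
import random


def totals(rows):
    """Accumulate a per-address sum from (email, count) pairs."""
    out = {}
    for email, n in rows:
        out[email] = out.get(email, 0) + n
    return out


rows = [
    ("me@example.com", 20),
    ("me@example.com", 40),
    ("you@domain.com", 100),
    ("you@domain.com", 30),
]

shuffled = rows[:]
random.shuffle(shuffled)

# Addition is commutative, so sorted, original, and shuffled input agree.
print(totals(rows) == totals(sorted(rows)) == totals(shuffled))  # True
```

Sorting first only matters when a later step depends on adjacent lines, which is not the case for a plain sum.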

m.roth＠5-cent.us

9:10 p.m.

Tony Mountifield wrote:

...


Why the sort? It doesn't matter in what order the lines are read. Wouldn't this give you the same?

awk '{ array[$1] += $2;} END { for (i in array) { print i "\t" array[i];}}' <file

You're right, not really necessary in this case. I was working with a couple of awk scripts here at work, and it was needed in the middle....

mark

Proxy

27 Oct

8:07 a.m.

This thread started as "I'm not sure if this is offtopic" and it ended as such a great and fun to read discussion. Thank you all for these great script examples. I really enjoyed reading it.


Mark Haney

25 Oct

5:28 p.m.

On 10/25/2017 12:47 PM, Warren Young wrote:

...

You’re making things hard on yourself by insisting on Bash, by the way. This solution is better expressed in Perl, Python, Ruby, Lua, JavaScript…probably dozens of languages.

Yeah, you're right, I am. An associative array was the first thing I thought of, then realized BASH doesn't do those. I honestly expected there to be a fairly straightforward way to do it in BASH, but I was sadly mistaken. In my defense, I gave virtually no thought to the logic of what I was trying to do until after I'd committed significant time to a BASH script. (Well maybe that's not a defense, but an indictment.)

As I said, I don't do much scripting anymore as the majority of my time is spent DB tuning and Ansible automation. Not really an excuse, and I appreciate your indulgence(s) in giving me a hand. As embarrassed as I am, I'll just go sit in the corner the rest of the day.

Thanks again.

-- Mark Haney Network Engineer at NeoNova 919-460-3330 option 1 mark.haney@neonova.net www.neonova.net

Warren Young

7:34 p.m.

On Oct 25, 2017, at 11:28 AM, Mark Haney mark.haney@neonova.net wrote:

...

An associative array was the first thing I thought of, then realized BASH doesn't do those.

But it does: in Bash 4, only.

If you mean you must still use Bash 3 in places, then yeah, you’ve got a problem… one probably best solved by switching to some other language once the program grows beyond Bash 3’s natural scope.

I was trying to think of which languages I know well which require even more difficult solutions than the Bash 4 one. It’s a pretty short list: assembly, C, and MS-DOS batch files. By “C” I’m including anything of its era and outlook: Pascal, Fortran…

I think even Tcl beats Bash 4 on this score, and it’s notoriously minimal in its feature set.

Here’s a brain-bender: You could probably do it with sqlite3 with fewer lines of code than my Bash 4 offering. :)

...

I honestly expected there to be a fairly straight forward way to do it in BASH, but I was sadly mistaken.

Oh, I don’t know, there must be a way to do it without associative arrays, but you’d only get points for the masochism value in doing without.

Bowie Bailey

7:45 p.m.

On 10/25/2017 3:34 PM, Warren Young wrote:

...


Oh, I don’t know, there must be a way to do it without associative arrays, but you’d only get points for the masochism value in doing without.

Array N holds the names and array T holds the totals. For each line in the file, you iterate through N to find the name and then add the number to the same index in T (or create a new entry in both arrays if you don't find it). Then you just have to iterate through both arrays and print off the names from N and the totals from T. It's a pain, but it's doable.

Sorry, I'm too lazy to write code for this... :)

-- Bowie
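The two-array approach Bowie describes can be sketched as follows. This is written in Python 3 purely to show the logic; the function name `totals_parallel` is an assumption, and the lists `names` and `sums` stand in for his arrays N and T.

```python
def totals_parallel(lines):
    """Sum counts per name using two parallel lists instead of a hash."""
    names = []  # array N: one entry per distinct name, in first-seen order
    sums = []   # array T: running total at the same index as the name in N
    for line in lines:
        name, count = line.split()
        for i, known in enumerate(names):
            if known == name:
                sums[i] += int(count)  # found: add to the matching index
                break
        else:
            # loop ended without a match: create a new entry in both lists
            names.append(name)
            sums.append(int(count))
    return names, sums


sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

names, sums = totals_parallel(sample)
for name, total in zip(names, sums):
    print(name, total)
```

The linear search through `names` on every line is what makes this a pain compared with a hash: the work grows with the number of distinct names times the number of lines.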

Chris Adams

7:55 p.m.

Once upon a time, Warren Young warren@etr-usa.com said:

...

I was trying to think of which languages I know well which require even more difficult solutions than the Bash 4 one. It’s a pretty short list: assembly, C, and MS-DOS batch files. By “C” I’m including anything of its era and outlook: Pascal, Fortran…

Heh, even C on SVR4 and newer (including POSIX from 2001) have pretty straight-forward hash routines: hcreate(), hsearch(), and hdestroy().

-- Chris Adams linux@cmadams.net

Jon LaBadie

7:58 p.m.

On Wed, Oct 25, 2017 at 10:47:12AM -0600, Warren Young wrote:

...

On Oct 25, 2017, at 10:02 AM, Mark Haney mark.haney@neonova.net wrote:

...
I have a file with two columns 'email' and 'total' like this:

me@example.com 20
me@example.com 40
you@domain.com 100
you@domain.com 30

I need to get the total number of messages for each email address.

This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)

That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

#!/bin/bash
declare -A totals

while read -r line
do
    # $'\t ' is a real tab plus a space; a plain "\t " would split on backslash and "t"
    IFS=$'\t ' read -r -a elems <<< "$line"
    email=${elems[0]}
    subtotal=${elems[1]}

    declare -i n=${totals[$email]}
    n=n+$subtotal
    totals[$email]=$n
done < stats

for k in "${!totals[@]}"
do
    printf "%6d %s\n" "${totals[$k]}" "$k"
done

A slightly different approach written for ksh but seems to also work with bash 4.

typeset -A arr

while read addr cnt
do
    arr[$addr]=$(( ${arr[$addr]:-0} + cnt))
done < ${1}

for a in ${!arr[*]}
do
    printf "%6d %s\n" ${arr[$a]} $a
done

Jon

-- Jon H. LaBadie jon@jgcomp.com 11226 South Shore Rd. (703) 787-0688 (H) Reston, VA 20190 (703) 935-6720 (C)
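The `${arr[$addr]:-0}` expansion in Jon's script supplies 0 when the key has not been seen yet. A rough Python 3 analogue is `dict.get(key, 0)`, sketched below. The function name `totals_from_file` and the temporary file that stands in for the script's `${1}` argument are assumptions for the sake of a self-contained example.

```python
import os
import tempfile


def totals_from_file(path):
    """Read 'address count' lines from path and return per-address totals."""
    arr = {}
    with open(path) as fh:
        for line in fh:
            addr, cnt = line.split()
            # arr.get(addr, 0) plays the role of ${arr[$addr]:-0}
            arr[addr] = arr.get(addr, 0) + int(cnt)
    return arr


# Write the sample data to a temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("me@example.com 20\nme@example.com 40\n"
              "you@domain.com 100\nyou@domain.com 30\n")
    path = tmp.name

try:
    for addr, total in totals_from_file(path).items():
        print("%6d %s" % (total, addr))
finally:
    os.unlink(path)  # clean up the temporary file
```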

Andrew Holway

26 Oct

8:39 a.m.

...

You’re making things hard on yourself by insisting on Bash, by the way.

I'd always assumed that shell scripting was a kind of sado masochistic medium allowing people who don't get out much to inflict horrible torture on each other. It certainly causes me great pain every time I try and read a bash script with more than a couple of clauses.

I'm just taking over a bunch of bash CI plumbing that seems to have been written by a committee of Manson family members.

Gordon Messmer

4:37 p.m.

On Wed, Oct 25, 2017 at 9:47 AM, Warren Young warren@etr-usa.com wrote:

...

This screams out for associative arrays. (Also called hashes, dictionaries, maps, etc.)

That does limit you to CentOS 7+, or maybe 6+, as I recall. CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

Nonsense. Every POSIX shell has an associative array called "the filesystem."

(hash=$(mktemp -d); while read addr msgs; do echo $msgs >> "$hash/$addr"; done; cd "$hash"; for x in *; do echo "$x $(paste -s -d+ < $x | bc)"; done;) < msg-counts
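The filesystem-as-hash trick above uses one file per address as the "key" and the appended lines as the values to be summed. A minimal Python 3 sketch of the same idea follows, for illustration only. The function name `totals_via_files` is an assumption, and, like the shell one-liner, it assumes every address is usable as a file name (no `/`, for example).

```python
import os
import tempfile


def totals_via_files(lines):
    """Use a scratch directory as the hash: one file per address."""
    totals = {}
    with tempfile.TemporaryDirectory() as hash_dir:
        for line in lines:
            addr, msgs = line.split()
            # "Insert": append the count to the file named after the address.
            with open(os.path.join(hash_dir, addr), "a") as fh:
                fh.write(msgs + "\n")
        # "Iterate keys": each file name is an address; sum its lines.
        for name in sorted(os.listdir(hash_dir)):
            with open(os.path.join(hash_dir, name)) as fh:
                totals[name] = sum(int(n) for n in fh)
    return totals


sample = [
    "me@example.com 20",
    "me@example.com 40",
    "you@domain.com 100",
    "you@domain.com 30",
]

for addr, total in totals_via_files(sample).items():
    print(addr, total)
```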

Warren Young

27 Oct

10:45 a.m.

On Oct 26, 2017, at 10:37 AM, Gordon Messmer gordon.messmer@gmail.com wrote:

...

On Wed, Oct 25, 2017 at 9:47 AM, Warren Young warren@etr-usa.com wrote:

...
CentOS 5 is definitely out, as that ships Bash 3, which lacks this feature.

Nonsense. Every POSIX shell has an associative array called "the filesystem.”

Ah, *there’s* our masochist. I knew we had at least one around here somewhere. :)

Takes one to know one, I suppose: not long ago, I proposed using the filesystem to implement a [DAG][1] in shell. ;)

[1]: https://en.wikipedia.org/wiki/Directed_acyclic_graph

Walter H.

11:17 a.m.

On 25.10.2017 18:47, Warren Young wrote:

...

You’re making things hard on yourself by insisting on Bash, by the way. This solution is better expressed in Perl, Python, Ruby, Lua, JavaScript…probably dozens of languages.

or just awk ...
