[CentOS] Inquiry: How to compare two files but not in line-by-line basis?

Mon Dec 7 16:16:44 UTC 2009

m.roth at 5-cent.us wrote:
>> mark wrote:
>>> Les Mikesell wrote:
>>>> Awk is just too weird for normal people.  I wouldn't even suggest
>>>> reading that manual.  If you can't do what you want with regexps and a
>>>> pipeline of simpler programs, you might as well use perl.
>>> <Looks around, yeah, this *is* a list for sysadmins of Linux....>
> Reading the response, I realize you were serious, not being funny, as I
> thought.
> 

Yes, I'm serious that if you don't already know awk, there is little to 
be gained from looking at it now.  Perl can do everything awk can do and 
more, while shell scripts can do the simpler things.

>> Who have probably almost all started something in awk and ended up either
>> needing a pipeline of other programs or switching to perl.  If your
>> machine is powerful enough to run perl (and I can't imagine one that
>> isn't in this century) you might as well use it because it does anything
>> awk can do and more.
> 
> I started seeing references to perl in the early nineties, so it ran on
> those machines. Also, I remember running into Larry Wall, and responding
> to him very irritatedly, around '93 or '94, when he showed up on
> comp.language.awk, and told someone the answer to his question was to go
> to perl. Now, I really like perl, but for some things - like where I want
> to do nothing but process one or maybe two text files at a time, and want
> to loop through the whole thing, it's simpler.

No, it is just different.  If you want perl to loop over the input for 
you, it can (see the -n and -p switches).  Try the a2p translator, which 
converts awk scripts to perl.

>> awk is almost as complicated to learn but can't do as much and is harder
> 
> "Almost as complicated to learn"? I had no trouble learning it around, oh,
> '91. But then, at that point I'd been programming professionally for more
> than 10 years. If you know perl, and you can program shell, and if you
> know any other language (unless *all* you know is Objectionably Oriented
> languages), there's minimal ramp-up time.

If you know perl, there's no point in downgrading to awk.  If you don't 
know either, you will find awk to be weird and unlike anything else. 
Back when it was the only way to do math in a shell script it might have 
been worth the trouble.

> awk standardized pretty much, according to what I've read - possibly man
> pages on Sun 3's or Irix - around '83. perl was *NOT* part of std. distros
> until the end of the nineties. And they do a lot of the same thing. To
> some degree, it's a matter of preferences, and to put down awk as "almost
> as complicated as perl to learn" does not impress me.

OK, I'll revise that and say it is much, much harder to use awk to 
accomplish tasks in general than it is with perl.  First there is the 
problem of the things that awk just can't do at all - like inputting 
data from places other than stdin or files - so you'll end up embedding 
awk in a shell script with other tools doing the heavy lifting, and 
probably having to arrange shell variable expansion into the awk script.  
Then there is the real advantage of perl over almost every other 
language, which is that anything you are likely to want to do will 
already have been done and is available as a module on CPAN - so you 
will probably only have to write half a page or so yourself, even for 
large jobs and things that get data from sockets, databases or URLs.
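
As a rough illustration of the shell-variable point above, here is a 
minimal sketch assuming a POSIX shell and awk.  The function name 
match_first_field and the sample data are made up for the example.  It 
hands the shell value to awk with -v so the awk program can stay inside 
single quotes, which is one common way to avoid splicing shell 
expansions into the awk text:

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file $2 whose first field
# equals $1.  The value is passed to awk with -v instead of being
# spliced into the program text, so the awk program itself can stay
# inside single quotes and the shell never rewrites it.
match_first_field() {
    wanted=$1
    file=$2
    awk -v want="$wanted" '$1 == want { print }' "$file"
}

# Made-up sample data for illustration.
tmp=$(mktemp)
printf 'alice 10\nbob 20\nalice 30\n' > "$tmp"
match_first_field alice "$tmp"
rm -f "$tmp"
```

Note that awk interprets backslash escapes in a -v value, so this 
sketch is only safe for values that contain no backslashes.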

> <snip>
>> Shell commands are just what you'd type so you have to know it anyway so
>> there is nothing special about making a program out of it. Other than
>> grep using regexps the man pages for those programs are probably
> 
> And regexes have always been considered a black art - there's always the
> "how many escapes do you need for this", esp. if it's in a script.

You can't get too far without understanding shell parsing even if you 
just type stuff on the command line.  But, regexps within a perl script 
don't have to deal with this at all.
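
To make the escaping question concrete, here is a small sketch 
assuming a POSIX shell and grep; count_prices and the sample file are 
hypothetical names used only for illustration.  The pattern is kept in 
single quotes so that the shell passes the backslash and the dollar 
sign through untouched and only grep's own regexp rules apply:

```shell
#!/bin/sh
# Hypothetical helper: count the lines in file $1 that contain a
# literal dollar sign followed by at least one digit.
# Single quotes keep the shell from touching the backslash or the $,
# so grep receives the pattern exactly as written.
count_prices() {
    grep -c '\$[0-9][0-9]*' "$1"
}

# Made-up sample data for illustration.
tmp=$(mktemp)
printf 'cost $5\nfree\n$12 total\n' > "$tmp"
count_prices "$tmp"
rm -f "$tmp"
```

Inside double quotes the shell would process the backslash before 
grep ever saw it, which is where the "how many escapes" counting 
usually starts.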

>> literally a page. No one is going to understand awk or perl after reading
>> a page. Personally I'd probably
> 
> So, you don't actually know any programming, and it sounds like you want
> to learn as little as possible, even though doing so will make your life
> easier upstream.
> <snip>
> Try it - you might find that to be the case.

Wrong conclusion.  I've started a lot of things in shell and awk and hit 
dead ends when I needed functionality that they couldn't handle - and 
ended up starting over in perl.  Now I would only start in shell if I 
know the simpler utilities can handle the whole job (which, as data is 
increasingly handled in databases and xml over networks, is increasingly 
rare).

> Oh, and if you're on this list, then the mundane world doesn't consider
> you "normal", anyway; you're a geek, or a wonk, or a
> fill-in-the-stereotype-put-down-name, not a "k3wl dud3".

Agreed, but for this group, understanding regexps and the shell is 
fairly essential and needing perl's full functionality is probably 
common, whereas awk is just a historical oddity.  It still works for its 
old tasks, but it's not up to the ways data is currently handled and is 
likely to be a waste of time to consider if you don't already understand 
its internal parser.

-- 
   Les Mikesell
    lesmikesell at gmail.com