I need to review a logfile with sed and cut out all the lines that start with a certain word. The problem is that this word begins after some amount of whitespace, and unless I search for whitespace at the beginning followed by "word", I may hit "word" somewhere it legitimately appears, which is why I don't just search for "word" alone...
Anyone know how to make sed accomplish this?
Thanks! jlc
On Mon, Jan 05, 2009, Joseph L. Casale wrote:
I need to review a logfile with Sed and cut out all the lines that start with a certain word, problem is this word begins after some amount of whitespace and unless I search for whitespace at the beginning followed by "word" I may encounter "word" somewhere legitimately hence why I don't just search for "word" only...
Anyone know how to make sed accomplish this?
There's always more than one way to do something like this:
sed -n '/^[ \t]*word\s/p' /var/log/messages
pcregrep '^\s*word\b' /var/log/messages
awk '$1 == "word"{print}' /var/log/messages
Bill
awk '$1 == "word"{print}' /var/log/messages
This example assumes that word is the first field and that it consists only of "word". If the first field is "word1" this won't match.
Fixes for this are
awk '$1 ~ "word"{print}'
(this matches any occurrence of "word" in the first field)
or:
awk '/^[[:space:]]*word/ {print}'
(this matches any line where optional leading whitespace is followed immediately by "word")
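To make the difference between the exact comparison and the regex match concrete, here is a minimal sketch. The sample lines and the `log`, `exact`, and `loose` names are made up for illustration, and "word" stands in for whatever keyword you are really after.

```shell
# Hypothetical sample log; "word" is a placeholder for the real keyword.
log=$(mktemp)
printf '%s\n' \
  '   word starts this line' \
  'word1 is a different first field' \
  'a line that mentions word later' > "$log"

# Exact comparison: the first field must be exactly "word".
exact=$(awk '$1 == "word" {print}' "$log")

# Regex match: the first field only has to contain "word".
loose=$(awk '$1 ~ "word" {print}' "$log")

printf 'exact:\n%s\n\nloose:\n%s\n' "$exact" "$loose"
rm -f "$log"
```

awk strips leading whitespace when it splits fields, so the indented line still has "word" as its first field. The exact form should print only that line, while the regex form should also pick up the "word1" line.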
On Mon, 5 Jan 2009, Joseph L. Casale wrote:
I need to review a logfile with Sed and cut out all the lines that start with a certain word, problem is this word begins after some amount of whitespace and unless I search for whitespace at the beginning followed by "word" I may encounter "word" somewhere legitimately hence why I don't just search for "word" only...
The regex you want is "^[[:space:]]*word"
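A minimal sketch of that regex in sed, since "cut out" could mean either extracting the matching lines or deleting them. The sample lines and variable names are invented for illustration.

```shell
# Hypothetical sample log; "word" is a placeholder for the real keyword.
log=$(mktemp)
printf '%s\n' \
  '    word at the start, after spaces' \
  'word at the very start' \
  'another line that mentions word' > "$log"

# Print only lines where optional leading whitespace is followed by "word".
kept=$(sed -n '/^[[:space:]]*word/p' "$log")

# Delete those lines instead, leaving everything else.
rest=$(sed '/^[[:space:]]*word/d' "$log")

printf 'matched:\n%s\n\nremaining:\n%s\n' "$kept" "$rest"
rm -f "$log"
```

The line that only mentions "word" later on is left alone in both cases, which is the point of anchoring the pattern with `^`.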
Wow, thanks everyone for the help! How does one modify this to also knock out lines that *must* have whitespace followed by a number [0-9]? I can do it using "^[[:space:]]*[0-9]" but it also takes out lines w/o whitespace that begin with numbers?
I have to buy a book on RegEx's and Sed :)
Thanks all! jlc
[0-9]? I can do it using "^[[:space:]]*[0-9]" but it also takes out lines w/o whitespace that begin with numbers?
to match one or more, use + instead of *.
* matches 0 or more, + matches 1 or more.
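A quick way to see the difference is to count matches for both forms against the same input. This is only a sketch: the sample lines are made up, and `grep -E` is used so that `+` is available as an extended-regex operator.

```shell
# Hypothetical sample input.
log=$(mktemp)
printf '%s\n' \
  '   42 indented number' \
  '42 number in column one' \
  '   text, not a number' > "$log"

# "*" allows zero whitespace characters, so both numeric lines match.
star=$(grep -E -c '^[[:space:]]*[0-9]' "$log")

# "+" requires at least one, so only the indented line matches.
plus=$(grep -E -c '^[[:space:]]+[0-9]' "$log")

echo "star=$star plus=$plus"
rm -f "$log"
```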
I have to buy a book on RegEx's and Sed :)
http://www.gnu.org/manual/gawk/gawk.pdf
(G)awk is pretty sh!t hot where I work; however we've extended it a bit. :)
to match one or more, use + instead of *.
* matches 0 or more, + matches 1 or more.
Thanks!
I have to buy a book on RegEx's and Sed :)
http://www.gnu.org/manual/gawk/gawk.pdf
(G)awk is pretty sh!t hot where I work; however we've extended it a bit. :)
So gawk does all that sed does and more? I suppose I can start with that in this case. I always wanted a book on regexes, so I think I am going to order O'Reilly's Mastering Regular Expressions, Third Edition. They also have a sed & awk, Second Edition book, but it's 10+ years old. Does that matter? Have sed/awk changed any since then?
Thanks everyone! jlc
So gawk does all that sed does and more? I suppose I can start with
Can't really answer that. In 15 years of using UNIX systems, I've never touched sed. :)
With Gawk's BEGIN and END blocks you can use it to write full programs, which is kind of nice.
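As a rough illustration of BEGIN and END blocks, here is a small sketch that counts the lines whose first field is "word". The sample input and the `count` and `summary` names are assumptions for the example only.

```shell
# Hypothetical sample input.
log=$(mktemp)
printf '%s\n' '  word one' 'other line' 'word two' > "$log"

summary=$(awk '
  BEGIN { count = 0 }         # runs once, before any input is read
  $1 == "word" { count++ }    # runs for each line whose first field is "word"
  END { print count " of " NR " lines start with word" }   # runs once, after the last line
' "$log")

echo "$summary"
rm -f "$log"
```

NR is awk's built-in record counter, so in the END block it holds the total number of lines read.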
that in this case, I always wanted a book on regexes so I think I am going to order O'Reilly's Mastering Regular Expressions, Third Edition. They also have a sed & awk, Second Edition book, but it's 10+ years old, does that matter, has sed/awk changed any since then?
The link I sent you is the 3rd edition of that book. Dated 2004. The book (Effective AWK Programming) is available completely free, but is also available in dead-tree editions. I printed and bound my PDF and saved a few dollars.
On Mon, 2009-01-05 at 13:40 -0700, Joseph L. Casale wrote:
to match one or more, use + instead of *.
* matches 0 or more, + matches 1 or more.
Thanks!
<snip>
So gawk does all that sed does and more? I suppose I can start with
Tons. You can write fairly complex programs with (g)awk. It can combine command line expressions and scripts from files, and it has formatted print capability, conditional execution, multiple regex selection capabilities and more.
A read of the man page would give you a lot of insight. Think of perl in an earlier form. The original awk was probably what inspired perl. That would be my guess.
Since (g)awk is regex based, what you learn for sed, vi(m), etc. is easily transferred into (g)awk, and vice-versa, to a limited degree.
that in this case, I always wanted a book on regexes so I think I am going to order O'Reilly's Mastering Regular Expressions, Third Edition. They also have a sed & awk, Second Edition book, but it's 10+ years old, does that matter, has sed/awk changed any since then?
The man pages will allow you to keep up easily once the fundamentals are in place. Of course, frequency of use affects that greatly.
Thanks everyone! jlc
<snip>
Joseph L. Casale wrote:
to match one or more, use + instead of *.
* matches 0 or more, + matches 1 or more.
Thanks!
I have to buy a book on RegEx's and Sed :)
http://www.gnu.org/manual/gawk/gawk.pdf
(G)awk is pretty sh!t hot where I work; however we've extended it a bit. :)
So gawk does all that sed does and more? I suppose I can start with that in this case, I always wanted a book on regexes so I think I am going to order O'Reilly's Mastering Regular Expressions, Third Edition. They also have a sed & awk, Second Edition book, but it's 10+ years old, does that matter, has sed/awk changed any since then?
Why not just start with perl which does more than sed/awk while using similar syntax (if you want)?
This is why:
awk '/^[[:space:]]*word/ {print}' logfile
vs
perl -ne 'if (/^\s*word/) { print $_; }' logfile
Which syntax is likely to be easier to remember?
Spiro Harvey wrote:
Why not just start with perl which does more than sed/awk while using similar syntax (if you want)?
This is why:
awk '/^[[:space:]]*word/ {print}' logfile
vs
perl -ne 'if (/^\s*word/) { print $_; }' logfile
Which syntax is likely to be easier to remember?
I never remember the awk syntax because if it is really that simple I'd use grep with its implied print. But it's almost never really that simple, and you end up needing things that are difficult in awk but easy in perl. Perl can use the posix names for character classes too if you like to type, and how can you forget the 'if (expression) {action}' syntax? Also, you could have omitted the $_ argument to print, since it is assumed, if you are looking for simplicity.
On Tue, Jan 06, 2009, Spiro Harvey wrote:
Why not just start with perl which does more than sed/awk while using similar syntax (if you want)?
This is why:
awk '/^[[:space:]]*word/ {print}' logfile
vs
perl -ne 'if (/^\s*word/) { print $_; }' logfile
Which syntax is likely to be easier to remember?
It depends entirely on what you want to do. For one-liners, sed, awk, grep, and pcregrep (grep using perl regular expression syntax, which is considerably more concise than [:space:] and friends) are often the best tools. For anything more complex, scripting languages such as python and perl are generally more flexible and easier to use.
I used to write some pretty complex shell and awk scripts before learning perl about 20 years ago. Perl allowed me to do most things in a single language, including fairly low-level system calls that I previously had to do with compiled ``C'' programs.
I have switched all of my new development primarily to python which I find far cleaner than perl, and easier to use for large projects. Python uses perl regular expression syntax so the transition was pretty painless.
Bill
Bill Campbell wrote:
I used to write some pretty complex shell and awk scripts before learning perl about 20 years ago. Perl allowed me to do most things in a single language, including fairly low-level system calls that I previously had to do with compiled ``C'' programs.
And you can probably still run all of your perl scripts unchanged, with the possible exception of "@array" being interpolated in double-quoted strings which I think started in perl4.
I have switched all of my new development primarily to python which I find far cleaner than perl, and easier to use for large projects. Python uses perl regular expression syntax so the transition was pretty painless.
Don't count on the same stability with python. It has an annoying habit of changing syntax in non-backwards compatible ways with no provision for running old scripts. If you run your programs on more than one machine you'll end up having to maintain different versions to match the installed interpreters.
Les Mikesell lesmikesell@gmail.com wrote:
Don't count on the same stability with python. It has an annoying habit of changing syntax in non-backwards compatible ways with no
You seem to be hell-bent (excuse the pun) on turning this into a jihad on scripting languages. Please take the credo of your own favoured religion, sorry, language into account: There's more than one way to do it.
Cope.
Spiro Harvey wrote:
Les Mikesell lesmikesell@gmail.com wrote:
Don't count on the same stability with python. It has an annoying habit of changing syntax in non-backwards compatible ways with no
You seem to be hell-bent (excuse the pun) on turning this into a jihad on scripting languages. Please take the credo of your own favoured religion, sorry, language into account: There's more than one way to do it.
Cope.
There are hard ways and easy ways. I tend to prefer the easy ways and thought others might too.
On Mon, Jan 05, 2009, Les Mikesell wrote:
Bill Campbell wrote:
I used to write some pretty complex shell and awk scripts before learning perl about 20 years ago. Perl allowed me to do most things in a single language, including fairly low-level system calls that I previously had to do with compiled ``C'' programs.
And you can probably still run all of your perl scripts unchanged, with the possible exception of "@array" being interpolated in double-quoted strings which I think started in perl4.
I think that was perl-5, but I may well be mistaken. I have found some changes in perl along the way that have required fixing scripts since I started in perl-3.something, but not many.
I have switched all of my new development primarily to python which I find far cleaner than perl, and easier to use for large projects. Python uses perl regular expression syntax so the transition was pretty painless.
Don't count on the same stability with python. It has an annoying habit of changing syntax in non-backwards compatible ways with no provision for running old scripts. If you run your programs on more than one machine you'll end up having to maintain different versions to match the installed interpreters.
I have not run into many (any) compatibility issues with python, but then I have only been doing python for a bit over 4 years now. As I remember, there were some issues with the early versions of python-2.4, but those were in the python builds, not in the syntax of python itself.
I tend to stay away from the more esoteric features of languages that are likely to change, so I don't generally have problems of this type.
We don't have problems with multiple versions of packages as we use the ones from the OpenPKG portable packaging system, which includes its own versions of python, perl, gcc, berkeley db, etc., avoiding most problems with the underlying distribution/vendor's packages. There were some issues when we moved to CentOS from SuSE, in that SuSE ran python-2.3.x while CentOS has python-2.4.x, which caused some interesting shared library issues with the OpenPKG python-2.4.x (which we are running for Zope compatibility, as the version of Zope we're running doesn't work with python-2.5.x).
Python-3 definitely has backwards compatibility issues, and there are lengthy explanations as to why this is so.
Bill
Bill Campbell wrote on Mon, 5 Jan 2009 16:02:29 -0800:
(which we are running for Zope compatibility as the version of Zope we're running doesn't work with python-2.5.x.
you did realize that this is another python compatibility issue, did you ;-)
Kai
On Tue, Jan 06, 2009, Kai Schaetzl wrote:
Bill Campbell wrote on Mon, 5 Jan 2009 16:02:29 -0800:
(which we are running for Zope compatibility as the version of Zope we're running doesn't work with python-2.5.x.
you did realize that this is another python compatibility issue, did you ;-)
True enough :-).
Bill
On Jan 5, 2009, at 2:56 PM, Joseph L. Casale wrote:
The regex you want is "^[[:space:]]*word"
Wow, thanks everyone for the help! How does one modify this to also knock out lines that *must* have whitespace followed by a number [0-9]? I can do it using "^[[:space:]]*[0-9]" but it also takes out lines w/o whitespace that begin with numbers?
^[[:space:]]+[[:digit:]]+
will hit numbers with one or more digits. to restrict the number of digits, use something like
^[[:space:]]+[[:digit:]]{2}[^[:digit:]]+
that, for example, should only hit lines that consist of at least one whitespace character, then exactly two digits, then at least one non-digit character.
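Here is a small sketch of that two-digit pattern run through `grep -E`, with made-up sample lines chosen to exercise each part of the regex. The variable names are illustrative only.

```shell
# Hypothetical sample input, one line per case.
log=$(mktemp)
printf '%s\n' \
  '  12 items' \
  '  123 items' \
  '12 items' \
  '  12' > "$log"

# Whitespace, exactly two digits, then at least one non-digit character.
hits=$(grep -E '^[[:space:]]+[[:digit:]]{2}[^[:digit:]]+' "$log")

printf '%s\n' "$hits"
rm -f "$log"
```

Only the first line should come back: the second has a third digit where a non-digit is required, the third has no leading whitespace, and the last has nothing after the two digits.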
-steve
-- If this were played upon a stage now, I could condemn it as an improbable fiction. - Fabian, Twelfth Night, III,v
On Mon, 5 Jan 2009, Joseph L. Casale wrote:
The regex you want is "^[[:space:]]*word"
Wow, thanks everyone for the help! How does one modify this to also knock out lines that *must* have whitespace followed by a number [0-9]? I can do it using "^[[:space:]]*[0-9]" but it also takes out lines w/o whitespace that begin with numbers?
Probably something like "^[[:space:]]+[0-9]"
-- though that assumes you're using gawk (since the + modifier is GNU-specific).
For non-GNU awks, "^[[:space:]][[:space:]]*[0-9]"
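The doubled-class spelling is also what works in POSIX basic regular expressions, such as plain sed without `-E`, where `+` is not an operator. A minimal sketch comparing the two spellings on invented sample lines:

```shell
# Hypothetical sample input.
log=$(mktemp)
printf '%s\n' '  7 indented' '7 flush left' '   x7 not a number' > "$log"

# "xx*" means one x followed by zero or more x, which is the same as "x+".
bre=$(sed -n '/^[[:space:]][[:space:]]*[0-9]/p' "$log")

# The extended-regex spelling of the same pattern.
ere=$(grep -E '^[[:space:]]+[0-9]' "$log")

printf 'bre: %s\nere: %s\n' "$bre" "$ere"
rm -f "$log"
```

Both commands should select the same single indented line.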
I have to buy a book on RegEx's and Sed :)
Good idea!
What about:
perl -ne 'if (/^\s*word/) { print $_; }' logfile
any others?
On Mon, Jan 5, 2009 at 11:45 AM, Joseph L. Casale JCasale@activenetwerx.com wrote:
I need to review a logfile with Sed and cut out all the lines that start with a certain word, problem is this word begins after some amount of whitespace and unless I search for whitespace at the beginning followed by "word" I may encounter "word" somewhere legitimately hence why I don't just search for "word" only...
Anyone know how to make sed accomplish this?
Thanks! jlc
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos