[CentOS] OT: Script Help

Sun May 19 13:03:47 UTC 2013
Larry Martell <larry.martell at gmail.com>

On Sat, May 18, 2013 at 6:31 PM, James Pifer <jep at obrien-pifer.com> wrote:
> On 5/18/2013 3:23 PM, Larry Martell wrote:
>> On Sat, May 18, 2013 at 1:15 PM, James Pifer <jep at obrien-pifer.com> wrote:
>>> Sorry for the off topic, but I don't have a better resource. I'm not great at
>>> scripting, but need a quick script to modify a file.
>>>
>>> I have a long file that has lines like this:
>>>
>>> some text
>>> some text2
>>> CN=DATA.OU=XYZ.O=CO
>>> some text3
>>> some text4
>>>
>>> And this repeats, but XYZ changes. "DATA" is always called data. (it's
>>> being renamed basically)
>>>
>>> I need to change the middle line but leave the rest of the file as is
>>> like this:
>>>
>>> some text
>>> some text2
>>> CN=XYZ_DATA.OU=XYZ.O=CO
>>> some text3
>>> some text4
>>>
>>> Anyone know a quick way to do this? Any help is appreciated.
>> cat file | sed -e's/CN=DATA.OU=\(.*\)\.O=CO/CN=\1_DATA.OU=\1.O=CO/'
> Larry,
>
> Thanks for the answer. Still having trouble making it work. Been looking
> at sed for the last two hours. Let me give a specific example of a few
> lines I would want to change:
>
> Let's say my original lines are:
> CN=DATA.OU=XYZ.O=CO
> CN=DATA.OU=XYY.OU=MEM.O=CO
> CN=DATA.OU=XZZ.OU=OOP.O=CO
>
> I want them to look like:
> CN=XYZ_DATA.OU=XYZ.O=CO
> CN=XYY_DATA.OU=XYY.OU=MEM.O=CO
> CN=XZZ_DATA.OU=XZZ.OU=OOP.O=CO
>
> So I need to take the data after the FIRST OU and stick it in front of DATA
> with an _ in between. The rest of the line then remains the same.
>
> Hope it makes sense. Appreciate the help!

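sed only does greedy matching, so the \(.*\) in my one-liner runs all the
way to the last .O=CO and picks up the extra OU parts. You could work
around that in sed with [^.]* in place of .*, but I'd do this in python,
where a non-greedy match is available. Something like this: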

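import re, sys

# Non-greedy (.*?) captures only the first OU value, up to the next dot.
pattern = re.compile(r'^(CN=)(DATA\.OU=)(.*?)(\..*$)')

for path in sys.argv[1:]:  # skip the script name itself
    with open(path, 'r') as fh:
        for line in fh:
            line = line.rstrip('\n')  # keep any leading whitespace as is
            match = pattern.match(line)
            if match:
                # CN= + XYZ + _ + DATA.OU= + XYZ + rest of the line
                print(match.group(1) + match.group(3) + '_' +
                      match.group(2) + match.group(3) + match.group(4))
            else:
                print(line)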

When I run that with your input I get your desired output.
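
For what it's worth, here is a shorter sketch of the same idea using re.sub with a negated character class instead of a non-greedy group. The names GREEDY, FIRST_OU and rename_cn are made up for illustration, and it assumes the first OU value never contains a dot. GREEDY is only there to show what the greedy version captures.

```python
import re

# Greedy: (.*) runs to the LAST ".O=CO" on the line.
GREEDY = re.compile(r'^CN=DATA\.OU=(.*)\.O=CO$')

# Negated class: ([^.]*) stops at the first dot, so it captures
# only the first OU value (assumed to contain no dots).
FIRST_OU = re.compile(r'^CN=DATA\.OU=([^.]*)')


def rename_cn(line):
    """Prefix DATA with the first OU value; other lines pass through unchanged."""
    return FIRST_OU.sub(r'CN=\1_DATA.OU=\1', line)


if __name__ == '__main__':
    samples = [
        'some text',
        'CN=DATA.OU=XYZ.O=CO',
        'CN=DATA.OU=XYY.OU=MEM.O=CO',
        'CN=DATA.OU=XZZ.OU=OOP.O=CO',
    ]
    for sample in samples:
        print(rename_cn(sample))
```

Lines that don't start with CN=DATA.OU= come back untouched, so the rest of the file is left as is.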