
I'm on Linux (CentOS) and I'm trying to do a replacement in text that looks like this:

 This, formatting | is, 123gh234ee2, {absolutely}, [ positively | obnoxious | in ], {every}, [ {single} | {way} ],, Thanks | For your | Help!

I want to replace every pipe (|) with a semicolon, but only the pipes inside [ ]. The result should be:

 This, formatting | is, 123gh234ee2, {absolutely}, [ positively ; obnoxious ; in ], {every}, [ {single} ; {way} ],, Thanks | For your | Help!

I've tried several expressions, but the one I think should work doesn't. Can anyone explain why?

sed -i 's/(?<=\[)(\|)(?=\])/;/g' 'myFile.txt'

My idea was to do a lookbehind for the [ with

(?<=\[)

Do a lookahead for the ] with

(?=\])

And capture the pipes with

(\|)

However, nothing in my file changes, and I can't put my finger on what's wrong.

Thanks!

To clarify, I've also tried the perl method of

cat '/myFile.txt' | perl -ne 's/(?<=\[)(\|)(?=\])/xxxxx/g; print;'

And still do not get a changed result.

Jibril

1 Answer


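Your lookbehind and lookahead each match a single character directly next to the pipe, so the Perl version only matches if your input text contains exactly [|]. The sed version cannot work at all, because sed's regular expressions (POSIX basic and extended) have no lookaround syntax.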

In theory, you want your lookbehind to be something like (?<=\[.*) but the reality is that most engines don't support unbounded-length lookbehind.

You could use a sed {command ; block } to implement looping, appending various segments of the line to the internal buffer one at a time, then emitting the entire line once matching stops.
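As a rough sketch of the looping idea: instead of collecting segments in the hold buffer, this simpler variant repeats one substitution until it stops matching. It assumes brackets are never nested and that every [ has its closing ] on the same line. The function name fix_pipes_sed is made up for illustration.

```shell
# Hypothetical helper: replace | with ; only inside [ ... ].
# Assumes non-nested brackets that open and close on the same line.
fix_pipes_sed() {
  # :a      label to loop back to
  # s/.../  find a [ followed by characters that are neither ] nor |,
  #         then a |, and turn that one | into ;
  # ta      jump back to :a if the substitution changed anything
  sed -e ':a' -e 's/\(\[[^]|]*\)|/\1;/' -e 'ta'
}

printf '%s\n' 'x | [ a | b | c ] | y' | fix_pipes_sed
# prints: x | [ a ; b ; c ] | y
```

To change a file in place with GNU sed, the same three -e expressions can be passed to sed -i followed by the file name.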

A better idea, IMO, would be to switch to a language that would let you use the brackets to divide the text.

You could use awk, perl, or python, for example, to grab the text between [] and then process it separately. These would not be regular expressions, but small scripts.
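A small awk script along these lines might look like the sketch below. It walks each line, pulls out one [ ... ] segment at a time, replaces the pipes in that segment only, and stitches the line back together. The function name fix_pipes_awk and the variable names are illustrative, and non-nested brackets are assumed.

```shell
# Hypothetical helper: process each bracketed segment separately with awk.
fix_pipes_awk() {
  awk '{
    out = ""      # part of the line already processed
    rest = $0     # part of the line still to scan
    while (match(rest, /\[[^]]*\]/)) {
      seg = substr(rest, RSTART, RLENGTH)    # the [ ... ] segment
      gsub(/[|]/, ";", seg)                  # replace pipes in the segment only
      out = out substr(rest, 1, RSTART - 1) seg
      rest = substr(rest, RSTART + RLENGTH)  # continue after the segment
    }
    print out rest
  }'
}

printf '%s\n' 'x | [ a | b ] | y' | fix_pipes_awk
# prints: x | [ a ; b ] | y
```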

Finally, another option might be to first put a newline and a special tag before each open bracket, and a newline after each close bracket:

sed -e 's/\[/\n@[/g' -e 's/]/]\n/g'

This would put each bracketed segment on its own line, so you could follow that with a pattern-addressed, line-wide replace:

sed -e '/^@\[/s/|/;/g' # On lines starting with @[ replace | with ; (an unescaped | is a literal pipe in sed's basic regex)

Now you have to glue the lines back together, which you can find here
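One possible way to do the gluing, as a sketch that assumes GNU sed (for \n in the replacement text): read the whole stream into the pattern space, then delete the newline-plus-tag before each [ and the newline after each ]. The function name fix_pipes_split is hypothetical, and the last stage holds the entire input in memory, so it suits modest file sizes.

```shell
# Hypothetical helper chaining the three steps: split, replace, glue.
# Assumes GNU sed.
fix_pipes_split() {
  # 1. Put each [ ... ] segment on its own line, tagged with @.
  sed -e 's/\[/\n@[/g' -e 's/]/]\n/g' |
    # 2. Replace pipes only on the tagged lines.
    sed -e '/^@\[/s/|/;/g' |
    # 3. Slurp all lines (:a, N, $!ba), then undo the split from step 1.
    sed -e ':a' -e 'N' -e '$!ba' -e 's/\n@\[/[/g' -e 's/]\n/]/g'
}

printf '%s\n' 'x | [ a | b ] | y' | fix_pipes_split
# prints: x | [ a ; b ] | y
```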

aghast