I'm trying to replace an exact string that contains brackets. For example, say I want to replace `a[aa]` with `bbb`.

I had used the following regex:

sed  's|\<a\[aa]\>|bbb|g' testfile

but it doesn't seem to work. This may be something really basic, but I haven't been able to get it working, so I would appreciate any help.

Wiktor Stribiżew
dperezg
  • You may try `sed 's|\ – Wiktor Stribiżew Feb 28 '20 at 19:13
  • sorry, my regex is wrong, i have tried something like sed -i "s|\|bb|g" testfile – dperezg Feb 28 '20 at 19:13
  • yeah, i read that ] should not be escaped, but even in that way it doesn't work – dperezg Feb 28 '20 at 19:14
  • See https://ideone.com/RQVwaK – Wiktor Stribiżew Feb 28 '20 at 19:15
  • So, is there a way of having a word boundary with this kind of strings? – dperezg Feb 28 '20 at 19:21
  • in my case, this is just part of a line, and i would like to replace only the a[aa] part, and not the whole line – dperezg Feb 28 '20 at 19:22
  • But what is your definition of the right-hand boundary? Do you mean there must be whitespace or end of string? – Wiktor Stribiżew Feb 28 '20 at 19:26
  • there must be a whitespace – dperezg Feb 28 '20 at 21:22
  • [edit] your question to include concise, testable sample input and expected output. Show the strings you want to match **in context**, surrounded by similar strings that you do **not** want to match, as that's the hard part of your question. See also [is-it-possible-to-escape-regex-metacharacters-reliably-with-sed](https://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed), as that matters if the string you want to match can contain RE metachars or sed regexp delimiter chars. Also tell us which sed version you're using. – Ed Morton Feb 29 '20 at 15:07

1 Answer

You need to remove the trailing `\>`. In GNU sed it matches only at the end of a word, that is, right after a letter, digit or `_`. The `]` char is none of those, so `]\>` can never match.

sed 's|\<a\[aa]|bbb|g' file

See the online sed demo:

s="say: a[aa] to bbb, not ba[aa]"
sed 's|\<a\[aa]|bbb|g' <<< "$s"
# => say: bbb to bbb, not ba[aa]
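
To see the difference side by side, here is a small sketch that runs the pattern from the question and the fixed one on the same input. It assumes GNU sed (`\<` and `\>` are GNU extensions), and the variable names are only for illustration:

```shell
s='say: a[aa] to bbb, not ba[aa]'

# Pattern from the question: the trailing \> cannot match after ],
# so nothing is replaced and the line passes through unchanged.
with_boundary=$(printf '%s\n' "$s" | sed 's|\<a\[aa]\>|bbb|g')

# Fixed pattern: only the leading \< is kept.
without_boundary=$(printf '%s\n' "$s" | sed 's|\<a\[aa]|bbb|g')

printf '%s\n' "$with_boundary"      # say: a[aa] to bbb, not ba[aa]
printf '%s\n' "$without_boundary"   # say: bbb to bbb, not ba[aa]
```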

You may also require a non-word char (or the start/end of the string) on each side, capture it with a group, and put it back with a backreference:

sed -E 's~([^_[:alnum:]]|^)a\[aa]([^_[:alnum:]]|$)~\1bbb\2~g' file

Here, `([^_[:alnum:]]|^)` captures any non-word char or the start of the string into Group 1, `([^_[:alnum:]]|$)` matches and captures into Group 2 any char other than `_`, a digit or a letter (or the end of the string), and the `\1` and `\2` placeholders restore these values in the result. This, however, does not allow consecutive matches, so you may still use `\<` before `a` to play it safe: `sed -E 's~\<a\[aa]([^_[:alnum:]]|$)~bbb\1~g' file`.
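
The consecutive-match limitation can be reproduced with two adjacent occurrences separated by a single space. This is a minimal sketch, assuming GNU sed; the input string and variable names are made up for the demonstration:

```shell
s='a[aa] a[aa]'

# Two-group version: the space after the first match is consumed as Group 2,
# so it is no longer available as the leading boundary of the second a[aa].
both_groups=$(printf '%s\n' "$s" |
  sed -E 's~([^_[:alnum:]]|^)a\[aa]([^_[:alnum:]]|$)~\1bbb\2~g')

# \< is zero-width, so nothing is consumed in front of each match.
leading_anchor=$(printf '%s\n' "$s" |
  sed -E 's~\<a\[aa]([^_[:alnum:]]|$)~bbb\1~g')

printf '%s\n' "$both_groups"      # bbb a[aa]
printf '%s\n' "$leading_anchor"   # bbb bbb
```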

To enforce whitespace boundaries you may use

sed -E 's~([[:space:]]|^)a\[aa]([[:space:]]|$)~\1bbb\2~g' file

Or, in your case, just a trailing whitespace boundary seems to be enough:

sed -E 's~\<a\[aa]([[:space:]]|$)~bbb\1~g' file
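
As a rough comparison of the two trailing boundaries, the sketch below runs the non-word version and the whitespace version on the same line. It assumes GNU sed, and the sample input is invented for illustration. Only the whitespace version leaves `a[aa],` alone:

```shell
s='x a[aa] y a[aa], z a[aa]'

# Trailing boundary = any non-word char or end of line: the comma qualifies.
nonword=$(printf '%s\n' "$s" | sed -E 's~\<a\[aa]([^_[:alnum:]]|$)~bbb\1~g')

# Trailing boundary = whitespace or end of line: the comma does not qualify.
ws_only=$(printf '%s\n' "$s" | sed -E 's~\<a\[aa]([[:space:]]|$)~bbb\1~g')

printf '%s\n' "$nonword"   # x bbb y bbb, z bbb
printf '%s\n' "$ws_only"   # x bbb y a[aa], z bbb
```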
Wiktor Stribiżew