
In a POSIX shell script, I need to find all occurrences of text enclosed within {{ and }} and replace the text along with the surrounding braces with an asterisk.

For example, if the input is

{{ abc }} def {{ ghi {jkl} mno }} pqr

then the output must be

* def * pqr

I have not been able to come up with a sed command for this that works.

I tried a couple of commands but they don't work. For example, the following command does not produce the desired output because sed does greedy matching. It ends up matching {{ abc }} def {{ ghi {jkl} mno }} as the first match instead of just {{ abc }}.

$ echo "{{ abc }} def {{ ghi {jkl} mno }} pqr" | sed 's/{{.*}}/*/g'
* pqr

Here is another example that does not work because it ends up matching too little. It does not match {{ ghi {jkl} mno }} (which we want to match) because this part of the string contains } within it.

$ echo "{{ abc }} def {{ ghi {jkl} mno }} pqr" | sed 's/{{[^}]*}}/*/g'
* def {{ ghi {jkl} mno }} pqr

How else can I do such a match?

I have gone through "Non greedy regex matching in sed?" but the solutions there don't help, because here I want to match everything between {{ and }} except a specific sequence of two consecutive characters, i.e. }}. If I were trying to match everything between the delimiters except a single character, the answers to that question would have helped.

Lone Learner

1 Answer


If you had a regular expression exp that matched any text not containing "}}", you could use it as "{{" exp "}}". Unfortunately sed doesn't have a complement regexp operator, although some regexp implementations do. Since the complement of a regular language is itself regular, we know such an expression exists; we just have to construct it manually.

In a more readable format than sed, something close is "{{" ( [^}]* ( "}" [^}] )? )* "}}".

In proper sed that is:

$ echo "{{ abc }} def {{ ghi {jkl} mno }} pqr" \
    | sed 's/{{\([^}]*\(}[^}]\)\?\)*}}/*/g'
* def * pqr
$
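
One caveat with the command above: \? is a GNU extension to basic regular expressions, so it may not work with every POSIX sed. As a sketch of a more portable variant, the same idea can be written as "zero or more runs that end in a } followed by a non-} character, then a final run with no } at all", which needs only the * operator. The function name replace_double_braces is made up for this illustration.

```sh
# Hypothetical helper: replace each {{ ... }} span with a single asterisk.
# Uses only POSIX BRE constructs (no \?, which is a GNU extension).
# \([^}]*}[^}]\)*  any number of runs ending in a lone } plus one non-} char
# [^}]*            a final run that contains no } at all
replace_double_braces() {
    sed 's/{{\([^}]*}[^}]\)*[^}]*}}/*/g'
}

echo "{{ abc }} def {{ ghi {jkl} mno }} pqr" | replace_double_braces
# prints: * def * pqr
```

Like the original expression, this stops at the first }} after each {{, so it does not attempt to balance braces.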

This may not be exactly what you want, depending on whether or not you expect three braces in a row. What should happen with this input: abc {{ def { ghi }}}? The expression above stops at the first }} and leaves the third } behind. If you actually need to balance braces, the problem moves out of the realm of regular languages and into context-free languages, which require more powerful tools.
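
If balanced braces are what you need, one possible approach is to count brace depth by hand in awk, which is available in a POSIX environment. The following is only a sketch under assumptions that the question does not state: a span opens at {{ (depth 2), every further { or } adjusts the depth, the span ends when the depth returns to zero, and an unterminated span is silently dropped. The function name replace_balanced_braces is made up for this illustration.

```sh
# Hypothetical helper: replace each balanced {{ ... }} span with an asterisk.
# Assumption: nesting is tracked by counting single braces inside a span.
replace_balanced_braces() {
    awk '{
        out = ""; depth = 0; n = length($0); i = 1
        while (i <= n) {
            c = substr($0, i, 1)
            if (depth == 0) {
                if (substr($0, i, 2) == "{{") {
                    depth = 2; i += 2        # entering a span
                } else {
                    out = out c; i++         # ordinary text is copied as-is
                }
            } else {
                if (c == "{") depth++
                else if (c == "}") depth--
                if (depth == 0) out = out "*"   # span closed, emit one asterisk
                i++
            }
        }
        print out
    }'
}

echo "abc {{ def { ghi }}}" | replace_balanced_braces
# prints: abc *
```

For the input in the question this gives the same result as the sed command, since the inner {jkl} is already balanced.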

Given your user name you might want to read a book on formal languages and automata theory. It may be "old" technology, but it is very powerful and used all day long by all kinds of technology.

JimD.