
I have the following string:

Lorem ipsum {{ var1 }} dolar epsum <a href="{{ var2 }}">solar</a> hello goodbye. {{var3}} var {{ var4 }}

How do I capture all occurrences of {{ var }} except those enclosed in double quotes, such as "{{ var2 }}"?

I have tried (?!")({{.*?}}) but it still captures all instances of {{ var }}.

Example here: https://regex101.com/r/zfjSI7/1

alias51
  • This is really hard to do with regular expressions. If you have multiple sets of quotes, then everything is between two of them except the parts before the first quote and after the last quote. – Barmar Sep 28 '22 at 22:21
  • The first one should be a lookbehind, and then append a negative lookahead at the end: `(?<!"){{.*?}}(?!")` – The fourth bird Sep 28 '22 at 22:22
  • What about negative on `href="{{ var }}"` somehow? All the vars I want to exclude are inside html anchors – alias51 Sep 28 '22 at 22:22

1 Answer

As mentioned in the comments, you need a lookbehind, not a lookahead. Further, if there are no { } inside the placeholders, a negated character class is more efficient than a lazy dot and cannot run past the closing braces: (?<!"){{[^}{]+}}
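For illustration, here is a minimal sketch of that pattern with Python's built-in re module, run against the string from the question. The names TEXT, NOT_AFTER_QUOTE and find_unquoted are made up for this example, and the braces are escaped only for readability.

```python
import re

# Sample text from the question.
TEXT = ('Lorem ipsum {{ var1 }} dolar epsum <a href="{{ var2 }}">solar</a> '
        'hello goodbye. {{var3}} var {{ var4 }}')

# (?<!") rejects a match that starts right after a double quote;
# [^}{]+ cannot cross a brace, so the match stays inside one placeholder.
NOT_AFTER_QUOTE = re.compile(r'(?<!")\{\{[^}{]+\}\}')


def find_unquoted(text):
    """Return placeholders that are not directly preceded by a double quote."""
    return NOT_AFTER_QUOTE.findall(text)


print(find_unquoted(TEXT))  # ['{{ var1 }}', '{{var3}}', '{{ var4 }}']
```

Note that this only looks at the single character before the opening braces, so something like " {{ var }}" with a space after the quote would still match.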


If you're on PHP/PCRE, or on Python with the PyPI regex module, you can skip unwanted parts by using the verbs (*SKIP)(*F).

Skip quoted parts:
"[^"]*"(*SKIP)(*F)|{{[^}{]+}}
Or skip anchors:
<a\b[^><]*>(*SKIP)(*F)|{{[^}{]+}}
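Python's built-in re module does not support these verbs. As a rough stand-in (a different technique, not the verbs themselves), you can match the unwanted part in one alternative and capture the wanted part in another, then keep only the captures. A minimal sketch, where SAMPLE, SKIP_QUOTED and placeholders_outside_quotes are names invented for this example:

```python
import re

# Sample text from the question.
SAMPLE = ('Lorem ipsum {{ var1 }} dolar epsum <a href="{{ var2 }}">solar</a> '
          'hello goodbye. {{var3}} var {{ var4 }}')

# The first alternative consumes a whole quoted string without capturing;
# the second alternative captures a placeholder into group 1.
SKIP_QUOTED = re.compile(r'"[^"]*"|(\{\{[^}{]+\}\})')


def placeholders_outside_quotes(text):
    """Collect group 1 only; it is None when the quoted alternative matched."""
    return [m.group(1) for m in SKIP_QUOTED.finditer(text) if m.group(1)]


print(placeholders_outside_quotes(SAMPLE))
```

Swapping the first alternative for <a\b[^><]*> would skip anchor tags instead of quoted strings.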


Another idea is to use a negative lookahead to check that the match is outside angle brackets:
{{[^}{]+}}(?![^><]*>)
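A quick sketch of this lookahead variant using Python's built-in re module; INPUT, OUTSIDE_TAG and placeholders_outside_tags are hypothetical names for this example only.

```python
import re

# Sample text from the question.
INPUT = ('Lorem ipsum {{ var1 }} dolar epsum <a href="{{ var2 }}">solar</a> '
         'hello goodbye. {{var3}} var {{ var4 }}')

# (?![^><]*>) fails when a '>' can be reached without first passing a '<'
# or another '>', which is what happens inside an open tag.
OUTSIDE_TAG = re.compile(r'\{\{[^}{]+\}\}(?![^><]*>)')


def placeholders_outside_tags(text):
    """Return placeholders that do not sit inside <...>."""
    return OUTSIDE_TAG.findall(text)


print(placeholders_outside_tags(INPUT))
```

One caveat: a literal > later in plain text, with no < before it, also makes the lookahead fail, so this assumes angle brackets only appear as tag delimiters.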

bobble bubble