0

I have a regex that picks up everything between curly brackets (inclusive)

Regex101 example here: https://regex101.com/r/isEbes/1

(?={{)(.*?)}}

Nemo enim ipsam voluptatem quia {{ voluptas }}  labore et {{ dolore }} eum 
<a href="{{iure}}">reprehenderit</a> qui <a href="{{ news }}">{{smart}}</a> ea."

This returns {{voluptas}} {{dolore}} {{iure}} {{news}} and {{smart}}

However, I want to exclude everything in href="", so in this case {{iure}} and {{news}} should not be included.

How can I do this? E.g. I've tried [^"](?={{)(.*?)}}[^"] but this still captures href="{{ item }} something elese" https://regex101.com/r/xVDesH/1

alias51
  • 8,178
  • 22
  • 94
  • 166
  • What have you tried to meet this new requirement? Where are you getting stuck? – esqew Oct 09 '21 at 18:02
  • Yes, for example: `((?!.*href="{{.*).)(?!href)(?={{)(.*?)}}` but this only excludes the first result – alias51 Oct 09 '21 at 18:05
  • I have tried `[^"](?={{)(.*?)}}[^"]` an it seems to match, does it ? Or something is missing in your question :) – Philippe Oct 09 '21 at 18:05
  • @Philippe that matches the character after too: https://regex101.com/r/Zryy9l/1 – alias51 Oct 09 '21 at 18:06
  • Also [^"](?={{)(.*?)}}[^"] doesn't exclude results if not immediately bound by quotes https://regex101.com/r/9n25yf/1 – alias51 Oct 09 '21 at 18:07
  • Oops, sorry for that :D – Philippe Oct 09 '21 at 18:08
  • 1
    The next step is you'll want to handle nested quotes or escaped quotes or nested brackets or ... Use a parser instead, regex is painful and ultimately unable to handle complex variations which are natural and obvious with a different frame of mind. – tripleee Oct 09 '21 at 18:09
  • @tripleee, thanks, there are no nested quotes. It's just excluding anything in an `href=` – alias51 Oct 09 '21 at 18:10

1 Answers1

1

Well, you could use a negative lookbehind, for example

const text = `
Nemo enim ipsam voluptatem quia {{ voluptas }}  labore et {{ dolore }} eum 
<a href="{{iure}}">reprehenderit</a> qui <a href="{{ news }}">{{smart}}</a> ea."
`;

console.log( text.match(/(?<!href="[^"]*){{.*?}}/g) );

If the language you are using does not support variable-width lookbehind then just (?<!href=") would work for your example.

Using regex for this kind of thing is not fool-proof, but it may be good enough.

MikeM
  • 13,156
  • 2
  • 34
  • 47