I have a regex that should match a certain pattern. However, I don't want it to match when the pattern appears inside an HTML comment block.

What I have currently is:

(?<!<!--)pattern(?!-->)

However, that only works when the pattern sits directly between the comment delimiters, not in a case like:

<!-- foo pattern -->

But if I do:

(?<!<!--.*)pattern(?!-->)

then this case doesn't work:

<!-- some commented out stuff --> pattern

I think it would work if I could express (everything except -->)*? inside the negative lookbehind, but I'm unsure of the proper syntax or whether that's allowed.
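For readers who want to reproduce the first failure, here is a minimal sketch in Python. The choice of Python is an assumption, since the question does not name a regex engine. It runs the first expression against the comment cases mentioned above. Note that Python's `re` module only accepts fixed-width lookbehind, so the second expression, with `.*` inside the lookbehind, cannot be compiled there at all.

```python
import re

# First attempt from the question: the lookarounds only inspect the text
# immediately adjacent to "pattern".
adjacent_only = re.compile(r"(?<!<!--)pattern(?!-->)")

samples = [
    "<!--pattern-->",                             # delimiters touch the pattern
    "<!-- foo pattern -->",                       # inside a comment, with padding
    "<!-- some commented out stuff --> pattern",  # outside the comment
]

# Map each sample to whether the expression finds a match in it.
results = {text: bool(adjacent_only.search(text)) for text in samples}

for text, matched in results.items():
    print(f"{text!r}: {'match' if matched else 'no match'}")
```

Only the first sample is rejected. The second one matches even though the pattern is inside a comment, which is the problem described above.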

  • `(?<!)` might work then. What you ask for is ``(?<!(?:(?!).)*)pattern(?!(?:(?!).)*-->)`` – Wiktor Stribiżew Jun 14 '19 at 21:32
  • Use a real HTML parser. [See also](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags). – melpomene Jun 14 '19 at 21:52
  • When you say `between comment blocks`, if you have 400 comment blocks or 2, all it takes is to have a comment block at the top, and one at the bottom. –  Jun 14 '19 at 22:04
  • No matter how you try, there is only one way to do this, replace the first and last comment, and everything between with nothing, then try a new regex with your pattern on what's left. –  Jun 14 '19 at 22:07
  • You can't just say `(?:.*)?.*?pattern`; it finds pattern no matter what. You could match both, to move past comments, like this: `(?:.*)|(pattern)`, then see if group 1 matched. It's the only way. –  Jun 14 '19 at 22:12
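The idea in the last comment (match comments and the pattern in one alternation, then check whether group 1 took part) can be sketched as follows. This is an illustration in Python, not the commenter's exact expression. The comment branch `<!--.*?-->`, the `PATTERN` placeholder, and the helper name `find_outside_comments` are all assumptions made for this example.

```python
import re

# Hypothetical stand-in for the real expression being searched for.
PATTERN = r"pattern"

# Branch 1 consumes a whole comment so the scan moves past it;
# branch 2 is the only one with a capture group.
COMMENT_OR_PATTERN = re.compile(r"<!--.*?-->|(" + PATTERN + r")", re.DOTALL)


def find_outside_comments(text):
    """Return the PATTERN matches that are not inside <!-- ... -->."""
    return [
        m.group(1)
        for m in COMMENT_OR_PATTERN.finditer(text)
        if m.group(1) is not None  # group 1 is None when a comment matched
    ]


print(find_outside_comments("<!-- foo pattern -->"))                       # []
print(find_outside_comments("<!-- some commented out stuff --> pattern"))  # ['pattern']
```

Because the comment branch is tried first at each position, any pattern inside a closed comment is swallowed along with the comment. An unterminated `<!--` is not treated as a comment in this sketch.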

1 Answer

My guess is that your original expression is fine with a small modification. We might want an expression similar to the following, which matches the text of a comment that contains pattern:

(?<=<!--).*pattern.*(?=-->)

If we wish to capture (or explicitly not capture) the text around pattern, these variants might be of interest:

(?<=<!--).*(pattern).*(?=-->)
(?<=<!--)(.*pattern.*)(?=-->)
(?<=<!--)(.*)(pattern)(.*)(?=-->)
(?<=<!--)(?:.*)(pattern)(?:.*)(?=-->)
Emma
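As a quick check of what the expressions in this answer match, here is a small Python sketch (Python is an assumption, as the thread does not name an engine). It uses the variant with one capture group on the question's two inputs, plus a third, made-up input with two comments on one line.

```python
import re

# Variant from the answer with the pattern in a capture group.
inside_comment = re.compile(r"(?<=<!--).*(pattern).*(?=-->)")

# Pattern inside a comment: the whole comment body is matched.
m1 = inside_comment.search("<!-- foo pattern -->")
print(repr(m1.group(0)))  # ' foo pattern '

# Pattern after a closed comment: no "-->" follows it, so there is no match.
m2 = inside_comment.search("<!-- some commented out stuff --> pattern")
print(m2)  # None

# Hypothetical input: pattern sits between two comments, yet it still matches,
# because .* is free to run across the first "-->".
m3 = inside_comment.search("<!-- a --> pattern <!-- b -->")
print(repr(m3.group(0)))  # ' a --> pattern <!-- b '
```

So these expressions find the pattern when it is inside a comment, the inverse of what the question asks for, and the third case suggests they can misfire when a line holds more than one comment.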