
I want to scan some HTML for specific text (let's say "foo"), but ignore any occurrence that sits inside a comment.

So say I had some HTML like this:

<div id="foo"> Some foo here
<!-- this is a foo comment

-->
And finally some more foo

It should find all the foos EXCEPT the one inside the comment.

I've played with negative lookaheads but for the life of me cannot get it to work...

Any regex gurus out there?

I know some folks will suggest using an HTML parser, but I want to stay away from that...

Thanks in advance...

nyrsimon
  • I recommend reading [Why you shouldn't parse HTML with regexp](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – Krease Oct 06 '14 at 01:16
  • On a side note, why do you want to stay away from an HTML parser? It's the right tool for the job. – Krease Oct 06 '14 at 01:25
  • Basically because I'm not trying to parse HTML; I just want to identify whether a specific string exists outside a comment. I don't care about the rest of the HTML, whether it's formed correctly, or its contents. Seems like a perfect use of regex in one line of code, no? The fact that it is HTML is almost inconsequential. If I change the question to "can I identify if foo is not between quotes", does that make sense? – nyrsimon Oct 06 '14 at 03:11
  • That would definitely be a much more feasible question :) Be sure to attempt it yourself first, and if you can't solve it, include that information to demonstrate what you've tried and where you ran into problems. – Krease Oct 06 '14 at 04:16

1 Answer

Regex is not recommended for this, but if you want one:

foo(?![^>]*-->)
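As a rough check, the pattern can be run against the sample from the question. This is a minimal Python sketch, not part of the original answer. It suggests the pattern skips the commented foo but also skips "Some foo here", because no `>` separates that foo from the `-->` that follows. The second pattern (`variant_pattern`, a hypothetical variant) instead rejects a foo only when a `-->` follows before any new `<!--` opens. It assumes comments are properly closed and not nested.

```python
import re

# Sample HTML from the question.
html = '''<div id="foo"> Some foo here
<!-- this is a foo comment

-->
And finally some more foo'''

# Pattern from the answer: "foo" not followed by "-->" before the next ">".
answer_pattern = re.compile(r'foo(?![^>]*-->)')

# Hypothetical variant (an assumption, not from the answer):
# "foo" not followed by "-->" unless a "<!--" appears first.
variant_pattern = re.compile(r'foo(?!(?:(?!<!--)[\s\S])*-->)')

def matched_lines(pattern, text):
    """Return the 1-based line number of each match in text."""
    return [text.count('\n', 0, m.start()) + 1 for m in pattern.finditer(text)]

answer_lines = matched_lines(answer_pattern, html)
variant_lines = matched_lines(variant_pattern, html)

print(answer_lines)   # [1, 5]: misses "Some foo here" on line 1
print(variant_lines)  # [1, 1, 5]: every foo outside the comment
```

If a single expression is not a hard requirement, stripping comments first with something like `re.sub(r'<!--[\s\S]*?-->', '', html)` and then searching for foo is usually easier to reason about.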


walid toumi