-2

What I want to accomplish:

I want to match certain explicit content outside of comments.

An Example:

<div>
    <div>Hello $world$</div>
    <div>Another text <!-- $example$--></div>   
</div>
<div>
    How are $you$?
</div>
<!-- 
<div>
    Lorem ipsum $dolor$ sit
</div>
-->

Words I want to match: $world$, $you$

Words I don't want to match: $example$, $dolor$

So far I was only able to match either all or none.

What I can't do:

I can't delete all comments, because I am required to provide the source code I filtered.

blkpingu
Jan Nowak
  • Can you provide an example of code which is not working? – nutic May 04 '18 at 12:28
  • Can you specify the regular expression you used that didn't give the results you hoped for? – blkpingu May 04 '18 at 12:34
  • Finally I ended up with this: `/(?:<!--.*?-->)(\$.*?\$)/gsm` but it definitely doesn't match what I want to match :) – Jan Nowak May 04 '18 at 12:36
  • Add alternation (`/(?:<!--.*?-->)|(\$.*?\$)/gsm`, notice the `|` in the middle) and check if there is anything in the first capturing group – Dmitry Egorov May 04 '18 at 12:38
  • @DmitryEgorov It did the trick! Thank you. Can you please post your comment as an answer so that I can mark my question as resolved? – Jan Nowak May 04 '18 at 12:42
  • @JanNowak feel free to read up on [what to do when someone answers my question](https://stackoverflow.com/help/someone-answers). – blkpingu May 11 '18 at 17:50

2 Answers

0

I can't tell what your code looks like, but you need to read your page into a String or String[] and then run a regular expression over it to extract the strings you want to filter.

How to use a stream with regex in Java:

How do I create a Stream of regex matches?
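
As a rough illustration of the streaming part only, here is a small sketch that assumes Java 9 or later (for `Matcher.results()`) and a page that is already held in a String. The names `TokenStream` and `allTokens` are hypothetical. Note that this simple pattern lists every `$...$` token, including those inside comments, so it does not solve the question by itself.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: streams every $...$ token found in a page.
class TokenStream {
    // A literal '$', then any run of non-'$' characters, then a closing '$'.
    private static final Pattern TOKEN = Pattern.compile("\\$[^$]*\\$");

    static List<String> allTokens(String page) {
        return TOKEN.matcher(page)
                .results()               // Stream<MatchResult>, available since Java 9
                .map(MatchResult::group) // the full text of each match
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Tokens inside comments are listed too; this pattern does no filtering.
        System.out.println(allTokens("Hello $world$ <!-- $example$ --> and $you$"));
    }
}
```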

How to use regex in Java:

https://www.tutorialspoint.com/java/java_regular_expressions.htm

Test your regular expression before deploying it:

https://regexr.com/

blkpingu
0

Add alternation

/(?:<!--.*?-->)|(\$.*?\$)/gsm
               ^

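and check if there is anything in the first capturing group. The comment branch consumes each whole comment without capturing anything, so the group is only filled when a token is found outside a comment.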
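
A minimal Java sketch of this idea, assuming the page is already available as a String. The class and method names (`OutsideCommentMatcher`, `findOutsideComments`) are made up for illustration. The `$` signs are escaped so that they match literally, `Pattern.DOTALL` stands in for the `s` flag, and the `find()` loop plays the role of `g`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the "alternation + check group 1" trick.
class OutsideCommentMatcher {
    // Left branch: a whole HTML comment, consumed but not captured.
    // Right branch: a $...$ token, captured in group 1.
    private static final Pattern TOKEN_OR_COMMENT =
            Pattern.compile("<!--.*?-->|(\\$.*?\\$)", Pattern.DOTALL);

    static List<String> findOutsideComments(String html) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN_OR_COMMENT.matcher(html);
        while (m.find()) {
            // group(1) is null when the comment branch matched, so skip those.
            if (m.group(1) != null) {
                tokens.add(m.group(1));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String html = String.join("\n",
                "<div>",
                "    <div>Hello $world$</div>",
                "    <div>Another text <!-- $example$--></div>",
                "</div>",
                "<div>",
                "    How are $you$?",
                "</div>",
                "<!-- ",
                "<div>",
                "    Lorem ipsum $dolor$ sit",
                "</div>",
                "-->");
        System.out.println(findOutsideComments(html)); // prints [$world$, $you$]
    }
}
```

Because the comment branch comes first and swallows everything up to `-->`, any token inside a comment is never offered to the second branch. The original source is left untouched, which fits the requirement of not deleting the comments.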

Dmitry Egorov