I am aware that HTML Should Not Be Parsed With RegEx.

Nonetheless, there are exceptions. I am trying to take the following text:

br > Email: fooBar@yahoo.com<br>Name: Barbara Foo<br> Phone: 888888888<br>Professio

and extract Barbara Foo from it with a regex.

The regex I am trying to use is `Name: (.*)((?=\\r)|(?=<br>))`. Note that I am trying to stop capturing at <br> with `(?=<br>)`. This does NOT work. It works perfectly well when the stop token is any word or phrase, but not when it is <br>.

How do I get this to work?

The C# regex tester here matches what I see when running locally: http://regexstorm.net/tester

According to my research, angle brackets are not special characters in regex.

VSO
  • 1
    Add a `?` to make it not greedy: `Name: (.*?)((?=\\r)|(?=<br>))`
    – kishkin Mar 31 '20 at 15:51
  • 2
    Why not make it non-greedy with `.*?` and shorten it to `Name: (.*?)(?:\\r|<br>)`? You could turn the lookahead into a match if you want the group value.
    – The fourth bird Mar 31 '20 at 15:51
  • 1
    Thanks guys, both of those work and solve my problem. I work with RegEx once a year or so, so forgot about it. Still curious why it works for phrases without angle brackets though. Edit: I think that not only solves my immediate problem but also protects me from stuff I didn't anticipate, so double thanks. – VSO Mar 31 '20 at 15:54
  • 1
    @Thefourthbird care to post an answer? Yours is shorter – kishkin Mar 31 '20 at 15:59
  • 1
    @kishkin That is not needed, it is a [common issue](https://stackoverflow.com/questions/2503413/regular-expression-to-stop-at-first-match) and might qualify as a duplicate. – The fourth bird Mar 31 '20 at 16:01
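
The difference the comments describe can be reproduced with a short sketch. It uses Python's `re` module as a stand-in for the .NET engine (an assumption made for illustration; greedy and lazy quantifiers behave the same way for this pattern in both), and the variable names are hypothetical. The greedy `.*` first runs to the end of the input and then backtracks to the last <br>, while the lazy `.*?` stops at the first one. That would also explain why a word or phrase seemed to work as a stop token: if it occurs only once in the input, the first and last occurrence are the same position.

```python
import re

# Sample text from the question (it is truncated in the original post as well).
text = "br > Email: fooBar@yahoo.com<br>Name: Barbara Foo<br> Phone: 888888888<br>Professio"

# Greedy: .* consumes the rest of the input, then backtracks until the
# lookahead succeeds, which happens at the LAST <br>.
greedy = re.search(r"Name: (.*)(?=\r|<br>)", text)

# Lazy: .*? expands one character at a time and stops as soon as the
# lookahead succeeds, which happens at the FIRST <br>.
lazy = re.search(r"Name: (.*?)(?=\r|<br>)", text)

print(greedy.group(1))  # Barbara Foo<br> Phone: 888888888
print(lazy.group(1))    # Barbara Foo
```

In a C# string literal the `\r` would be written `\\r` (or the pattern placed in a verbatim string), as in the patterns quoted above.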
