
This is the test case:

<td>ARMOIRE 6 PORTES &lt;span style=&quot;font-family: arial;&quot;&gt;&lt;u style=&quot;color: rgb(170, 170, 170);&quot;&gt;:</td><td>Longueur : 297 cm - Profondeur : 73 cm - Hauteur : 260 cm.&lt;/span&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;&lt;u&gt;ARMOIRE 4 PORTES &lt;span style=&quot;font-family: arial;&quot;&gt;&lt;u style=&quot;color: rgb(170, 170, 170);&quot;&gt;:</td></tr><tr><td>Chevet

And here's my solution: `(?<=&)(?!.*<).*?(?=[A-Z])`

Basically, I want to select everything between an `&` and the first instance of `[A-Z]`, but not if that text contains an HTML tag bracket. At first I thought the pattern simply didn't work, because it failed both in Notepad++ and on regex101.com.

However, if I modify the test string by adding a line break after the second ARMOIRE:

<td>ARMOIRE 6 PORTES &lt;span style=&quot;font-family: arial;&quot;&gt;&lt;u style=&quot;color: rgb(170, 170, 170);&quot;&gt;:</td><td>Longueur : 297 cm - Profondeur : 73 cm - Hauteur : 260 cm.&lt;/span&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;&lt;u&gt;ARMOIRE 
4 PORTES &lt;span style=&quot;font-family: arial;&quot;&gt;&lt;u style=&quot;color: rgb(170, 170, 170);&quot;&gt;:</td></tr><tr><td>Chevet

then it works (tested on regex101).

I would like to understand why, and how I can modify my regex to cover both cases, if possible.

ina
  • Please add a problem statement to your question. What is the output you expect from that input HTML string, and why? – Tim Biegeleisen Jun 22 '21 at 04:49
  • Use `(?<=&)(?:(?!<).)*?(?=[A-Z])`. Or better (works because you're rejecting a character not a string): `(?<=&)[^>]*?(?=[A-Z])`. – 41686d6564 stands w. Palestine Jun 22 '21 at 04:50
  • Duplicate of: [Regular expression to match a line that doesn't contain a word](https://stackoverflow.com/questions/406230/regular-expression-to-match-a-line-that-doesnt-contain-a-word), [Regex: match everything but specific pattern](https://stackoverflow.com/q/1687620/8967612) – 41686d6564 stands w. Palestine Jun 22 '21 at 04:51
  • What do you mean by "select everything between a & and the first instance of [A-Z]"? Could you kindly give examples of your desired result? – h-sifat Jun 22 '21 at 04:55
  • @h-sifat I believe they mean everything between a `&` character and any upper case English letter. – 41686d6564 stands w. Palestine Jun 22 '21 at 04:56
  • @41686d6564 Thank you. If you add this as an answer I'll mark it as accepted. – ina Jun 22 '21 at 08:37
  • @ina No need when it's already answered in the duplicates. Just choose "Yes" as your answer to the "does this answer your question" notice on top of your question to have it linked to the dup. – 41686d6564 stands w. Palestine Jun 22 '21 at 08:39
  • @Tim Biegeleisen I'll try to explain my problem better next time. What I wanted was to match any string that starts after an `&` and runs up to the first uppercase Latin letter, but not if this string contains an HTML tag bracket `<`. – ina Jun 22 '21 at 08:44
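
The behaviour discussed above can be reproduced with a small sketch. It uses Python's `re` module as a stand-in for Notepad++ and regex101, so flavor details may differ. It assumes the question's pattern was meant to read `(?<=&)(?!.*<).*?(?=[A-Z])`, and it uses a shortened, hypothetical sample instead of the full test string. The likely explanation is that `(?!.*<)` rejects a position if a `<` appears anywhere later on the same line. Since `.` does not match a line break by default, adding a line break shortens what the lookahead can see. The character-class variant is written with `[^<]`, on the assumption that the goal is to reject `<`.

```python
import re

# Shortened, hypothetical stand-in for the test string in the question.
ONE_LINE = "<td>A &lt;b&gt;ARMOIRE 4 PORTES &lt;u&gt;:</td>"
# Same text with a line break inserted after "ARMOIRE ".
TWO_LINES = "<td>A &lt;b&gt;ARMOIRE \n4 PORTES &lt;u&gt;:</td>"

# Assumed reading of the original pattern: the negative lookahead runs once,
# right after "&", and fails if "<" occurs anywhere later on the same line.
ORIGINAL = re.compile(r"(?<=&)(?!.*<).*?(?=[A-Z])")

# Tempered dot: the "no '<' here" check is repeated before every character
# that gets consumed, so only the matched text itself must be free of "<".
TEMPERED = re.compile(r"(?<=&)(?:(?!<).)*?(?=[A-Z])")

# Negated character class: same idea, possible because a single character
# is being rejected rather than a multi-character string.
NEGATED = re.compile(r"(?<=&)[^<]*?(?=[A-Z])")


def matches(pattern, text):
    """Return all non-overlapping matches of pattern in text."""
    return pattern.findall(text)


if __name__ == "__main__":
    for name, pattern in (("original", ORIGINAL),
                          ("tempered", TEMPERED),
                          ("negated", NEGATED)):
        print(name, matches(pattern, ONE_LINE), matches(pattern, TWO_LINES))
```

On this sample the assumed original pattern finds nothing in `ONE_LINE`, because every `&` is followed somewhere by the `<` of `</td>`, but it finds `lt;b&gt;` in `TWO_LINES`. Both alternatives find `lt;b&gt;` in either version. One difference to keep in mind: `[^<]` can match across a line break, while the tempered dot cannot unless the "dot matches newline" option is enabled.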
