Regex match a string that doesn't contain a string

Question

I want to replace every <span...> (including <span id="... and <span class="...) in an HTML document with <span>, except when the span starts with <span id="textmarker (for example, I don't want to keep this span: <span attr="blah" id="textmarker">).

I've tried the regexes proposed here and here, and finally came up with the one below. It never returns a <span id="textmarker, but somehow it sometimes misses other spans:

<span(?!.*? id="textmarker).*?">

You can see my (simplified) html here : https://regex101.com/r/yT9jG2/2

Strangely, if I run the regex in Notepad++ it returns 3 matches (the three spans in the second paragraph), but regex101 only returns 1 match. Notepad++ and regex101 both miss the span in the first paragraph.

This regex also doesn't return every span it should (cf. the spans with gray highlights here):

<span(?![^>]*? id="textmarker)[^>]*?>
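The difference between the two patterns above can be reproduced outside regex101. The following is a minimal Python sketch on a hypothetical one-line sample (not the HTML from the linked page): in the first pattern the `.*?` inside the lookahead may run past `>` to the end of the line, so a span is rejected whenever any later tag on the same line contains ` id="textmarker`. It does not reproduce whatever the second pattern misses in the linked example.

```python
import re

# Hypothetical one-line sample; not the HTML from the linked regex101 page.
html = '<p><span class="a">x</span> <span id="textmarker">y</span></p>'

# First pattern: the lookahead's .*? can look past ">" to the end of the line,
# so the first span is rejected because of the textmarker span that follows it.
first = re.findall(r'<span(?!.*? id="textmarker).*?">', html)

# Second pattern: [^>]* keeps the lookahead inside the current tag.
second = re.findall(r'<span(?![^>]*? id="textmarker)[^>]*?>', html)

print(first)
print(second)
```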

mistake #1: using regexes to manipulate html. you should be using a DOM parser. — Marc B, Jan 15 '16 at 17:27
Do you only want to exclude `spans` that *start* with `id=...` or do you want to also exclude `spans` where `id=...` is not the first attribute? — Brendan Abel, Jan 15 '16 at 17:29
@BrendanAbel Thanks for helping me being more precise, cf the edit in the question. — MagTun, Jan 15 '16 at 18:50

clarity123 · Accepted Answer · 2016-01-15T20:12:38.933

Updated: To exclude id="textmarker while including id="anythingelse and all other spans:

(<span(?! *id="textmarker)[^>]*>)
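As a rough illustration of the replacement itself, here is a minimal Python sketch using `re.sub` with the pattern above (outer parentheses omitted). The sample HTML and the variable names are made up for the example.

```python
import re

# Pattern from the answer, without the optional outer capture group.
SPAN_TAG = re.compile(r'<span(?! *id="textmarker)[^>]*>')

# Hypothetical sample input.
html = ('<p><span class="a">x</span> <span id="textmarker1">y</span> '
        '<span id="YellowType">z</span> <span>w</span></p>')

# Every opening span tag except those starting with id="textmarker becomes <span>.
cleaned = SPAN_TAG.sub('<span>', html)
print(cleaned)
```

Note that the lookahead only inspects what comes directly after `<span`, so a tag such as `<span attr="blah" id="textmarker">` is still matched by this pattern.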

On your posted example at https://regex101.com/r/yT9jG2/2, choose version 2 at the top and set the fields as follows:

field 1: (<span(?! *id="textmarker)[^>]*>)
field 2 (the smaller field that lets you set modifiers): g

With your example (version 2), this produces 9 matches, listed on the right, including empty spans as well as spans whose id is not textmarker, such as <span id="YellowType">

Explanation

Field 1:

optional: ( and ). An extra outer parenthesis was added to the expression for educational purposes, just for making use of regex101's matched group listing feature to list results on the right pane in addition to the default inline highlighting of matches. When using Notepad++ you can of course omit these outer ( ) parentheses.
<span: matches <span
(?! starts a negative lookahead assertion for what follows
" *" (a space followed by *) matches a space zero or more times, in case you have extra spaces
id="textmarker is the text that must not come next
) ends the negative lookahead assertion
so if the text right after <span matches what is inside the lookahead, that position is discarded as a match
[^ starts a negated character set, meaning "none of the following", the following being just >
] ends the negated set
* matches the preceding item 0 or more times, the preceding item being [^>]
> to match to end of the open-a-span tag
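The same breakdown can be written as a commented pattern. This is a sketch in Python using `re.VERBOSE`, which is an assumption about the environment (neither regex101 field 1 nor Notepad++ is being described here). In verbose mode plain whitespace is ignored, so the literal space is written as `[ ]`.

```python
import re

SPAN_VERBOSE = re.compile(r'''
    <span                 # literal "<span"
    (?!                   # start of the negative lookahead
        [ ]*              # zero or more spaces
        id="textmarker    # the text that must not come next
    )                     # end of the negative lookahead
    [^>]*                 # any characters except ">"
    >                     # the ">" that closes the opening tag
''', re.VERBOSE)

# Compact form from the answer, for comparison.
SPAN_COMPACT = re.compile(r'<span(?! *id="textmarker)[^>]*>')

# Hypothetical sample tags, including one with two spaces before id.
sample = ('<span class="a"><span id="textmarker"><span  id="textmarker2">'
          '<span id="YellowType"><span>')

verbose_matches = SPAN_VERBOSE.findall(sample)
compact_matches = SPAN_COMPACT.findall(sample)
print(verbose_matches)
```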

Field 2

g is the global modifier (it has nothing to do with greediness)
so the search does not stop at the first match, but returns all matches
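Python's `re` module has no g modifier, so the closest analogy (offered here only as an illustration) is the choice of function: `search` stops at the first match, while `findall` returns every match. The sample string is hypothetical.

```python
import re

pattern = re.compile(r'<span(?! *id="textmarker)[^>]*>')

# Hypothetical sample input.
html = '<span class="a">x</span><span id="textmarker">y</span><span>z</span>'

# Comparable to running without g: only the first match is reported.
first_only = pattern.search(html).group(0)

# Comparable to running with g: all matches are reported.
all_matches = pattern.findall(html)

print(first_only)
print(all_matches)
```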

Good try but unfortunately the regex should also match all the other `span id` than ` — MagTun, Jan 15 '16 at 18:52
This detects `i` so it should work to exclude any span id, yes please update the question with more info if not working — clarity123, Jan 15 '16 at 19:05
I'd rather use `\s` instead of space and the "educational purposes" confused before reading your explanation :) extensive answer. — bobble bubble, Jan 15 '16 at 20:24
