I have been trying for quite some time to match class names that start with a specified pattern in an HTML string. Here is the regex I finally came up with:
/(?<=(<[^>]*class=("|')(\w|\ |\-)*))((?<= |\'|\")(foo-|-foo)[^ "']*)/gmi
The sample I have been working with is:
<body>
<div style="margin-left:6px;" class="foo-pink blfoo-pin-foo-kue red yellow bar-green -moz-FF foo-pink moz-FF foo-pink" >
<fieldset class="foo customClass foo clFieldsBar bar-try" id="idField foo- bar-dfgdgdfg">
<legend><span>Qu'en pensez-vous ?</span></legend>
<textarea id="idText" class='foo- Comment_text fdgdgdfg -foo-ddede mso-whitespace' name="nameText barName"></textarea> bar-deded foo-green
</fieldset>
class="blue dffsf sdf mso-green foo"
</div>
You can see this regex doing what I actually want here: https://regex101.com/r/6W5AUT/4
The problem is that I need to run the regex from Delphi code. However, when I do so, I get the following error:
lookbehind assertion is not fixed length
After some quick research into the reason behind the error, I discovered that I cannot use a lookbehind (positive or negative) whose length is variable.
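To illustrate the restriction outside of Delphi: Python's built-in re module rejects variable-length lookbehinds in the same way, so it makes a convenient stand-in. This is only a sketch of the limitation, not of the Delphi engine itself; the names VARIABLE, FIXED and compiles are made up for the example, and the patterns are simplified versions of mine.

```python
import re

# Variable-length lookbehind: the * quantifier makes its width unbounded.
VARIABLE = r'(?<=class="[\w -]*)foo-\w+'

# Fixed-length lookbehind: always exactly 7 characters (class=").
FIXED = r'(?<=class=")foo-\w+'


def compiles(pattern: str) -> bool:
    """Return True if the regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    print("variable-length lookbehind accepted:", compiles(VARIABLE))
    print("fixed-length lookbehind accepted:", compiles(FIXED))
    # The fixed-length version can only see the name right after the quote.
    print(re.findall(FIXED, 'class="foo-pink foo-red"'))
```

The fixed-length variant compiles, but it can only reach the class name that directly follows the opening quote, which is the same limitation I run into below.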
I have been trying to transform my RegEx using different methods (\K to reset the match for example), and this is what I came up with so far:
/(<[^>]*class=("|'))(\b(foo)(\w|\-)*)*/gmi
You can see it working here: https://regex101.com/r/zeQDrK/2
As you can see, it is only matching the first class name of each class attribute in a tag.
Now to be precise about what I need:
- Match all occurrences of a class name that starts with a pattern (it can be "foo", "-foo" or a combination of both),
- It needs to match only the class names that are inside an HTML tag (this is why the sample has class="blue dffsf sdf mso-green foo" outside of any HTML tag),
- It needs to support both class="ClassName1 ClassName2" and class='ClassName1 ClassName2'
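To make these requirements concrete, here is a rough sketch of the behaviour I am after, written as two passes instead of one regex: first grab the value of each class attribute that sits inside a tag, then filter the individual names. It is in Python purely for illustration (it would still have to be ported to Delphi), the names CLASS_ATTR and matching_class_names are hypothetical, and it assumes that attribute values never contain a ">" character.

```python
import re

# Pass 1: the value of a class attribute inside a tag.
# (["']) captures the opening quote and \1 requires the same closing quote.
# \s before "class" avoids matching attributes such as data-class.
CLASS_ATTR = re.compile(
    r"""<[^>]*?\sclass\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def matching_class_names(html, prefixes=("foo", "-foo")):
    """Return class names inside tags that start with one of the prefixes."""
    lowered = tuple(p.lower() for p in prefixes)
    found = []
    for match in CLASS_ATTR.finditer(html):
        # Pass 2: split the attribute value on whitespace and test each name.
        for name in match.group(2).split():
            if name.lower().startswith(lowered):
                found.append(name)
    return found


if __name__ == "__main__":
    sample = (
        '<div style="x" class="foo-pink blfoo red -foo-a">\n'
        "<p class='Foo bar-foo'>text</p>\n"
        'class="foo-out"\n'
        "</div>"
    )
    print(matching_class_names(sample))
```

The class="foo-out" text that sits outside of any tag is skipped because the first pass only starts at a "<" and never crosses a ">".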
I would appreciate any help to solve this problem. Thanks for your time.