I am trying to find every use of a component that has a specific attribute, across all files. I tried the regex pattern <Button[^>]*[\n\s]+className[^>]*>, and it works in about 95% of cases.

Regex Example

As the example above shows, a Button component that has a condition attribute does not match, even though it also has a className attribute and should match. It fails because of the greater-than character in the => on the condition attribute line, so the pattern stops before it reaches the closing > of the component's tag.

How can I make this regex pattern skip a greater-than character (>) that appears in between, inside an attribute?

Andy Lester
Karuppiah RK
  • You shouldn't use regex to parse HTML: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Christian Baumann Oct 05 '20 at 07:29
  • Or in fact anything other than the simplest of cases. In my opinion, regex produces code that is hard to read and hard to debug, and it seems it may not even be possible to parse HTML with regex. – Tim Rutter Oct 05 '20 at 07:41
  • Sorry, but what do you want to get as a result? All entries between button tags (including all attributes)? – Maciej Los Oct 05 '20 at 08:06
  • @MaciejLos Yes. I want all button tags that have a className attribute, not all attributes. – Karuppiah RK Oct 05 '20 at 08:07
  • @ChristianBaumann OK. But if I need a condition in an attribute, I have to put it between opening and closing curly brackets `{}`. So a `>` that appears between curly brackets should not be treated as the end of the tag. Is there any way to do that? – Karuppiah RK Oct 05 '20 at 08:57
  • ` – Wiktor Stribiżew Oct 05 '20 at 09:07
  • @WiktorStribiżew Can you post your comment as an answer with an explanation? It will be useful for me and other users. :-) – Karuppiah RK Oct 05 '20 at 09:12
  • @WiktorStribiżew It works fine at the link above, but it doesn't work as expected in the VS Code editor. The error message says _incomplete quantifier_. – Karuppiah RK Oct 05 '20 at 09:26
  • @WiktorStribiżew Tried this ` – Karuppiah RK Oct 05 '20 at 09:33
  • @ChristianBaumann Please don't use that link for explaining why you shouldn't use regexes for HTML parsing. OP will not understand it. Here's a more illustrative page I put together: http://htmlparsing.com/regexes.html – Andy Lester Oct 05 '20 at 21:01

1 Answer

You need to match, zero or more times, either an attribute (a chunk of word chars) followed by = and then a substring between curly braces, or any char other than >. This is done with (?:\w+=\{[^{}]*\}|[^>])*.

Also, keep in mind that the Visual Studio Code regex engine requires { and } to be escaped when they appear outside of a character class.

The pattern will look like

<Button(?:\w+=\{[^{}]*\}|[^>])*\sclassName=(?:\w+=\{[^{}]*\}|[^>])*>

See the regex demo.

Details

  • <Button - a literal string
  • (?:\w+=\{[^{}]*\}|[^>])* - zero or more repetitions of
    • \w+=\{[^{}]*\} - one or more letters, digits or underscores, ={, zero or more chars other than { and } and then a }
    • | - or
    • [^>] - any char other than >
  • \s - a whitespace
  • className= - a literal text
  • (?:\w+=\{[^{}]*\}|[^>])* - see above
  • > - a > char.
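
To check the behaviour outside the editor, here is a minimal sketch in Python (not VS Code itself; Python's re module accepts the same syntax for this pattern). The JSX snippet and the variable names are made up for illustration, and the sketch only compares the original pattern from the question with the pattern above.

```python
import re

# Original pattern from the question: [^>]* cannot get past the ">" in "=>".
OLD_PATTERN = re.compile(r"<Button[^>]*[\n\s]+className[^>]*>")

# Pattern from the answer: a name={...} chunk is consumed as a whole, so a ">"
# inside the braces no longer ends the tag early.
NEW_PATTERN = re.compile(
    r"<Button(?:\w+=\{[^{}]*\}|[^>])*\sclassName=(?:\w+=\{[^{}]*\}|[^>])*>"
)

# Hypothetical JSX snippet, invented for this illustration.
SAMPLE = """
<Button className="plain">OK</Button>
<Button
  onClick={() => save()}
  className="primary"
>
  Save
</Button>
<Button disabled={count > 3} className="secondary" />
<Button onClick={() => cancel()}>Cancel</Button>
"""

old_matches = OLD_PATTERN.findall(SAMPLE)
new_matches = NEW_PATTERN.findall(SAMPLE)

print(len(old_matches))  # 1: only the tag without ">" in its attributes
print(len(new_matches))  # 3: every Button that has className
for tag in new_matches:
    print(repr(tag))
```

The last Button has no className and is skipped by both patterns. Note that [^{}]* does not span nested braces, so a value such as style={{ ... }} that contains a > would still not be matched by this pattern.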
Wiktor Stribiżew