I read the following article, which talks about matching HTML tags with regular expressions: http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx/

I am confused about the use of the forward slash in this expression:

</?\w+\s+[\^>]*>

"Roughly Translated, this expression looks for the beginning tag and tag name, followed by some white-space and then anything that doesn’t end the tag."

Why doesn't < mean a left word boundary here?

What is /? I've searched the web. Some sources say / is a special character; others do not even mention it.

Also, how should I interpret [\^>]? As any one of \, ^, or >? That does not make sense.

Could anyone explain this expression a little?

halfer
user1559625
  • Why don't you explore this regex using a tool like Regex101? Look here: https://regex101.com/r/lG6ZH0/1 – Tim Biegeleisen Nov 12 '16 at 13:12
  • By the way, there is nothing special about forward slash in regex, but you might have to escape it in some engines. – Tim Biegeleisen Nov 12 '16 at 13:12
  • @Tim Biegeleisen Thanks, Tim. Great website, and it explains the regex well. Btw, some articles say < and > are word boundaries and / is a regex delimiter; I'm not sure whether that is true. In this case it seems < and > are plain literals and / is treated as a special character. – user1559625 Nov 12 '16 at 22:40
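
The points raised in the comments can be checked with a minimal Python sketch. It assumes Python's `re` module, which may differ in details from the engine the article targets. The names `quoted`, `negated`, and `html` are made up for illustration, and `negated` is a hypothetical variant without the backslash, included only for comparison. In this engine an unescaped `<` or `>` is a plain literal (the escaped forms `\<` and `\>` are word boundaries in some tools, such as GNU grep), `/?` simply means "an optional slash" (the slash only matters where it is the pattern delimiter, as in JavaScript or sed), and `[\^>]` matches a literal `^` or `>`; the backslash only escapes the caret.

```python
import re

# Pattern exactly as quoted in the question (backslash before the caret).
quoted = re.compile(r"</?\w+\s+[\^>]*>")

# Hypothetical variant with a negated class, for comparison only.
negated = re.compile(r"</?\w+\s+[^>]*>")

html = '<a href="x">link</a> <br >'

# [^>]* means "any characters except >", so attributes are consumed.
# </a> is skipped because \s+ requires whitespace after the tag name.
print(negated.findall(html))   # ['<a href="x">', '<br >']

# [\^>]* only allows literal ^ or > characters, so attributes break the match.
print(quoted.findall(html))    # ['<br >']

# /? is just an optional literal slash: a closing tag with a space matches.
print(bool(negated.fullmatch('</div >')))   # True

# The escaped caret is a literal ^ inside the class.
print(bool(quoted.fullmatch('<p ^>')))      # True
```

If the second result is not what the article intends, the backslash before `^` in the quoted pattern may be worth double-checking against the original article.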

0 Answers