How do I interpret this regular expression?

Asked Nov 12 '16 at 13:07

Active Nov 17 '18 at 12:48

Viewed 36 times

I read the following article, which talks about parsing HTML tags: http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx/

I am confused about the use of the forward slash in this expression:

</?\w+\s+[\^>]*>

"Roughly Translated, this expression looks for the beginning tag and tag name, followed by some white-space and then anything that doesn’t end the tag."

Why doesn't < mean a left word boundary here?

What is /? I've searched the web. Some sources say / is a special character; others do not even mention it.

Also, how should I interpret [\^>]? As any one of \, ^, or >? That does not make sense to me.

Could anyone explain this expression a little?

edited Nov 17 '18 at 12:48

halfer


asked Nov 12 '16 at 13:07

user1559625


Why don't you explore this regex using a tool like Regex101? Look here: https://regex101.com/r/lG6ZH0/1 – Tim Biegeleisen Nov 12 '16 at 13:12
By the way, there is nothing special about forward slash in regex, but you might have to escape it in some engines. – Tim Biegeleisen Nov 12 '16 at 13:12
@Tim Biegeleisen Thanks, Tim. Great website, and it explains the expression well. By the way, some articles say < and > are word boundaries and / is a regex delimiter; I'm not sure whether that is true. It seems that in this case < and > are plain literals and / is treated as a special character. – user1559625 Nov 12 '16 at 22:40
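
For anyone who wants to check the points raised in these comments, here is a minimal sketch in Python (chosen only because its `re` module has no pattern delimiter, so `/` needs no escaping). It assumes the backslash in `[\^>]` is an escaping artifact and that the article intended `[^>]`, meaning "any character except >"; that is a guess, not something confirmed here. The names `article_pattern`, `quoted_pattern`, and `matches` are made up for illustration.

```python
import re

# Assumed intent of the article: [^>] is a negated class, "anything but >".
article_pattern = re.compile(r"</?\w+\s+[^>]*>")

# Exactly as quoted in the question: [\^>] matches a literal ^ or a literal >.
quoted_pattern = re.compile(r"</?\w+\s+[\^>]*>")


def matches(pattern, text):
    """Return True if the whole string is matched by the pattern."""
    return pattern.fullmatch(text) is not None


# "<" and ">" are plain literals here, and "/?" means "an optional slash".
samples = ['<a href="x">', "</div >", "<br>", "<p ^>"]

for s in samples:
    print(repr(s), matches(article_pattern, s), matches(quoted_pattern, s))
```

Neither pattern matches `<br>`, because `\s+` requires at least one whitespace character after the tag name. As for the word-boundary confusion: some engines (GNU grep, for example) use the escaped forms `\<` and `\>` as word boundaries, while an unescaped `<` or `>` is just a literal character. The slash only needs escaping in languages that use it to delimit regex literals, such as JavaScript or Perl.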
