JavaScript Regular expression text parsing

Question

I have a string like the following

~~<b>A<i>C</i></b>~~/~~<u>D</u><b>B</b>~~has done this.

I am trying to get the text inside each <b> tag. I tried this regular expression:

<b>(.+)</b>

With this I get AC~~/~~DB as a single match, but I need AC as the first match and B as the second match.

Can anyone please help?

Can you post the regular expression that you've already tried?... — War10ck, Apr 24 '14 at 15:01
Thou shall not try to parse html with regular expressions. Seriously, in most cases you will end up with a pile of unmaintainable and error-prone code - just imagine Nested sections of interest (eG. ...............). Make sure that you have a **very** compelling reason to choose this path. — collapsar, Apr 24 '14 at 15:09
The following questions in the [Stack Overflow Regular Expressions FAQ](http://stackoverflow.com/a/22944075/2736496) may be of interest: [In-depth discussion on the differences between greedy versus non-greedy](http://stackoverflow.com/a/3075532) (listed under "Quantifiers"), and [Don't use regex to parse HTML](http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not) (under "General Information"). Please consider bookmarking the FAQ for future reference. — aliteralmind, Apr 24 '14 at 15:12
This parsing will be under HTML5 canvas where I have to parse and create canvas text. I am doing this because I can not use html text inside canvas — user3306669, Apr 24 '14 at 15:21

score 3 · Accepted Answer · answered Apr 24 '14 at 15:02

You need to use a non-greedy quantifier:

<b>(.+?)</b>

This will ensure that the match stops at the first </b> it finds.
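As a rough illustration of the difference, here is a minimal sketch that runs both patterns against the sample string from the question. The helper names `stripTags` and `boldTexts` are made up for this example, and the tag stripping is a simplification that assumes no `>` characters appear inside the tags.

```javascript
// Sample string copied from the question.
const input = '~~<b>A<i>C</i></b>~~/~~<u>D</u><b>B</b>~~has done this.';

// Hypothetical helper: remove any tags left inside a captured fragment.
function stripTags(fragment) {
  return fragment.replace(/<[^>]*>/g, '');
}

// Hypothetical helper: text of every <b>...</b> pair, using the lazy quantifier.
function boldTexts(html) {
  const pattern = /<b>(.+?)<\/b>/g; // +? stops at the nearest </b>
  return Array.from(html.matchAll(pattern), (m) => stripTags(m[1]));
}

// Greedy version from the question: runs on to the last </b> in the string.
const greedyText = stripTags(input.match(/<b>(.+)<\/b>/)[1]);
console.log(greedyText); // AC~~/~~DB

const result = boldTexts(input);
console.log(result); // [ 'AC', 'B' ]
```

The `g` flag is what lets `matchAll` return every pair instead of only the first one.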

However, I would generally recommend using a proper XML or HTML parser for this sort of thing. Regular expressions are simply not powerful enough to handle the recursive structure of XML.
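To make that limitation concrete, here is a small sketch with a made-up input (not from the question) in which one <b> element is nested inside another. The lazy pattern pairs the outer opening tag with the inner closing tag, so the capture is cut short.

```javascript
// Hypothetical input: a <b> element nested inside another <b> element.
const nested = '<b>outer <b>inner</b> tail</b>';

// The lazy pattern stops at the first </b>, which belongs to the inner element.
const lazyCaptures = Array.from(
  nested.matchAll(/<b>(.+?)<\/b>/g),
  (m) => m[1]
);

console.log(lazyCaptures); // [ 'outer <b>inner' ]
```

Neither the greedy nor the lazy quantifier can count nesting depth, which is why a real parser is the safer choice once the markup can nest.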

answered Apr 24 '14 at 15:02

p.s.w.g

As soon as `<b>` gets any attributes, this fails. – John Dvorak Apr 24 '14 at 15:05
@JanDvorak true, and that's why I recommend using a proper parser. – p.s.w.g Apr 24 '14 at 15:06
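For reference, a sketch of the failure described in this comment, together with one possible way to loosen the pattern. The input string and the `<b\b[^>]*>` variant are assumptions for illustration, not part of the original answer, and the variant still breaks if an attribute value contains `>`.

```javascript
// Hypothetical input: one <b> with an attribute, one <br>, one plain <b>.
const withAttr = '<b class="name">A</b> and <br> and <b>B</b>';

// Pattern from the answer: only matches a bare <b>.
const strict = Array.from(withAttr.matchAll(/<b>(.+?)<\/b>/g), (m) => m[1]);

// Looser variant: \b keeps <br> from matching, [^>]* skips any attributes.
const tolerant = Array.from(
  withAttr.matchAll(/<b\b[^>]*>(.+?)<\/b>/g),
  (m) => m[1]
);

console.log(strict);   // [ 'B' ]
console.log(tolerant); // [ 'A', 'B' ]
```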
