Can somebody help me extract text with a RegExp?

Question

Today I have been breaking my head over a regex. I can't extract part of a text. My text looks like this:

<!--TEXT[title]-->
sometext 1
<!--END-->
<!--TEXT[title]-->
sometext 2
<!--END-->

I want to get this in an array:

["title]-->sometext1"
,"title]-->sometext2"]

I have this regex code: mytext.match(//m);

Is this text inside of some HTML? If so, don't parse HTML with a regex: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 Instead, parse the DOM. – Oct 20 '11 at 14:37

score 3 · Accepted Answer · answered Oct 20 '11 at 14:43

Assuming you need a regular expression, the following should work:

<\!--TEXT\[([^\]]*)\]-->\s*\n(.*)(?!<\!--END-->)

If this text is in a DOM, however, it would be much better to parse the DOM.

Explanation:

<\!--TEXT\[ // Match the start.
([^\]]*) // Match (in group 1), everything up until the next ']'
\]-->\s*\n // Match to the end of this line.
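(.*) // Match the rest of the line (in group 2); '.' does not match a line break by default.
(?!<\!--END-->) // Negative lookahead: assert that the end tag does not come immediately next. (Together with the above, group 2 holds everything up to, but not including, the line break before the end tag.)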
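
To show how the two groups can be collected into an array, here is a minimal sketch in JavaScript. It assumes the sample text from the question, with one line of content per block, and it adds the g flag, which is not part of the pattern above, so that every block is found. The names pattern, pairs and joined are only illustrative.

```javascript
// Sample input copied from the question (one content line per block).
const mytext = [
  "<!--TEXT[title]-->",
  "sometext 1",
  "<!--END-->",
  "<!--TEXT[title]-->",
  "sometext 2",
  "<!--END-->",
].join("\n");

// The pattern from the answer; the g flag is an addition so exec() can walk all blocks.
const pattern = /<\!--TEXT\[([^\]]*)\]-->\s*\n(.*)(?!<\!--END-->)/g;

// Collect [title, text] pairs from groups 1 and 2 of each match.
const pairs = [];
let m;
while ((m = pattern.exec(mytext)) !== null) {
  pairs.push([m[1], m[2]]);
}

// Rebuild strings in the shape the question asked for.
const joined = pairs.map(([title, text]) => `${title}]-->${text}`);

console.log(joined);
```

Note that the content is captured exactly as written, so the space in "sometext 1" is kept. Content spanning several lines would not be captured by this pattern, because '.' stops at the first line break.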

answered Oct 20 '11 at 14:43 by Vala

Of course this will fail with nested comments, but this is something the OP should know... – FailedDev Oct 20 '11 at 14:49
Yes, if you're dealing with nested comments you want a lexer or a DOM. On the other hand in this particular case it doesn't look like they would be nested (without there being some error). – Vala Oct 20 '11 at 14:56
