2

I'm trying to create a regular expression to match a string like <tag>something</tag> and I want the result to return only something without the tags.

I tried using:

string.match(/(?<=<tag>).*?(?=<\/tag>)/g);

but it's giving an error:

SyntaxError:Invalid regular expression: /(?<=<tag>).*?(?=<\/tag>)/: Invalid group;

Why is it not working?

putvande
razz
  • Related: [lookbehind in javascript?](http://stackoverflow.com/questions/11597718/lookbehind-in-javascript) – apsillers Aug 19 '13 at 18:40
  • 1
    `<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)\1>` from http://www.regular-expressions.info/examples.html – km6zla Aug 19 '13 at 18:42
  • 1
    obligatory link: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – John Dvorak Aug 19 '13 at 18:46
  • @Jan i'm not trying to parse html, i just need to convert string which has tags like structure – razz Aug 19 '13 at 18:57
  • @razzak is that a structure eerily similar to XML but not really XML or SGML? What it _is_, then? – John Dvorak Aug 19 '13 at 18:58
  • @Jan it looks like XML but doesn't have a valid structure so i can't parse it with XML parsers – razz Aug 19 '13 at 19:01
  • @razzak is that a standard format? If yes, please point me to it. If not, I recommend using a standard format instead. Say, XML or JSON. – John Dvorak Aug 19 '13 at 19:03
  • 1
    @Jan i can't change the format because it's done on the server side, i wish i could change it :) – razz Aug 19 '13 at 19:11

3 Answers

5

I think you will like this

/(?:<(tag)>)((?:.(?!<\/\1>))+.)(?:<\/\1>)/g


This is handy because the \1 backreference makes the closing tag match whatever name the opening tag captured, so only properly paired tags match.


Use it on text like this

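var re  = /(?:<(tag)>)((?:.(?!<\/\1>))+.)(?:<\/\1>)/g,
    str = "this is <tag>some text</tag> and it does <tag>matching</tag>",
    match;

// With the g flag, each exec() call resumes after the previous match
// and returns null once there are no more matches.
while ((match = re.exec(str)) !== null) {
  // match[1] is the tag name, match[2] is the text between the tags
  console.log(match[1], match[2]);
}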

Output

tag some text
tag matching

Bonus Soda! You can simply modify (tag) to be (tag|bonus|soda) to have it work on this string:

<tag>yay</tag> and there's even <bonus>sodas</bonus> in <soda>cans</soda>
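A minimal sketch of that modification, reusing the loop from above on the sample string. The `pairs` array is just an illustrative name for collecting the results.

```javascript
// Same pattern as above, with the first group widened to three tag names.
var re = /(?:<(tag|bonus|soda)>)((?:.(?!<\/\1>))+.)(?:<\/\1>)/g;
var str = "<tag>yay</tag> and there's even <bonus>sodas</bonus> in <soda>cans</soda>";
var pairs = [];
var match;

while ((match = re.exec(str)) !== null) {
  // Collect [tagName, innerText] for each matched pair.
  pairs.push([match[1], match[2]]);
}

console.log(pairs);
// [["tag", "yay"], ["bonus", "sodas"], ["soda", "cans"]]
```

Because of \1, a mismatched pair such as <tag>oops</soda> would not match.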

Beware: if you nest tags, you would have to apply this regexp recursively.
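One way the recursive application could look, as a rough sketch only. `extractTags` is a hypothetical helper name, and this assumes a tag is never nested inside another tag of the same name (the pattern stops at the first matching closing tag, so same-name nesting would be cut short).

```javascript
// Hypothetical helper: returns [tagName, innerText] pairs, outer pair first.
function extractTags(str) {
  // A fresh regex per call, so each recursion level has its own lastIndex.
  var re = /(?:<(tag|bonus|soda)>)((?:.(?!<\/\1>))+.)(?:<\/\1>)/g;
  var found = [];
  var match;

  while ((match = re.exec(str)) !== null) {
    found.push([match[1], match[2]]);
    // Re-run the same pattern on the inner text to pick up nested pairs.
    found = found.concat(extractTags(match[2]));
  }
  return found;
}

var nested = extractTags("<tag>x <bonus>yy</bonus> z</tag> and <soda>cans</soda>");
console.log(nested);
// [["tag", "x <bonus>yy</bonus> z"], ["bonus", "yy"], ["soda", "cans"]]
```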

Community
Mulan
0

It looks like you're trying to use a lookbehind, which JavaScript did not support before ES2018 (older engines reject it with exactly this error). You'd have to change the pattern to something like this:

/<tag>(.*?)<\/tag>/g

And then extract the appropriate group. For example:

/<tag>(.*?)<\/tag>/g.exec('<tag>something</tag>')[1]; // something
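For several matches, the same group extraction can be repeated in a loop. This is a sketch under the assumption that the input looks like the sample string below, which is made up for illustration; `values` is likewise just an illustrative name.

```javascript
// Keep the regex in a variable: with the g flag, exec() tracks its position
// in re.lastIndex, so the same object must be reused across iterations.
var re = /<tag>(.*?)<\/tag>/g;
var str = "<tag>one</tag><tag>two</tag>";
var values = [];
var match;

while ((match = re.exec(str)) !== null) {
  // match[0] is the whole match including the tags; match[1] is the group.
  values.push(match[1]);
}

console.log(values); // ["one", "two"]
```

Writing the regex literal directly inside the `while` condition would create a new regex on every pass and never advance, so the loop would not terminate.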
p.s.w.g
  • it doesn't work, it includes the tags in the results – razz Aug 19 '13 at 18:42
  • @razzak As I mentioned, you'd have to extract the group. – p.s.w.g Aug 19 '13 at 18:43
  • how can i make it work on multiple matches? like if i have '<tag>one</tag><tag>two</tag>' and i want to extract the values to an array – razz Aug 19 '13 at 18:51
  • Just for the sake of completeness: It would be possible to make this work on multiple matches by looping in some way (and always removing / not respecting / replacing an already checked part of the original string). However, the accepted answer is accepted rightfully so because it's simpler, cleaner and more performant. – sborn May 11 '20 at 19:40
0

Try this:

var result = /<tag>(.*?)<\/tag>/.exec(myString);

The text between the tags is then in result[1].
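A short sketch of what the result array holds, since index 0 still includes the tags. The sample strings and the `whole`, `inner`, and `noMatch` names are assumptions for illustration.

```javascript
var myString = "before <tag>something</tag> after";
var result = /<tag>(.*?)<\/tag>/.exec(myString);

// exec() returns null when nothing matches, so guard before indexing.
var whole = result === null ? null : result[0]; // full match, tags included
var inner = result === null ? null : result[1]; // first capture group only

console.log(whole); // "<tag>something</tag>"
console.log(inner); // "something"

var noMatch = /<tag>(.*?)<\/tag>/.exec("no tags here");
console.log(noMatch); // null
```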

fred02138
  • how can i make it work on multiple matches? like if i have '<tag>one</tag><tag>two</tag>' and i want to extract the values to an array – razz Aug 19 '13 at 18:54