I want to match multiple data-i18n attributes with a JavaScript regexp.

I tried the following regexp:

var regexp = /(data\-i18n="[^"]+")/g;

which seemed rather straightforward in my head, but it ended up not working.

If you try to match against the following HTML tag:

<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>

calling exec like this:

/(data\-i18n="[^"]+")/g.exec('<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>')

raises the following issue:

  • There are two entries in the result, but they are actually duplicates of the same match.

The result is:

[ 'data-i18n="first match"',
  'data-i18n="first match"',
  index: 20,
  input: '<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>' ]

Any ideas on how to get multiple matches for my attribute?

Thanks in advance!

m_vdbeek
  • 6
    Oh God no, guess it's time for [**TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ**](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – adeneo Dec 14 '13 at 23:08
  • 2
    `document.getElementsByTagName('a')[0].getAttribute('data-i18n')` – adeneo Dec 14 '13 at 23:09
  • And you can't have two attributes with the same name. – adeneo Dec 14 '13 at 23:12
  • I'm not client-side, so no selectors. And I'm not trying to program a tokenizer for HTML just matching one specific tag which shouldn't be very complicated. – m_vdbeek Dec 14 '13 at 23:12
  • You can't have two attributes with the same name, but you could for example have two tags with the same attribute, which would be the exact same situation. – m_vdbeek Dec 14 '13 at 23:13
  • And then you would have an issue, as you're working with strings, and that's why you need a proper DOM parser, not regex. If you're not client-side, where are you, and why isn't the question properly tagged? – adeneo Dec 14 '13 at 23:15
  • Node.js. I thought the tag wouldn't be very useful since I'm not using any Node.js-specific libraries, but I will retag the question. – m_vdbeek Dec 14 '13 at 23:17
  • 1
    The regex you have seems to work for me -> http://jsfiddle.net/GjPUH/ – adeneo Dec 14 '13 at 23:22
  • Hum thanks. For some reason, `exec()` didn't return the right results ... – m_vdbeek Dec 14 '13 at 23:24
  • With `exec` you have to loop. – elclanrs Dec 14 '13 at 23:31

1 Answer

The problem isn't with your regex; it's with how you're expecting `exec` to behave. The return value of `exec` is an array with the full match at position 0, followed by the match of each capture group. Since you wrapped the whole regex in a capturing group, you're seeing the same string at positions 0 and 1 of the array: that is one match reported twice, not two matches.
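To make the duplicate concrete, here is a small sketch that runs the pattern from the question with and without the outer parentheses. The variable names (`html`, `withGroup`, `withoutGroup`) are only illustrative, and the unnecessary `\-` escape is dropped, which does not change what the pattern matches.

```javascript
const html = '<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>';

// Same pattern as in the question: the outer parentheses form capture group 1.
const withGroup = /(data-i18n="[^"]+")/g.exec(html);
// withGroup[0] is the full match, withGroup[1] is capture group 1.
// Because the group wraps the entire pattern, both hold the same string.

// Without the parentheses there is no capture group, so the array has one entry.
const withoutGroup = /data-i18n="[^"]+"/g.exec(html);

console.log(withGroup.length, withoutGroup.length); // 2 1
console.log(withGroup[0] === withGroup[1]);         // true
console.log(withoutGroup[0]);                       // data-i18n="first match"
```

Either way, a single call to `exec` only ever reports the first match in the string.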

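The right way to use a global regex with `exec` is to keep calling `exec` on the same regex object until it returns null; each call resumes from the regex's `lastIndex` and returns the next match. However, if you use `String.prototype.match` with the global regex (`str.match(regexp)`), it will return what you expect: an array containing all of the matches. Note that with the `g` flag, `match` returns only the full matches and leaves out capture groups.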
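Both approaches can be sketched as follows. `collectWithExec` is a hypothetical helper written for this example, not a built-in, and it assumes the regex carries the `g` flag (without it, `exec` would return the first match forever).

```javascript
const html = '<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>';

// Hypothetical helper: call exec repeatedly until it returns null.
function collectWithExec(re, text) {
  if (!re.global) {
    // Without the g flag lastIndex never advances and the loop would not end.
    throw new Error('collectWithExec expects a regex with the g flag');
  }
  re.lastIndex = 0; // start from the beginning even if the regex was used before
  const found = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    found.push(m[0]); // m[0] is the full match; m.index is its position
  }
  return found;
}

const re = /data-i18n="[^"]+"/g;

const viaExec = collectWithExec(re, html);
const viaMatch = html.match(re);

console.log(viaExec);  // [ 'data-i18n="first match"', 'data-i18n="second match"' ]
console.log(viaMatch); // [ 'data-i18n="first match"', 'data-i18n="second match"' ]
```

The `exec` loop is the one to reach for when you also need each match's index or its capture groups; `match` is shorter when the matched strings are all you want.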

Aaron Dufour