
I have this string: <own:egna attribute1="1" attribute2="2">test</own:egna>
I want to catch all attributes with a regexp.

This regexp matches one attribute: (\s+attribute\d=['"][^'"]+['"])
But why does appending a +, as in `(\s+attribute\d=['"][^'"]+['"])+`, return only the last matched attribute and not all of them?

How would you change this to return all attributes in separate groups? This pattern is part of a larger regexp, so functions such as Python's findall and its equivalents won't do.

baloo

1 Answer


The short answer is that you can't: when a group is repeated, only its last match is accessible. The Python docs state this explicitly:

If a group matches multiple times, only the last match is accessible [...]
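
To see this behaviour on the string from the question, here is a minimal sketch using Python's re module. The variable names are mine; the pattern is the one from the question with the + appended.

```python
import re

s = '<own:egna attribute1="1" attribute2="2">test</own:egna>'

# The group is repeated twice, but group(1) keeps only the final repetition.
m = re.search(r'''(\s+attribute\d=['"][^'"]+['"])+''', s)
last = m.group(1)
whole = m.group(0)

print(repr(last))   # ' attribute2="2"'
print(repr(whole))  # ' attribute1="1" attribute2="2"'
```

The overall match (group 0) still spans both attributes; it is only the numbered group that gets overwritten on each repetition.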

You'll have to fall back on features of the host language rather than the regex itself:

  1. In PHP, preg_match_all returns all matches.
  2. In other languages, you'll have to do this manually: add the g modifier to the regex and loop over it. Perl, for example, keeps track of the string position and, in scalar context, returns the next match in $1 each time a /([...])/g pattern is matched.
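
Since the question says the attribute pattern sits inside a larger regexp, one way to do the "loop manually" step in Python is a two-pass approach: let the larger pattern capture the whole run of attributes as a single group, then iterate over that captured text. This is only a sketch; the outer pattern below is a hypothetical stand-in for whatever the real surrounding regexp is, and ATTR, outer and attrs are names I made up.

```python
import re

s = '<own:egna attribute1="1" attribute2="2">test</own:egna>'

# Single-attribute pattern from the question (without the capturing parentheses).
ATTR = r'''\s+attribute\d=['"][^'"]+['"]'''

# Pass 1: a hypothetical larger pattern that captures the whole
# run of attributes as one group instead of one group per attribute.
outer = re.compile(r'<own:egna((?:' + ATTR + r')+)>')
m = outer.search(s)

# Pass 2: loop over the captured run to pull out each attribute separately.
attrs = [a.group(0).strip() for a in re.finditer(ATTR, m.group(1))]

print(attrs)  # ['attribute1="1"', 'attribute2="2"']
```

If installing a package is an option, the third-party regex module (not the standard re) also keeps every repetition of a group and exposes them through a captures() method on the match object.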

Also take a look at Capturing a repeated group.

MvanGeest