Python re module: how to extract all matching groups?

Question

I'm confused about something in the re module.
Suppose I have the following text:

<grp>    
  <i>i1</i>    
  <i>i2</i>    
  <i>i3</i>    
  ...    
</grp>

I use the following regular expression to extract part of the text:

>>> t = "<grp>      <i>i1</i>      <i>i2</i>      <i>i3</i>      ...    </grp>"
>>> import re
>>> re.match("<grp>.*(<i>.*?</i>).*</grp>", t).group(1)
'<i>i3</i>'
>>>

I only get the last matching item.

My question is: how can I extract all the matching items using only regular expressions? For example, extract i1, i2, i3 into a list: ['i1', 'i2', 'i3']

Thanks a lot!

Why can't you use two regular expressions for this specific case? There's not much point in having regular expressions that are too large to handle for yourself unless you need them for performance. Anyway, obligatory reading: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 — Qantas 94 Heavy, Jul 02 '14 at 02:48
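For what it's worth, the two-expression approach suggested in the comment could look roughly like the sketch below. The single pattern in the question returns only '<i>i3</i>' because the leading greedy `.*` first consumes as much as possible and then backtracks just far enough for the rest to match, which happens at the last <i> element; the group itself matches only once. The variable names `grp` and `items` are made up for illustration, and this assumes the input is as simple and regular as the sample (no nested or attribute-bearing tags).

```python
import re

t = "<grp>      <i>i1</i>      <i>i2</i>      <i>i3</i>      ...    </grp>"

# Step 1: isolate the body of the <grp> element (non-greedy, may span lines).
grp = re.search(r"<grp>(.*?)</grp>", t, re.DOTALL)

# Step 2: collect the text of every <i> element inside that body.
items = re.findall(r"<i>(.*?)</i>", grp.group(1), re.DOTALL) if grp else []

print(items)  # ['i1', 'i2', 'i3']
```

Splitting the work this way keeps each pattern short, and the second step only ever sees text that was inside <grp>.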

score 2 · Answer 1 · answered Jul 02 '14 at 02:47


You can easily do that using re.findall():

import re
result = re.findall("<i>.*?</i>", t)

>>> print(result)
['<i>i1</i>', '<i>i2</i>', '<i>i3</i>']
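If only the text inside the tags is wanted (the ['i1', 'i2', 'i3'] list from the question), a small variation should do it: when the pattern contains exactly one capturing group, re.findall() returns the group contents instead of the whole match. A minimal sketch, assuming the same `t` as in the question (`texts` is just an illustrative name):

```python
import re

t = "<grp>      <i>i1</i>      <i>i2</i>      <i>i3</i>      ...    </grp>"

# One capturing group: findall returns what the group matched,
# not the full '<i>...</i>' string.
texts = re.findall(r"<i>(.*?)</i>", t)

print(texts)  # ['i1', 'i2', 'i3']
```

If the match positions are also needed, re.finditer() with the same pattern yields match objects, and `m.group(1)` gives the same text.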


sshashank124


score 2 · Answer 2 · answered Jul 02 '14 at 02:48


Why not use an XML parser, like xml.etree.ElementTree from the Python standard library:

import xml.etree.ElementTree as ET

data = """
<grp>
  <i>i1</i>
  <i>i2</i>
  <i>i3</i>
</grp>
"""

tree = ET.fromstring(data)
results = tree.findall('.//i')
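# encoding="unicode" makes tostring() return str rather than bytes on Python 3
print([ET.tostring(el, encoding="unicode").strip() for el in results])
print([el.text for el in results])  # if you need just the text inside the tags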

Prints:

['<i>i1</i>', '<i>i2</i>', '<i>i3</i>']
['i1', 'i2', 'i3']


alecxe


@hwnd thanks, this is what I feel about it too - using specialized tools for specialized tasks, batteries are there, just import them :) – alecxe Jul 02 '14 at 03:10
