PHP RegEx match XML pattern multiple times inside pattern

Question

I have an XML document from which I want to extract some data:

<tnt:results>
<tnt:result>
<Document id="id1">
<impact _blabla_ for="tree.def" name="Something has changed"
select="moreblabla">true</impact>
<impact _blabla_ for="plant.def" name="Something else has changed"
select="moreblabla">true</impact>
</Document>
</tnt:result>
</tnt:results>

In reality there are no newlines; it is one continuous string, and there can be multiple <Document> elements. I want a regular expression that extracts:

id1
tree.def / plant.def
Something has changed / Something else has changed

I was able to come up with this code so far, but it only matches the first impact, rather than both of them:

preg_match_all('/<Document id="(.*)">(<impact.*for="(.*)".*name="(.*)".*<\/impact>)*<\/Document>/U', $response, $matches);

The other way would be to match everything inside the Document element and pass it through a second RegEx, but I thought I could do this with only one RegEx.
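For comparison, here is a rough sketch of that two-pass approach. A repeated capturing group in PCRE keeps only the text of its last iteration, so a single pattern cannot hand back every impact on its own. The sample string is a hypothetical stand-in for $response with the _blabla_ placeholders left out, and the second pattern assumes that "for" always comes before "name" and that attribute values never contain ">" or a double quote.

```php
<?php
// Hypothetical sample input: the XML from above as one continuous string,
// with the _blabla_ placeholder attributes omitted.
$response = '<tnt:results><tnt:result><Document id="id1">'
    . '<impact for="tree.def" name="Something has changed" select="moreblabla">true</impact>'
    . '<impact for="plant.def" name="Something else has changed" select="moreblabla">true</impact>'
    . '</Document></tnt:result></tnt:results>';

$results = [];

// Pass 1: capture each Document id together with everything inside the element.
preg_match_all(
    '/<Document id="([^"]*)">(.*?)<\/Document>/s',
    $response,
    $documents,
    PREG_SET_ORDER
);

foreach ($documents as $document) {
    // Pass 2: capture the for/name attributes of every impact in this Document.
    // Assumes "for" appears before "name" inside the tag.
    preg_match_all(
        '/<impact\b[^>]*?\bfor="([^"]*)"[^>]*?\bname="([^"]*)"/',
        $document[2],
        $impacts,
        PREG_SET_ORDER
    );

    foreach ($impacts as $impact) {
        $results[$document[1]][] = ['for' => $impact[1], 'name' => $impact[2]];
    }
}

print_r($results);
```

$results is keyed by Document id, with one for/name pair per impact element found inside that Document.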

Thanks a lot in advance!

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags Everyone gets it once; I certainly have. — Dan Lugg, Jun 11 '11 at 02:56

score 1 · Accepted Answer · answered Jun 11 '11 at 03:10

Just use DOM, it's easy enough:

$dom = new DOMDocument;
$dom->loadXML($xml_string);

$documents = $dom->getElementsByTagName('Document');
foreach ($documents as $document) {
    echo $document->getAttribute('id');     // id1

    $impacts = $document->getElementsByTagName('impact');
    foreach ($impacts as $impact) {
        echo $impact->getAttribute('for');  // tree.def
        echo $impact->getAttribute('name'); // Something has changed
    }
}

Yes, I was already writing up the code that seems very similar to yours... Thanks :) — cdavid, Jun 11 '11 at 03:25

score 0 · Answer 2 · answered Jun 11 '11 at 02:54

Don't use RegEx. Use an XML parser.

Really, if you have to worry about multiple Document elements and extracting all sorts of attributes, you're much better off using an XML parser or a query language like XPath.
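As a minimal sketch of the XPath route, assuming PHP's DOM extension is available: the sample string below is hypothetical, with the _blabla_ placeholders left out and a made-up namespace URI declared for the tnt prefix so that the input is well-formed.

```php
<?php
// Hypothetical sample input. The xmlns:tnt URI is invented for illustration;
// the real document would declare its own.
$xml_string = '<tnt:results xmlns:tnt="http://example.com/tnt"><tnt:result>'
    . '<Document id="id1">'
    . '<impact for="tree.def" name="Something has changed" select="moreblabla">true</impact>'
    . '<impact for="plant.def" name="Something else has changed" select="moreblabla">true</impact>'
    . '</Document></tnt:result></tnt:results>';

$dom = new DOMDocument;
$dom->loadXML($xml_string);
$xpath = new DOMXPath($dom);

$results = [];

// Select every Document element anywhere in the tree.
foreach ($xpath->query('//Document') as $document) {
    $id = $document->getAttribute('id');

    // Relative query: only impact elements that are direct children of this Document.
    foreach ($xpath->query('impact', $document) as $impact) {
        $results[$id][] = [
            'for'  => $impact->getAttribute('for'),
            'name' => $impact->getAttribute('name'),
        ];
    }
}

print_r($results);
```

Document and impact carry no namespace prefix in the sample, so the queries need no namespace registration; selecting the tnt-prefixed elements would require DOMXPath::registerNamespace first.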

Richard JP Le Guen
