
I have an XML file and I want to parse the author list it contains.

        <AuthorList CompleteYN="Y">
            <Author ValidYN="Y">
                <LastName>Goyette-Desjardins</LastName>
                <ForeName>Guillaume</ForeName>
                <Initials>G</Initials>
                <AffiliationInfo>
                    <Affiliation>University 1.</Affiliation>
                </AffiliationInfo>
            </Author>
            <Author ValidYN="Y">
                <LastName>Auger</LastName>
                <ForeName>Jean-Philippe</ForeName>
                <Initials>JP</Initials>
                <AffiliationInfo>
                    <Affiliation>University 2</Affiliation>
                </AffiliationInfo>
            </Author>
            <Author ValidYN="Y">
                <LastName>Xu</LastName>
                <ForeName>Jianguo</ForeName>
                <Initials>J</Initials>
                <AffiliationInfo>
                    <Affiliation>University 3</Affiliation>
                </AffiliationInfo>
            </Author>
        </AuthorList>

I used this code to get the author names (last name plus initials), but it only gives me the last author (Xu J).

$api_xml_url = "https://example.com&pid=3123133213&retmode=xml";
$xml = file_get_contents($api_xml_url);
preg_match_all("'<LastName>(.*?)</LastName>'si", $xml, $match);
foreach ($match[1] as $LastName) {
    $LastName = strip_tags($LastName);
}
preg_match_all("'<Initials>(.*?)</Initials>'si", $xml, $match);
foreach ($match[1] as $Initials) {
    $Initials = strip_tags($Initials);
}
$authors = $LastName.$Initials;

How can I get the full author list (Goyette-Desjardins G; Auger JP; Xu J)? Thank you very much.

  • No, to parse an xml file you don't have to use regex or any direct string approach. Use XMLReader or DOMDocument. The file is already structured, use the structure. – Casimir et Hippolyte Mar 25 '18 at 03:12
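
As a rough illustration of the DOMDocument route suggested in the comment above, here is a minimal sketch. It assumes a trimmed copy of the question's sample XML is inlined in place of the API response. The variable names ($doc, $authors, $authorList) are illustrative and not taken from the thread.

```php
<?php
// Trimmed copy of the sample XML from the question (affiliations omitted).
// In practice this string would come from file_get_contents($api_xml_url).
$string = <<<XML
<AuthorList CompleteYN="Y">
    <Author ValidYN="Y">
        <LastName>Goyette-Desjardins</LastName>
        <ForeName>Guillaume</ForeName>
        <Initials>G</Initials>
    </Author>
    <Author ValidYN="Y">
        <LastName>Auger</LastName>
        <ForeName>Jean-Philippe</ForeName>
        <Initials>JP</Initials>
    </Author>
    <Author ValidYN="Y">
        <LastName>Xu</LastName>
        <ForeName>Jianguo</ForeName>
        <Initials>J</Initials>
    </Author>
</AuthorList>
XML;

$doc = new DOMDocument();
if (!$doc->loadXML($string)) {
    exit('Could not parse the XML.');
}

$authors = [];
// getElementsByTagName() matches <Author> elements at any depth in the document.
foreach ($doc->getElementsByTagName('Author') as $author) {
    // item(0) returns null when the child element is missing, so guard against that.
    $lastName = $author->getElementsByTagName('LastName')->item(0);
    $initials = $author->getElementsByTagName('Initials')->item(0);
    $authors[] = trim(
        ($lastName ? $lastName->textContent : '') . ' ' .
        ($initials ? $initials->textContent : '')
    );
}

// Join once, after the loop, so every author is kept rather than only the last one.
$authorList = implode('; ', $authors);
echo $authorList . PHP_EOL;
```

Collecting the names into an array inside the loop and joining them afterwards is what avoids the "only the last author" result: the loop in the question overwrites $LastName and $Initials on every iteration.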

1 Answer


You should not use regex to parse XML. Instead, use something like simplexml_load_string().

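<?php
$api_xml_url = "https://example.com&pid=3123133213&retmode=xml";
$string = file_get_contents($api_xml_url);

// Returns a SimpleXMLElement for the root node, or false if the XML is malformed.
$xml = simplexml_load_string($string);
if ($xml === false) {
    exit('Could not parse the XML response.');
}

// Iterating the root (<AuthorList> in the sample) yields each child <Author>.
foreach ($xml as $item) {
    echo $item->ForeName.' '.$item->LastName.' ('.$item->Initials.')'.PHP_EOL;
}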

https://3v4l.org/FeTsa

Result:

Guillaume Goyette-Desjardins (G)
Jean-Philippe Auger (JP)
Jianguo Xu (J)
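
To get the exact format the question asks for (last name plus initials, joined with "; "), the same SimpleXML approach can collect each author into an array and implode it at the end. This is a minimal sketch, assuming a trimmed copy of the sample XML is inlined in place of the API response. The names $authors and $authorList and the use of xpath('//Author') are illustrative choices and were not part of the snippet tested on 3v4l.

```php
<?php
// Trimmed copy of the sample XML from the question (affiliations omitted).
// In practice this string would come from file_get_contents($api_xml_url).
$string = <<<XML
<AuthorList CompleteYN="Y">
    <Author ValidYN="Y">
        <LastName>Goyette-Desjardins</LastName>
        <ForeName>Guillaume</ForeName>
        <Initials>G</Initials>
    </Author>
    <Author ValidYN="Y">
        <LastName>Auger</LastName>
        <ForeName>Jean-Philippe</ForeName>
        <Initials>JP</Initials>
    </Author>
    <Author ValidYN="Y">
        <LastName>Xu</LastName>
        <ForeName>Jianguo</ForeName>
        <Initials>J</Initials>
    </Author>
</AuthorList>
XML;

$xml = simplexml_load_string($string);
if ($xml === false) {
    exit('Could not parse the XML.');
}

$authors = [];
// '//Author' selects <Author> elements at any depth, so this should also work
// when <AuthorList> is nested inside a larger document (assuming the document
// does not declare a default namespace).
foreach ($xml->xpath('//Author') as $author) {
    // Concatenation casts each SimpleXMLElement to its text content.
    $authors[] = trim($author->LastName . ' ' . $author->Initials);
}

$authorList = implode('; ', $authors);
echo $authorList . PHP_EOL;
```

With the sample data this prints: Goyette-Desjardins G; Auger JP; Xu J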
Lawrence Cherone