You want to use one of the most complex XML parsers in PHP.
XMLReader is an XML pull parser: unlike the Document Object Model (DOM), XMLReader does not load the entire document into memory at once, but reads the file sequentially. This means that you cannot jump directly to a specific node: you have to read the XML file until you reach the desired node.
It is the natural choice when you have to parse very large XML files, but even in these cases it can be more comfortable to use it in conjunction with a DOM parser, because XMLReader is quite laborious to develop with.
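To make the "sequential read" idea concrete, here is a minimal sketch that records every node XMLReader visits. The one-line XML string is a hypothetical sample of mine, not your document.

```php
<?php
/* Hypothetical one-line sample, used only to show the reading order: */
$reader = new XMLReader();
$reader->XML( '<items><item>Android</item></items>' );

$trace = array();
while( $reader->read() )                       # One node per read() call
{
    if( $reader->nodeType == XMLReader::ELEMENT )
    {
        $trace[] = 'open ' . $reader->name;    # Opening tag
    }
    elseif( $reader->nodeType == XMLReader::TEXT )
    {
        $trace[] = 'text ' . $reader->value;   # Text content
    }
    elseif( $reader->nodeType == XMLReader::END_ELEMENT )
    {
        $trace[] = 'close ' . $reader->name;   # Closing tag
    }
}
$reader->close();

print_r( $trace );
/* open items, open item, text Android, close item, close items */
```

Opening and closing tags are reported as separate nodes with the same name, which is why XMLReader code usually has to check the node type as well as the node name.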
In the following examples I will show you how to obtain your array using pure XMLReader, compared with the two most widely used parsers: SimpleXML and DOMDocument. SimpleXML is probably the most used, and it is generally considered more performant than DOMDocument. DOMDocument, on the other hand, is more powerful and customizable. Both SimpleXML and DOMDocument support a (limited) XPath query system, which simplifies navigation inside the XML.
Here you can find a detailed comparison of almost all available XML/HTML parsers.
Please note carefully:
Stack Overflow is not intended as a free copy-and-paste code service. The following examples are intended to illustrate how the different parsers behave. I have tested each example with this XML sample based on your example, but presumably your document structure is more complex, so you will have to adapt the code to it. This is especially true if the same node names appear at different positions in the tree.
In all of the following examples, the final $result array is this:
Array
(
[0] => Android
[1] => Android, iOS, Windows
)
Using XMLReader:
$xml = new XMLReader();                  # Init XMLReader
$xml->open( "file://$filePath" );        # Open XML File
$result = array();                       # Init result array
while( $xml->read() )                    # Main read loop
{
    /* If current node is an opening <item> tag, start analyzing it: */
    if( $xml->nodeType == XMLReader::ELEMENT && $xml->name == 'item' )
    {
        /* Create additional XMLReader for item node: */
        $node = new XMLReader();
        /* Load item XML: */
        $node->xml( $xml->readOuterXML() );
        /* Init $platforms array: */
        $platforms = array();
        while( $node->read() )
        {
            /* Continue reading until platformType is found: */
            while( $node->read() && $node->name !== 'platformType' );
            /* Add platformType value to $platforms array: */
            if( $node->readInnerXML() ) $platforms[] = $node->readInnerXML();
            /* Continue reading until platformType closing tag is found: */
            while( $node->read() && $node->name !== 'platformType' );
        }
        $node->close();                  # Release the inner reader
        /* Add imploded $platforms to $result: */
        $result[] = implode( ', ', $platforms );
        /* Continue reading until item closing tag is found: */
        while( $xml->read() && $xml->name !== 'item' );
    }
}
$xml->close();                           # Release the main reader
As you can see, to construct a very simple array we have to write a lot of code. Consider that this is only an example: in the real world, you need to refine the code above with additional checks (for example on node types and on empty or missing elements) to avoid unexpected behavior such as an infinite loop. As mentioned before, you can use XMLReader in conjunction with a DOM parser. In this example, it can be a good idea to replace $node = new XMLReader() with simplexml_load_string( $xml->readOuterXML() ). Here you can find a detailed example of using XMLReader with SimpleXML.
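A minimal sketch of that combined approach could look like the following. The inline XML string is a hypothetical stand-in for your file (I am assuming an <item> / <platform> / <platformType> nesting), and XMLReader::next() is my choice for jumping from one <item> to its next sibling; treat it as one possible way to do it, not the only one.

```php
<?php
/* Hypothetical sample: the real document structure is an assumption. */
$xmlString = <<<XML
<items>
    <item>
        <platform><platformType>Android</platformType></platform>
    </item>
    <item>
        <platform><platformType>Android</platformType></platform>
        <platform><platformType>iOS</platformType></platform>
        <platform><platformType>Windows</platformType></platform>
    </item>
</items>
XML;

$reader = new XMLReader();
$reader->XML( $xmlString );              # For a file: $reader->open( $filePath )
$result = array();

/* Read forward until the first <item> is reached: */
while( $reader->read() && $reader->name !== 'item' );

/* Handle one <item> at a time, so only one item is held in memory: */
while( $reader->nodeType == XMLReader::ELEMENT && $reader->name === 'item' )
{
    /* Let SimpleXML parse just this item: */
    $item = simplexml_load_string( $reader->readOuterXml() );
    $platforms = array();
    foreach( $item->platform as $platform )
    {
        $platforms[] = trim( (string) $platform->platformType );
    }
    $result[] = implode( ', ', $platforms );
    /* Skip the item subtree and move to the next sibling <item>, if any: */
    $reader->next( 'item' );
}
$reader->close();

print_r( $result );
```

XMLReader only does the streaming here, while the per-item logic uses the more readable SimpleXML syntax. With the sample above, $result matches the array shown at the top of this answer.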
Using SimpleXML:
$xml = simplexml_load_file( $filePath ); # Load XML File into SimpleXML Object
$result = array();                       # Init result array
/* Process each <item> node: */
foreach( $xml->xpath( '//item' ) as $node )
{
    $platforms = array();
    /* Process each <platform> node: */
    foreach( $node->platform as $platform )
    {
        /* Add platformType value to $platforms array: */
        $platforms[] = $platform->platformType[0]->__toString();
    }
    $result[] = implode( ', ', $platforms );
}
In this example, as in the next one, I have omitted the comments on lines identical to the XMLReader example.
I use ->xpath to select the <item> nodes: it is not necessary with the actual XML sample, because the <item> nodes are direct children of the root (we could select them with $xml->item), but this XPath pattern also works if the <item> nodes are nested more deeply. The // at the start of the pattern means “find the following pattern no matter where it is”.
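The difference between the two selections is easier to see on a small document where one <item> is nested. The XML below and its id attributes are hypothetical and serve only as an illustration.

```php
<?php
/* Hypothetical sample: one <item> under the root, one nested in <group>. */
$xml = simplexml_load_string(
    '<root><item id="a"/><group><item id="b"/></group></root>'
);

/* ->item only sees <item> nodes that are direct children of the root: */
$direct = array();
foreach( $xml->item as $node )
{
    $direct[] = (string) $node['id'];
}

/* //item finds <item> nodes at any depth: */
$anywhere = array();
foreach( $xml->xpath( '//item' ) as $node )
{
    $anywhere[] = (string) $node['id'];
}

echo implode( ', ', $direct ), "\n";     # a
echo implode( ', ', $anywhere ), "\n";   # a, b
```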
The syntax is simpler than XMLReader's: you can go directly to the desired node(s) with the ->nodeTag syntax or with an XPath expression. Note that both -> and XPath return a set of nodes rather than a single one (->xpath returns an array; -> returns an object that can be indexed like one), so to refer to one node you have to use array syntax (->platformType[0]). SimpleXML always returns SimpleXMLElement objects, so to use one as a string you have to cast it with ->__toString() or with $string = (string) $platform->platformType[0]. In the example you could omit the cast, because implode() casts objects to strings for you.
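The following sketch shows the access and casting rules side by side. The XML fragment is hypothetical, and the variable names are mine.

```php
<?php
/* Hypothetical fragment with two <platform> nodes: */
$item = simplexml_load_string(
    '<item>'
  . '<platform><platformType>Android</platformType></platform>'
  . '<platform><platformType>iOS</platformType></platform>'
  . '</item>'
);

/* Array syntax selects one node from the set; the result is still an object: */
$first    = $item->platform[0]->platformType[0];
$isObject = $first instanceof SimpleXMLElement;

/* Two equivalent ways to get a real string: */
$byCast   = (string) $first;
$byMethod = $first->__toString();

/* ->xpath() returns a plain PHP array of SimpleXMLElement objects: */
$found   = $item->xpath( '//platformType' );
$isArray = is_array( $found );

/* implode() converts each object to a string by itself: */
$joined = implode( ', ', $found );

var_dump( $isObject, $isArray );         # bool(true) bool(true)
echo $byCast, ' / ', $joined, "\n";      # Android / Android, iOS
```

An explicit cast matters as soon as the value is compared with === or used as an array key, because without it you are still holding a SimpleXMLElement.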
Using DOMDocument:
$dom = new DOMDocument();                # Init DOMDocument Object
$dom->load( $filePath );                 # Load XML File into DOMDocument Object
$xpath = new DOMXPath( $dom );           # Init DOMXPath Object
$result = array();
foreach( $xpath->query( '//item' ) as $node )
{
    $platforms = array();
    /* Process each <platformType> node: */
    foreach( $node->getElementsByTagName( 'platformType' ) as $platformType )
    {
        /* Add platformType value to $platforms array: */
        $platforms[] = trim( $platformType->nodeValue );
    }
    $result[] = implode( ', ', $platforms );
}
After reading the first examples, this last one is self-explanatory. Note the different syntax for selecting nodes: ->query and ->getElementsByTagName do not return an array but a DOMNodeList object, and you can refer to each node in the set with the ->item(0) syntax. Also, you can retrieve a node's value using ->nodeValue, without casting it.
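As a small illustration of the DOMNodeList behavior, here is a sketch on a hypothetical fragment (the padding around "Android" is deliberate, to show why the example above uses trim()):

```php
<?php
/* Hypothetical fragment, loaded from a string instead of a file: */
$dom = new DOMDocument();
$dom->loadXML(
    '<item>'
  . '<platform><platformType> Android </platformType></platform>'
  . '<platform><platformType>iOS</platformType></platform>'
  . '</item>'
);

/* getElementsByTagName() returns a DOMNodeList, not an array: */
$list   = $dom->getElementsByTagName( 'platformType' );
$isList = $list instanceof DOMNodeList;
$count  = $list->length;                 # Number of matching nodes

/* ->item( n ) picks one node; ->nodeValue is already a string: */
$first  = trim( $list->item( 0 )->nodeValue );
$second = $list->item( 1 )->nodeValue;

/* DOMXPath::query() returns a DOMNodeList as well: */
$xpath      = new DOMXPath( $dom );
$queryCount = $xpath->query( '//platformType' )->length;

echo $count, ': ', $first, ', ', $second, "\n";   # 2: Android, iOS
```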
Now you can choose your preferred parser.