I guess I'm a bit late to the party but I have written a small function to accomplish this task. It also takes care of attributes, text content and even if multiple nodes with the same node-name are siblings.
Dislaimer:
I'm not a PHP native, so please bear with simple mistakes.
function xml2js($xmlnode) {
$root = (func_num_args() > 1 ? false : true);
$jsnode = array();
if (!$root) {
if (count($xmlnode->attributes()) > 0){
$jsnode["$"] = array();
foreach($xmlnode->attributes() as $key => $value)
$jsnode["$"][$key] = (string)$value;
}
$textcontent = trim((string)$xmlnode);
if (count($textcontent) > 0)
$jsnode["_"] = $textcontent;
foreach ($xmlnode->children() as $childxmlnode) {
$childname = $childxmlnode->getName();
if (!array_key_exists($childname, $jsnode))
$jsnode[$childname] = array();
array_push($jsnode[$childname], xml2js($childxmlnode, true));
}
return $jsnode;
} else {
$nodename = $xmlnode->getName();
$jsnode[$nodename] = array();
array_push($jsnode[$nodename], xml2js($xmlnode, true));
return json_encode($jsnode);
}
}
Usage example:
$xml = simplexml_load_file("myfile.xml");
echo xml2js($xml);
Example Input (myfile.xml):
<family name="Johnson">
<child name="John" age="5">
<toy status="old">Trooper</toy>
<toy status="old">Ultrablock</toy>
<toy status="new">Bike</toy>
</child>
</family>
Example output:
{"family":[{"$":{"name":"Johnson"},"child":[{"$":{"name":"John","age":"5"},"toy":[{"$":{"status":"old"},"_":"Trooper"},{"$":{"status":"old"},"_":"Ultrablock"},{"$":{"status":"new"},"_":"Bike"}]}]}]}
Pretty printed:
{
"family" : [{
"$" : {
"name" : "Johnson"
},
"child" : [{
"$" : {
"name" : "John",
"age" : "5"
},
"toy" : [{
"$" : {
"status" : "old"
},
"_" : "Trooper"
}, {
"$" : {
"status" : "old"
},
"_" : "Ultrablock"
}, {
"$" : {
"status" : "new"
},
"_" : "Bike"
}
]
}
]
}
]
}
Quirks to keep in mind:
Several tags with the same tagname can be siblings. Other solutions will most likely drop all but the last sibling. To avoid this each and every single node, even if it only has one child, is an array which hold an object for each instance of the tagname. (See multiple "" elements in example)
Even the root element, of which only one should exist in a valid XML document is stored as array with an object of the instance, just to have a consistent data structure.
To be able to distinguish between XML node content and XML attributes each objects attributes are stored in the "$" and the content in the "_" child.
Edit:
I forgot to show the output for your example input data
{
"states" : [{
"state" : [{
"$" : {
"id" : "AL"
},
"name" : [{
"_" : "Alabama"
}
]
}, {
"$" : {
"id" : "AK"
},
"name" : [{
"_" : "Alaska"
}
]
}
]
}
]
}