I have to parse many XML documents like this:
<doc id=lk-20130223040102_592>
<meta-info>
<tag name="date">2013-02-22</tag>
<tag name="source-encoding">ISO-8859-1</tag>
</meta-info>
<text><SE><E type="E:PERSON">Tom Taylor</E>, who runs <E type="E:ORGANIZATION:CORPORATION">MF&B Marine Warehouse</E> in <E type="E:LOCATION:OTHER">Hampton Roads</E>, is already watching contracts with the <E type="E:ORGANIZATION:GOVERNMENT">Navy</E> <E type="E:PER_DESC">dry</E> up at his small ship-repair <E type="E:ORG_DESC:CORPORATION">business</E>.</SE>
</text></doc>
<doc ...</doc>
I made a simple script to parse one of these:
<?php
$xml = simplexml_load_file('wp7-lk-20130223040102.xml');
foreach ($xml->doc as $doc) {
    echo $doc['id'];
    echo "<br>";
}
?>
but it produces a series of warnings like this:
Warning: simplexml_load_file(): ^ in C:\wamp\www\parse_xml.php on line 6
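To see every parser message in full rather than the truncated warning above, the errors can be collected through libxml instead of being emitted as PHP warnings. This is only a minimal sketch: the helper name list_xml_errors and the inline sample string are made up for illustration, and a real run would pass the contents of the file instead.

```php
<?php
// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: returns "line N: message" strings for an XML string,
// or an empty array when the string parses cleanly.
function list_xml_errors(string $source): array
{
    libxml_clear_errors();
    $messages = [];
    if (simplexml_load_string($source) === false) {
        foreach (libxml_get_errors() as $error) {
            $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
        }
    }
    libxml_clear_errors();
    return $messages;
}

// Illustrative input with an unquoted attribute value, as in the real files.
$sample = '<doc id=lk-1><text>sample</text></doc>';
foreach (list_xml_errors($sample) as $message) {
    echo $message, "<br>";
}
```

For the real data, `$sample` would be replaced by `file_get_contents('wp7-lk-20130223040102.xml')`.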
I noticed some errors (unquoted attribute values such as id=... rather than id="...", and a missing parent element) and corrected what I could, but there are many others.
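The two corrections mentioned above (quoting the id value and adding a parent element) can be sketched roughly as follows. The helper name patch_known_errors, the regular expression, and the root element name "docs" are assumptions for illustration, and the sketch assumes the file has no XML declaration at the top.

```php
<?php
// Hypothetical helper: applies only the two manual fixes described above.
function patch_known_errors(string $raw): string
{
    // Quote unquoted id values on <doc> tags: <doc id=abc> becomes <doc id="abc">.
    $patched = preg_replace('/<doc\s+id=([^\s">]+)>/', '<doc id="$1">', $raw);
    // Wrap all documents in a single root element (the name "docs" is arbitrary).
    return "<docs>\n" . $patched . "\n</docs>";
}

// Illustrative input: two documents with unquoted ids and no common parent.
$raw = "<doc id=lk-1><text>first</text></doc>\n<doc id=lk-2><text>second</text></doc>";
$xml = simplexml_load_string(patch_known_errors($raw));
foreach ($xml->doc as $doc) {
    echo $doc['id'], "<br>";
}
```

This handles only those two cases. Other problems, such as the unescaped & in "MF&B", still make the parser fail.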
Is there a function that can automatically correct errors in the XML?