
I am trying to echo the contents of a 4 MB XML file, but the file is not well-formed.

My PHP:

<?php

$str = file_get_contents('log.xml');
$xml = simplexml_load_string($str);
$c = 0;
foreach($xml->uR->TS->Location as $loc) 
{
    echo (++$c) . "\n";
    echo (string)$loc->attributes()['tpl'] . "\n";
    echo (string)$loc->arr->attributes()['at'] . "\n";
    echo (string)$loc->dep->attributes()['et'] . "\n\n";
    echo (string)$loc->plat . "\n\n\n";
}
?>

It worked perfectly with the sample XML file:

<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-01-01T21:58:48.2213864Z" version="12.0">
    <uR updateOrigin="Trust">
        <TS>
            <Location pta="21:59" ptd="21:59" tpl="ROBY" wta="21:59" wtd="21:59:30">
                <arr at="21:59" src="TRUST" srcInst="Auto" />
                <dep et="21:59" src="Darwin" />
                <plat conf="true" platsrc="A">4</plat>
            </Location>
            <Location pta="22:06" ptd="22:06" tpl="PRESCOT" wta="22:05:30" wtd="22:06">
                <arr et="22:06" src="Darwin" wet="22:05" />
                <dep et="22:06" src="Darwin" />
                <plat>1</plat>
            </Location>
        </TS>
    </uR>
</Pport>

But the original XML file contains multiple XML declarations (headers), one per document, as shown below:

<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-01-03T01:31:28.3036616Z" version="12.0">
    <uR requestID="AM02050384" requestSource="AM02" updateOrigin="CIS">
        <TS rid="201801037171519" ssd="2018-01-03" uid="G71519">
            <ns3:Location tpl="GLYNDE" wtp="01:25:08">
                <ns3:pass at="01:31" src="TD" />
                <ns3:plat conf="true" platsrc="A" platsup="true">2</ns3:plat>
                <ns3:length>8</ns3:length>
            </ns3:Location>
        </TS>
    </uR>
</Pport>
<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-01-03T01:31:29.1772672Z" version="12.0">
    <uR requestID="0000000000046386" requestSource="at21" updateOrigin="CIS">
        <TS rid="201801038706030" ssd="2018-01-03" uid="W06030">
            <ns3:Location pta="01:25" ptd="01:26" tpl="DARTFD" wta="01:25" wtd="01:26">
                <ns3:arr at="01:31" src="TD" />
                <ns3:dep et="01:32" etmin="01:27" src="Darwin" />
                <ns3:plat conf="true" platsrc="A">4</ns3:plat>
            </ns3:Location>
        </TS>
    </uR>
</Pport>
<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-01-03T01:31:30.1912737Z" version="12.0">
    <uR updateOrigin="TD">
        <TS rid="201801027160109" ssd="2018-01-02" uid="G60109">
            <ns3:Location tpl="BRINKLW" wtp="01:34:30">
                <ns3:pass at="01:31" src="TD" /></ns3:Location>
        </TS>
    </uR>
</Pport>
<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-01-03T01:31:31.2052802Z" version="12.0">
    <uR updateOrigin="TD">
        <TS rid="201801036763188" ssd="2018-01-03" uid="C63188">
            <ns3:Location tpl="AMBERGJ" wtp="02:04:30">
                <ns3:pass et="01:38" src="TD" /></ns3:Location>
        </TS>
    </uR>
</Pport>

Is there a way to:

  1. Ignore the multiple headers to avoid errors.
  2. Echo all the data inside.
  3. Output the echoed values into a Bootstrap table.

Cheers!

Haitham
  • Replace the extra headers with nothing, wrap the whole string in a single root element, and move one header to the start of the string. – splash58 Jan 06 '18 at 13:36
  • If this is a file provided by someone else, can you ask them to provide proper XML rather than a bunch of XML documents in one file? If it's a one-off file, then I would suggest manually editing it to make it valid and working from there. – Nigel Ren Jan 07 '18 at 08:59

1 Answer


There are several approaches.

First, you can always output the file directly, without loading it as an XML document:

$handle = fopen('log.xml', 'r') or die("Couldn't get handle");
if ($handle) {
    while (!feof($handle)) {
        echo fgets($handle, 4096);
    }
    fclose($handle);
}

Second, you can modify the original file so that SimpleXML can handle it. This is the approach advised by @splash58:

replace extra headers with nothing, wrap the string in a single root element, and move one header to the start of the string

To do so, you could use preg_replace and str_replace. This can become tedious in the long run, though: whenever a new kind of malformed input appears, you have to add another replace statement to make the code work again.
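As a minimal sketch of this second approach, the code below strips every XML declaration, wraps what remains in a single root element, and then reads each Location. The helper names (wrapConcatenatedXml, childAttr, extractLocations, renderTable) and the `<root>` wrapper are made up for illustration, and the inline sample is a shortened copy of the feed from the question. Note that in the real feed the Location elements carry the ns3 prefix, so they have to be read through that namespace; the unprefixed `$xml->uR->TS->Location` from the question would not find them. The table markup assumes Bootstrap's `table` class and nothing more.

```php
<?php
// Namespace URI bound to the ns3 prefix in the question's feed.
const FORECAST_NS = 'http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2';

// Hypothetical helper: drop every XML declaration, then wrap the
// concatenated Pport documents in one made-up root element.
function wrapConcatenatedXml(string $raw): string
{
    $body = preg_replace('/<\?xml[^?]*\?>/', '', $raw);
    return '<?xml version="1.0" encoding="UTF-8"?><root>' . $body . '</root>';
}

// Hypothetical helper: read an attribute of a child element,
// or return '' when that child (for example arr or dep) is absent.
function childAttr(SimpleXMLElement $children, string $child, string $attr): string
{
    if (!isset($children->$child)) {
        return '';
    }
    return (string) $children->$child->attributes()[$attr];
}

// Returns one row per ns3:Location element found anywhere in the input.
function extractLocations(string $raw): array
{
    $root = simplexml_load_string(wrapConcatenatedXml($raw));
    if ($root === false) {
        return [];
    }
    $root->registerXPathNamespace('f', FORECAST_NS);
    $rows = [];
    foreach ($root->xpath('//f:Location') as $loc) {
        $kids = $loc->children(FORECAST_NS);
        $rows[] = [
            'tpl'  => (string) $loc->attributes()['tpl'],
            'arr'  => childAttr($kids, 'arr', 'at'),
            'dep'  => childAttr($kids, 'dep', 'et'),
            'plat' => (string) $kids->plat,
        ];
    }
    return $rows;
}

// Hypothetical helper: render the rows as an HTML table with Bootstrap's "table" class.
function renderTable(array $rows): string
{
    $html = '<table class="table"><thead><tr><th>tpl</th><th>arr</th>'
          . '<th>dep</th><th>plat</th></tr></thead><tbody>';
    foreach ($rows as $row) {
        $cells = array_map('htmlspecialchars', $row);
        $html .= '<tr><td>' . implode('</td><td>', $cells) . '</td></tr>';
    }
    return $html . '</tbody></table>';
}

// Shortened sample with two concatenated documents, as in the question.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2"><uR><TS><ns3:Location tpl="DARTFD"><ns3:arr at="01:31" src="TD" /><ns3:dep et="01:32" src="Darwin" /><ns3:plat>4</ns3:plat></ns3:Location></TS></uR></Pport>
<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2"><uR><TS><ns3:Location tpl="GLYNDE"><ns3:pass at="01:31" src="TD" /><ns3:plat>2</ns3:plat></ns3:Location></TS></uR></Pport>
XML;

echo renderTable(extractLocations($sample)), "\n";
```

For the real file, `extractLocations(file_get_contents('log.xml'))` would take the place of the inline sample. The sample file from the question uses unprefixed Location elements, so this sketch would need a second query to cover those as well.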

Third, get well-formed XML in the first place. If the XML you are trying to parse comes from another service, that service may be able to provide well-formed XML. Just ask them.

Fourth, use a tolerant markup parser. See DOMDocument::$recover and libxml_use_internal_errors(true), as given in the answer to: How to parse invalid (bad / not well-formed) XML?
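A minimal sketch of that fourth approach, assuming a tiny made-up input with two concatenated documents: it turns on internal error collection, enables recovery mode, and prints whatever libxml reports. How much of the input survives recovery depends on libxml, and it may well keep only the first document, so the sketch reports the count rather than assuming it.

```php
<?php
// Collect libxml errors in memory instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Made-up input: two documents concatenated, as in the question's file.
$broken = '<?xml version="1.0"?><a><b>1</b></a>'
        . '<?xml version="1.0"?><a><b>2</b></a>';

$dom = new DOMDocument();
$dom->recover = true; // ask libxml to carry on past errors where it can
$dom->loadXML($broken);

// Fetch the collected errors, then clear libxml's internal buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}

// Report how many <b> elements made it into the recovered document.
echo $dom->getElementsByTagName('b')->length, " <b> element(s) recovered\n";
```

If the recovered count is lower than expected, recovery mode alone is not enough for this file and the wrapping approach above is the safer choice.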

Cedric