0

I have an XML file I'm trying to parse using SimpleXML.

This is my code

//loading the xml string with simplexml function
$xml = simplexml_load_file("vrv-uploads/" . $proj_id . ".xml");
 
//looping through every row
foreach ($xml->Workbook->Worksheet->Table->Row as $xml_row)
    {
    echo $xml_row->Cell[0]->Data . '</br>';
}

I've verified that the file opens correctly, but nothing is printed.

Below you can see the xml file in question (I've collapsed Rows after the first as they just repeat on the same format).

My question is: does my code fails to find the date because of those p2/p3 prefixes? If so, how do I account for those? Or am I missing something else?

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office" />
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <ProtectStucture>False</ProtectStucture>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <Styles>
    <Style ss:ID="Default" ss:Name="Normal" xmlns:p3="urn:schemas-microsoft-com:office:spreadsheet">
      <ss:Font ss:FontName="Arial" ss:Size="10" />
      <ss:Alignment ss:Vertical="Top" ss:WrapText="1" />
    </Style>
    <Style ss:ID="Percent" ss:Name="Percent" xmlns:p3="urn:schemas-microsoft-com:office:spreadsheet">
      <ss:NumberFormat ss:Format="0%" />
    </Style>
  </Styles>
  <Worksheet p2:Name="export" xmlns:p2="urn:schemas-microsoft-com:office:spreadsheet">
    <p2:Table p2:ExpandedColumnCount="3" p2:ExpandedRowCount="33" p2:FullColumns="1" p2:FullRows="1">
      <p2:Column p2:Width="140" />
      <p2:Column p2:Width="34" />
      <p2:Column p2:Width="300" />
      <p2:Row>
        <p2:Cell>
          <p2:Data p2:Type="String">Model</p2:Data>
        </p2:Cell>
        <p2:Cell>
          <p2:Data p2:Type="String">Qty</p2:Data>
        </p2:Cell>
        <p2:Cell>
          <p2:Data p2:Type="String">Description</p2:Data>
        </p2:Cell>
      </p2:Row>
</p2:Table>
</Worksheet>
</Workbook>
  • Tried to follow the various suggestions, but I can't find a way to successfully apply it to my case. Do you have any further suggestions? – matteodallombra Nov 16 '21 at 15:28
  • First, your xml is not well formed; please run it through an xml validator and clean it up. Second, what exactly is your expected output? – Jack Fleeting Nov 16 '21 at 16:37
  • Are the "-" and "+" signs actually part of the XML document? If so, you need to fix those first, and make sure you're displaying or logging PHP errors properly, because they will stop the XML being parsed at all. If not, make sure you copy from a plain text view (like a text editor) not a debug view in future. Then read the reference answer I linked, and try to understand what namespaces _are_ rather than just copying examples, because there is a lot of bad advice out there, and the actual solution is generally quite simple. – IMSoP Nov 17 '21 at 12:34
  • Hello @imsop , the ```-``` came up when copying from Windows xml viewer and are not actually part of the file. The article you've linked is super useful, but I have a follow up question. In the example, the parsing function defines ```XMLNS_EG2``` as a URL, which I assume is where the XML scheme lives. How do I find this, if it's not written in my original XML file? I've edited the original question to show a better write-up of the XML file – matteodallombra Nov 23 '21 at 09:04
  • @matteodallombra As the linked answer says "The URI doesn't point at anything, it's just a way for someone to 'own' the namespace." All you have to do is look for the "xmlns" attributes that define each local prefix or default namespace, e.g. `xmlns="urn:schemas-microsoft-com:office:spreadsheet"` and `xmlns:p2="urn:schemas-microsoft-com:office:spreadsheet"`. So in your code, you might define something like `const XMLNS_SPREADSHEET="urn:schemas-microsoft-com:office:spreadsheet";` to refer to that namespace, regardless of what prefix it has in different parts of the file and other files. – IMSoP Nov 23 '21 at 09:10
  • @imsop got it..but then, shouldn't this work? ```define('XMLNS_SPREADSHEET', 'urn:schemas-microsoft-com:office:spreadsheet'); foreach ( $xml->list->children(XMLNS_SPREADSHEET)->Worksheet->Table as $table ) { echo 'Item: ' . $table->Row->Cell->Data[0] . "\n"; }``` – matteodallombra Nov 23 '21 at 10:09
  • @matteodallombra I'm not sure where `->list` has come from in there, but remove that and it looks right. Remember that `$table->Row->Cell->Data[0]` means the same as `$table->Row[0]->Cell[0]->Data[0]`, so that loop will print the data in the first cell, of the first row, of each table. More likely, you want `foreach ( $xml->children(XMLNS_SPREADSHEET)->Worksheet->Table->Row as $row ) { foreach ( $row->Cell as $cell ) { ... } }` to loop over all rows in the table, and all cells in each row. – IMSoP Nov 23 '21 at 10:15

0 Answers0