I converted an HTML file to XML and got the following XML file.

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE article>
<article xmlns="http://docbook.org/ns/docbook" version="5.0" xmlns:xlink="http://www.w3.org/1999/xlink">
  <informaltable>
    <tgroup cols="2">
      <colspec colwidth="50*" align="left"/>
      <colspec colwidth="50*" align="left"/>
      <tbody>
        <row>
          <entry>
            Datum
          </entry>
          <entry>
            Vrijdag 14 januari 2022
          </entry>
        </row>
        <row>
          <entry>
            Begeleiding -
            <emphasis>grote kogel 15cm in tas</emphasis>
          </entry>
          <entry>
            some text
          </entry>
        </row>
        <row>
          <entry>
            Bijlage
          </entry>
          <entry>
            <para>
              <link xlink:href="https://index.php?" role="ssLink">file.pdf</link>
              (280 KiB)
            </para>
            <para>
              door de
              <emphasis role="strong">school</emphasis>
              rest van de zin
            </para>
          </entry>
        </row>
      </tbody>
    </tgroup>
  </informaltable>
</article>

When I use

$xml_data = simplexml_load_string($filedata);
$xml_data_json = json_decode(json_encode($xml_data), true);

I get the following response:

array(2) {
  ["@attributes"]=>
  array(1) {
    ["version"]=>
    string(3) "5.0"
  }
  ["informaltable"]=>
  array(1) {
    ["tgroup"]=>
    array(3) {
      ["@attributes"]=>
      array(1) {
        ["cols"]=>
        string(1) "2"
      }
      ["colspec"]=>
      array(2) {
        [0]=>
        array(1) {
          ["@attributes"]=>
          array(2) {
            ["colwidth"]=>
            string(3) "50*"
            ["align"]=>
            string(4) "left"
          }
        }
        [1]=>
        array(1) {
          ["@attributes"]=>
          array(2) {
            ["colwidth"]=>
            string(3) "50*"
            ["align"]=>
            string(4) "left"
          }
        }
      }
      ["tbody"]=>
      array(1) {
        ["row"]=>
        array(3) {
          [0]=>
          array(1) {
            ["entry"]=>
            array(2) {
              [0]=>
              string(29) "
            Datum
          "
              [1]=>
              string(47) "
            Vrijdag 14 januari 2022
          "
            }
          }
          [1]=>
          array(1) {
            ["entry"]=>
            array(2) {
              [0]=>
              string(50) "
            Begeleiding -
            
          "
              [1]=>
              string(33) "
            some text
          "
            }
          }
          [2]=>
          array(1) {
            ["entry"]=>
            array(2) {
              [0]=>
              string(31) "
            Bijlage
          "
              [1]=>
              array(1) {
                ["para"]=>
                array(2) {
                  [0]=>
                  array(1) {
                    ["link"]=>
                    string(8) "file.pdf"
                  }
                  [1]=>
                  string(80) "
              door de
              
              rest van de zin
            "
                }
              }
            }
          }
        }
      }
    }
  }
}

The problem is that all the emphasis elements, link elements, and link attributes are gone, but I need them in my response. For example, I would expect the second row to look like this:

[1]=>
              array(1) {
                ["entry"]=>
                array(2) {
                  [0]=>
                  string(50) "
                Begeleiding - the text from emphasis here....
              "
                  [1]=>
                  string(33) "
                some text
              "
                }
              }

Could someone help me? Thanks!

This is my first post on Stack Overflow, and apparently I included too much code relative to text. Is there a way to shorten long code in a Stack Overflow post?

  • Does this answer your question? [Resolve namespaces with SimpleXML regardless of structure or namespace](https://stackoverflow.com/questions/26400993/resolve-namespaces-with-simplexml-regardless-of-structure-or-namespace) – Chris Haas Sep 17 '22 at 12:28
  • Please edit your question and add a short, representative sample of `$filedata`, including the namespace declarations. – Jack Fleeting Sep 17 '22 at 12:33

1 Answer

Note that your sample XML isn't well-formed. But assuming it's fixed, what you want to do can be done.

Since you are dealing with XML, you might as well stick to SimpleXML. In addition, you have to deal with the namespaces. So altogether:

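// XPath has no notion of a default namespace, so bind a prefix (any name works) to the DocBook URI.
$xml_data->registerXPathNamespace('xxx', 'http://docbook.org/ns/docbook');
// Select every emphasis element anywhere in the document.
$targets = $xml_data->xpath('//xxx:emphasis');
foreach ($targets as $target) {
    // String conversion yields the element's text.
    echo $target . "\n";
}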

Output:

grote kogel 15cm in tas
school
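
The same approach can probably be extended to the other things the question asks for: the full text of each entry (including the text inside emphasis and link) and the link attributes. The following is only a sketch under some assumptions: the XML is a trimmed-down copy of the sample in the question, the prefix `db` is an arbitrary name chosen here, the `$entries` and `$links` arrays are illustrative, and PHP's DOM extension is assumed to be available for `dom_import_simplexml()`.

```php
<?php
// Trimmed-down sample based on the XML in the question (assumption: same namespaces).
$filedata = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<article xmlns="http://docbook.org/ns/docbook" version="5.0" xmlns:xlink="http://www.w3.org/1999/xlink">
  <informaltable><tgroup cols="2"><tbody>
    <row>
      <entry>Begeleiding - <emphasis>grote kogel 15cm in tas</emphasis></entry>
      <entry>some text</entry>
    </row>
    <row>
      <entry>Bijlage</entry>
      <entry><para><link xlink:href="https://index.php?" role="ssLink">file.pdf</link> (280 KiB)</para></entry>
    </row>
  </tbody></tgroup></informaltable>
</article>
XML;

$xml_data = simplexml_load_string($filedata);
// 'db' is an arbitrary prefix bound to the document's default namespace.
$xml_data->registerXPathNamespace('db', 'http://docbook.org/ns/docbook');

// Full text of every entry, including text nested in child elements.
$entries = [];
foreach ($xml_data->xpath('//db:entry') as $entry) {
    // textContent concatenates all descendant text nodes.
    $text = dom_import_simplexml($entry)->textContent;
    // Collapse runs of whitespace left over from the XML indentation.
    $entries[] = trim(preg_replace('/\s+/', ' ', $text));
}

// Link text plus its attributes; xlink:href lives in the XLink namespace.
$links = [];
foreach ($xml_data->xpath('//db:link') as $link) {
    $xlink = $link->attributes('http://www.w3.org/1999/xlink');
    $links[] = [
        'text' => (string) $link,
        'role' => (string) $link['role'],
        'href' => (string) $xlink['href'],
    ];
}

print_r($entries);
print_r($links);
```

The `json_encode` round trip drops this information because an element that mixes text with child elements does not map cleanly onto a JSON value, so reading the nodes directly, as above, is likely the simpler route.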
    
Jack Fleeting