5

I wrote a PHP script that parses an XML file, and when I run it, I get this error:

2: DOMDocument::load(): Namespace prefix edf for represent on info is not defined in /users/zzz/testing/meta.xml, line: 2

I've been searching for a fix but couldn't find one, so I'm posting here. As you can see, I'm using the DOMDocument class.

My code for parsing XML looks like:

$dom = new DOMDocument();
$metaXML = $dom->load($path."/meta.xml");

The path and all is correct, I'm sure. When I remove the prefix, it works fine. The XML looks like:

<meta>
    <info gamemodes="race" type="map" edf:represent="false"></info>
</meta>

The edf:represent="false" attribute causes the error. I don't want to delete the edf namespace prefix manually, because this is not the only XML file I want to parse. There are hundreds of them and the number is rising.

So, my question is: how can I ignore this error (only for the XML namespace issue), or how can I define or remove the namespace prefix via the DOMDocument class?

kjhughes
GTX

2 Answers

3

The XML file itself is not namespace-well-formed because it uses an undeclared namespace prefix. Either remove the undeclared namespace prefix, or declare it, e.g.:

<meta xmlns:edf="http://www.example.com/">
    <info gamemodes="race" type="map" edf:represent="false"></info>
</meta>

Update: You cannot perform this operation using a namespace-aware XML library because the XML is not namespace-well-formed. You have to either do it manually or operate on the file programmatically as text, not XML. Once the text is namespace-well-formed XML, you'll be able to use standard XML libraries to process it.

Here's a programmatic, text-based edit suggestion by @Daniel:


If you need to correct this problem across many files, consider using a tool like 'sed' to replace your meta tag with the corrected version. For example, to replace all instances of <meta with <meta xmlns:edf="http://www.example.com/" in every file in a folder, you could use this command:

sed -i -- 's|<meta|<meta xmlns:edf="http://www.example.com/"|g' *

See https://unix.stackexchange.com/questions/112023/how-can-i-replace-a-string-in-a-files for more information on how to use sed.


Well-formed XML should always be parsed using an XML parser, but sometimes a quick-and-dirty fix such as the above can help get us there.
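
Since the question is about PHP, the same text-based idea can be sketched in PHP as well. This is only a rough sketch: the helper name declareMissingPrefix is hypothetical, the namespace URI is the placeholder used above, and the regular expressions assume the first start tag in the text is the root element (a tag-like string inside a leading comment would confuse it).

```php
<?php
// Hypothetical helper: declares a missing namespace prefix on the root
// element by editing the document as text, before any XML parsing.
function declareMissingPrefix(string $xml, string $prefix, string $uri): string
{
    // Leave the text alone if the prefix is already declared somewhere.
    if (strpos($xml, 'xmlns:' . $prefix . '=') !== false) {
        return $xml;
    }
    // Leave the text alone if the prefix is never used.
    if (!preg_match('/[\s<\/]' . preg_quote($prefix, '/') . ':/', $xml)) {
        return $xml;
    }
    $declaration = ' xmlns:' . $prefix . '="'
        . htmlspecialchars($uri, ENT_QUOTES | ENT_XML1) . '"';

    // Append the declaration to the name of the first start tag. The pattern
    // skips "<?xml ...", "<!--" and "<!DOCTYPE" because a name must follow "<".
    return preg_replace_callback(
        '/<[A-Za-z_][\w.:\-]*/',
        function (array $match) use ($declaration) {
            return $match[0] . $declaration;
        },
        $xml,
        1
    );
}

$xml = '<meta><info gamemodes="race" type="map" edf:represent="false"></info></meta>';
$fixed = declareMissingPrefix($xml, 'edf', 'http://www.example.com/');

$dom = new DOMDocument();
$dom->loadXML($fixed);
$info = $dom->getElementsByTagName('info')->item(0);
echo $info->getAttributeNS('http://www.example.com/', 'represent'), "\n";
```

For files on disk, the text could be read with file_get_contents() and passed to loadXML() after the fix, instead of calling load() on the path directly. Because the helper returns already-declared input unchanged, running it twice on the same text should not add a second declaration.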

kjhughes
  • How can I define or remove it automatically with PHP? As I said, I have too many XML files like this and I don't want to change them one by one. – GTX Mar 23 '16 at 16:37
  • **Saw update** Hmm, okay, I understand. But is there any way to check whether the XML is 'not well-formed'? Right now, it returns an error and just exits the program. I tried an if statement, `if(!$metaXML) { ... }`, but it still returns an error. – GTX Mar 23 '16 at 16:48
  • The way to check if XML is well-formed is simply to parse it. You'll get errors by definition if it's not well-formed. – kjhughes Mar 23 '16 at 17:52
3

This is a warning, not an error. So the XML can still be used, but it is broken. The best solution would be to repair the XML by defining the namespace.

Defining the namespace will not work automatically. The namespace prefix is only an alias; the actual namespace is the value of the xmlns attribute. The alias is only valid for the element and its descendants. The script/application that generates the XML has to be repaired so that it adds the namespace definition.

<meta xmlns:edf="urn:example">
    <info gamemodes="race" type="map" edf:represent="false"></info>
</meta>

The parser will resolve the namespace. You can read "edf:represent" as "{urn:example}represent".
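
As a small illustration of that resolution, here is a sketch that reads the attribute by namespace URI and local name instead of by prefix. The URI urn:example is the placeholder from the XML above, and the XPath prefix "e" is an arbitrary choice made for this example; it does not have to match the prefix used in the document.

```php
<?php
$xml = <<<'XML'
<meta xmlns:edf="urn:example">
    <info gamemodes="race" type="map" edf:represent="false"></info>
</meta>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$info = $dom->getElementsByTagName('info')->item(0);

// Look the attribute up by namespace URI and local name, not by prefix.
$byNamespace = $info->getAttributeNS('urn:example', 'represent');

// XPath needs its own prefix registration for the same namespace URI.
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('e', 'urn:example');
$byXPath = $xpath->evaluate('string(/meta/info/@e:represent)');

echo $byNamespace, "\n", $byXPath, "\n";
```

Both lookups should return the string "false", because they address the same name, {urn:example}represent.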

However, you can block the parsing errors and warnings using libxml_use_internal_errors().

$xml = <<<'XML'
<meta>
    <info gamemodes="race" type="map" edf:represent="false"></info>
</meta>
XML;

libxml_use_internal_errors(TRUE);

$dom = new DOMDocument();
$dom->loadXml($xml);

echo $dom->saveXml();

Output:

<?xml version="1.0"?>
<meta>
    <info gamemodes="race" type="map" represent="false"/>
</meta>

With libxml_get_errors() you can implement your own error handling.

As you can see in the output, the XML parser removed the namespace prefix. This means that "represent" is now an attribute without a namespace; it changed its identity. Be really careful with that: represent and {urn:example}represent are two different names, so you lose relevant context information.
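
A minimal sketch of the custom error handling that libxml_get_errors() makes possible is shown here. The helper name loadXmlCollectingErrors and the message format are made up for illustration; only the libxml_* functions and DOMDocument come from PHP itself.

```php
<?php
// Hypothetical helper: parses an XML string and returns the document
// (or null if parsing failed) together with the collected libxml messages.
function loadXmlCollectingErrors(string $xml): array
{
    // Collect libxml errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    $loaded = $dom->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling mode for the rest of the script.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$loaded ? $dom : null, $messages];
}

$xml = <<<'XML'
<meta>
    <info gamemodes="race" type="map" edf:represent="false"></info>
</meta>
XML;

[$dom, $messages] = loadXmlCollectingErrors($xml);
foreach ($messages as $message) {
    echo $message, "\n";
}
```

With this approach the script can decide for itself what to do with each file: log the messages, skip the file, or carry on with the recovered document while keeping the caveat about the changed attribute name in mind.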

ThW
  • Thank you, I didn't know about **libxml_use_internal_errors** function. Now it works. – GTX Mar 23 '16 at 17:45