
The root element has namespace declarations such as xmlns:xlink="http://www.w3.org/1999/xlink", so any node appended to it (e.g. by appendChild) accepts the namespace. I can append <graphic xlink:href=".."/> because, taken as a whole, the document is valid. But to append a fragment I first need to create it with createDocumentFragment().

Example:

    $tmp = $dom->createDocumentFragment();
    $ok = $tmp->appendXML('<graphic xlink:href="file123.ext"/>');

When run, this generates an error: DOMDocumentFragment::appendXML(): namespace error : Namespace prefix xlink for href on inline-graphic is not defined

How do I tell the DOMDocumentFragment::appendXML() method to use the DOMDocument namespaces?


NOTES AND CONTEXTS

(moved to an answer, so as not to clutter the question)

Peter Krauss

4 Answers


It looks like it's working the way it's supposed to. Check out bug report #44773: chregu@php.net says it's not a bug and works as intended. Still, I agree with the bug report and the other comments: since the fragment is created from the DOMDocument, which has the namespaces defined, it should know what they are and work without a problem.

Pass the namespace in with the element. It won't show up in the XML that is output, but will be read by the fragment so that it can create the attribute without any errors.

$dom = new DOMDocument('1.0', 'utf-8');
$root = $dom->createElement('MyRoot');
$root->setAttributeNS('http://www.w3.org/2000/xmlns/','xmlns:xlink','http://www.w3.org/1999/xlink');
$dom->appendChild($root);

$tmp = $dom->createDocumentFragment();
$ok = $tmp->appendXML('<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="file123.ext"/>');
$dom->documentElement->appendChild($tmp);
die($dom->saveXML());

Output

<?xml version="1.0" encoding="utf-8"?>
<MyRoot xmlns:xlink="http://www.w3.org/1999/xlink"><graphic xlink:href="file123.ext"/></MyRoot>
slapyo
  • Hmm... (thanks for the pointer to bug report 44773!) Yes, this is the workaround, but I would need to detect the prefixes with a regular expression, which makes for unsafe, slow and dirty code... I use the [fragment in a generic function, see this one](http://stackoverflow.com/q/26029868/287948). – Peter Krauss Oct 27 '14 at 19:54
  • I think we can reopen the PHP bug... The logical and useful behaviour is to accept any namespace that has already been defined at the `DomDocument` root... or to offer a `setAttributeNS()` method for fragment contents (!). – Peter Krauss Oct 27 '14 at 19:59
  • It would make sense to have something like that. As far as I know, there is no mention of you being able to pass the namespace in the string other than in that bug report. – slapyo Oct 27 '14 at 20:01
  • See 2014-08-18 comment: "If a DocumentFragment does not know which namespaces it is defined for, the current behavior makes no sense". I agree. – Peter Krauss Oct 27 '14 at 20:07

I spent about 4 hours pulling my hair out over this and it appears if you turn off libxml errors, the warning goes away and you are free to use your prefixes without hassle.

libxml_use_internal_errors(true);

I agree that this is more a bug than by design, but this workaround saved the day for me.

Edit: This works if you don't need to go back and reference that fragment later in the script. Looks like suppressing the warning still leaves the items without a namespace, despite the prefix being present when you print the document.
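
A rough way to check what the edit above describes is to collect the libxml errors and then inspect the attribute's namespace. This is a sketch rather than a confirmed reproduction: the document and attribute names are taken from the question, and what it prints may vary with the PHP and libxml2 versions.

```php
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadXML('<MyRoot xmlns:xlink="http://www.w3.org/1999/xlink"/>');

// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$tmp = $dom->createDocumentFragment();
$ok  = $tmp->appendXML('<graphic xlink:href="file123.ext"/>');

$errors = libxml_get_errors();   // array of LibXMLError objects
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The namespace complaint is still recorded, just no longer displayed.
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}

// Look at the attribute by its qualified name and by its namespace.
if ($ok && $tmp->firstChild instanceof DOMElement) {
    $graphic = $tmp->firstChild;
    $attr    = $graphic->getAttributeNode('xlink:href');
    var_dump($attr ? $attr->namespaceURI : null);
    var_dump($graphic->getAttributeNS('http://www.w3.org/1999/xlink', 'href'));
}
```

If the namespace URI comes back empty, later namespace-aware lookups (getAttributeNS(), XPath with a registered prefix) will not find the attribute, even though the prefix shows up in the serialized document.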

David Tran

This is not a bug, it really is the expected behavior. Namespaces are not defined on or for an XML document, but on element nodes. They are valid for that node and its descendants until redefined.

So when you create the document fragment it has no parent node, and then you append some XML fragment to it. The namespace lookup cannot find any definition for the prefix, and you get an error. Depending on where in the document you are going to add the fragment, the same prefix could stand for completely different namespaces.

You have to define the namespace in the fragment. If the fragment is generated through DOM, it should always have all the needed namespace definitions.

If you generate it as text, you can make sure that the namespace definition is included in the element that needs it or you can add a wrapper element node with all needed namespace definitions.

$dom = new DOMDocument();
$dom->loadXml('<foo/>');

$fragment = $dom->createDocumentFragment();
$fragment->appendXML(
  '<fragment xmlns:xlink="http://www.w3.org/1999/xlink">
     <graphic xlink:href="file123.ext"/>
   </fragment>'
);
foreach ($fragment->firstChild->childNodes as $child) {
  $dom->documentElement->appendChild($child->cloneNode(TRUE));
}

echo $dom->saveXML();

Output:

<?xml version="1.0"?>
<foo>
  <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="file123.ext"/>
</foo>

It is possible to generate the wrapper element from a list of namespaces. The following function will take an element node or a list of namespaces [prefix => namespace].

function wrapFragment($namespaces, $xml) {
  if ($namespaces instanceOf DOMElement) {
    $xpath = new DOMXpath($namespaces->ownerDocument);
    $namespaces = $xpath->evaluate('namespace::*', $namespaces);
  }
  $result = '<fragment';
  foreach ($namespaces as $key => $value) {
    if ($value instanceOf DOMNamespaceNode) {
      $prefix = $value->localName;
      $xmlns = $value->nodeValue;
    } else {
      $prefix = $key == '#default' ? '' : $key;
      $xmlns = $value;
    }
    $result .= ' '.htmlspecialchars(empty($prefix) ? 'xmlns' : 'xmlns:'.$prefix);
    $result .= '="'.htmlspecialchars($xmlns).'"';
  }
  return $result.'>'.$xml.'</fragment>';
}

echo wrapFragment(
  $dom->documentElement, '<graphic xlink:href="file123.ext"/>'
);

Output:

<fragment xmlns:xml="http://www.w3.org/XML/1998/namespace" xmlns:xlink="http://www.w3.org/1999/xlink"><graphic xlink:href="file123.ext"/></fragment>
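
The DOM-generated case mentioned above (a fragment built with DOM methods carries its own namespace definitions) can be sketched as follows. This is only an illustration: the element and attribute names come from the question, and the trailing text node is added just to show a mixed fragment.

```php
<?php
const XLINK_NS = 'http://www.w3.org/1999/xlink';

$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadXML('<MyRoot xmlns:xlink="' . XLINK_NS . '"/>');

// Build the node through the DOM API: the namespace URI is passed
// explicitly, so no prefix lookup against a parent node is needed.
$graphic = $dom->createElement('graphic');
$graphic->setAttributeNS(XLINK_NS, 'xlink:href', 'file123.ext');

// A fragment may hold several sibling nodes, including text.
$fragment = $dom->createDocumentFragment();
$fragment->appendChild($graphic);
$fragment->appendChild($dom->createTextNode(' trailing text'));

// Appending the fragment moves its children into the document.
$dom->documentElement->appendChild($fragment);

echo $dom->saveXML();
```

Here the application supplies the namespace URI, and the prefix is only a serialization detail.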
ThW
  • Hmm... I don't agree... How do you solve the "typical problematic scenario" explained in the question? If your `$dom` has a ``, the natural thing is for the fragment to inherit the same namespaces, because the fragment is not an "extraterrestrial" imported node; it was created from `$dom`. Why do you not agree with this position? – Peter Krauss Oct 29 '14 at 19:31
  • The namespaces are not defined on the document but on the document element node (or any other element node). You create nodes or fragments separately from the hierarchy. Before the append, you do not know which namespace the prefix is defined for. If you create an element or an attribute you need to provide the namespaces; the same goes for the fragment. – ThW Oct 29 '14 at 20:07
  • Ok, your position is mathematically correct, and justifies the *default* behavior... But only the *default*, [in our view](https://bugs.php.net/bug.php?id=44773) (see the 2014 comments). The "bug" is in not offering an alternative, a "set namespaces" option, because many potential applications for fragments are lost (!) without minimal "friendly good practices". Examples of alternatives: a flag `createDocumentFragment($importRootNamespaces=false)`, a reference node `createDocumentFragment($refNamespacesNode=NULL)`, or a `DOMDocumentFragment::setAttributeNS()` method. – Peter Krauss Oct 30 '14 at 09:35
  • ... Imagine a generic application, such as your FluentDOM (!), where a user passes a fragment copied and pasted from a DomDocument to a hypothetical `FluentDOM::replace_innerXML($node,$innerXML)` method... How do you manage it? Can you predict which namespaces are in `$innerXML`? Does it make sense for your `replace_innerXML` to inject, by complex regular expressions, all the possible namespaces into `$innerXML`? Example: `$innerXML='nononono link1 nonon nonononono link2...'`. – Peter Krauss Oct 30 '14 at 10:00
  • I added an example function. As you can see, it does not have to be a part of the DOM API itself. Actually I don't think the namespaces should come from an element node, but from an external definition. In FluentDOM, the document object HAS a namespace definition. These are not the namespace definitions on the nodes, but for the API (like DOMXpath::registerNamespace). *imho* the namespace definitions for a fragment should come from the fragment or the application that provides the fragment. Prefixes of actual nodes should not be relevant for your application logic - they can be from an external source. – ThW Oct 31 '14 at 09:40
  • Hmm... Thanks (!), let's discuss the examples in your answer. 1) You did not use a typical fragment (a piece of XML without a root); see my simple `$innerXML` above as a real fragment sample. 2) The "PHP bug" is about *DOMDocument* solutions, not XML-string solutions. 3) Your `wrapFragment($namespaces)` does not solve the real problem, because it needs `$namespaces` as an argument; in the real world (see my illustrated problem) no *a priori* knowledge of the namespaces exists... An interesting library function would be `getNamespaces($node)` (perhaps [this is the only algorithm](http://stackoverflow.com/a/2470433/287948)). – Peter Krauss Oct 31 '14 at 12:04
  • https://eval.in/214878 1) My fragment contained only an element node, but it works with a mixed fragment, too. I just add a wrapper element for the namespace definitions. 2) XML fragments are strings. 3) My example allows for a $node or a list. In the real world the knowledge of the namespaces HAS to exist in the application. You have to know what kind of data you're working with. What you should not need to know are the prefixes used in the document. – ThW Oct 31 '14 at 12:35
  • For example you should know that you're reading an ATOM feed, but you should not need to know if it uses 'atom', 'a' or no prefix in the XML document. – ThW Oct 31 '14 at 12:37
  • Hmm... I think we don't agree about *the point*... I edited the question to explain more and add more examples (please see there). About your item 2, the point is not "datatype string" but "how to see or parse the XML string" (as DOM or as string); see [this explanation](http://stackoverflow.com/a/8578999/287948) or [this](http://stackoverflow.com/a/1758162/287948) or [this discussion](http://stackoverflow.com/q/1732348/287948)... Using "pure DOM tools" is not only a personal preference. – Peter Krauss Oct 31 '14 at 14:07
  • The main point I disagree with is that the namespace definitions should come from a node and/or from the document element by default. Namespace definitions have to come from the application side, because this is the part you control - not the document. There is no reason to use RegEx for this anyway: if you wrap the fragment in an element node (with the namespace definitions), the result is a valid XML document. *btw* one result of this discussion is that I now have a good idea how to implement it in FluentDOM. :-) – ThW Oct 31 '14 at 14:38
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/64036/discussion-between-thw-and-peter-krauss). – ThW Oct 31 '14 at 15:19

NOTES about the question and its relation to PHP's lack of support for fragment namespaces.

A PHP bug?

(after @slapyo)

There is a PHP bug report, #44773: this is not "good behaviour", it is a bug (!). If you agree, please add a comment there and vote up here (!).


Imagine you are using a replace_innerXML($node,$innerXML) function, or any similar context... See the "Typical scenarios" section below.

How to work around it? Without a big regular expression (over $innerXML in the example) and a slow algorithm that adds a namespace declaration to each tag lacking one... And, of course, after all that, once the fragment is appended it becomes part of the tree, so no namespace declaration is needed because it is already at the root... all the work is only to be able to use the fragment's buggy appendXML().

Typical scenarios

(after @ThW's answer/discussion) Typical "blind namespace" uses of fragments.

When the fragment is an "extraterrestrial" one with a new namespace, fine: the fragment needs to declare the namespaces it uses by itself... But the problem described in this question IS NOT that exotic one; it is another, very common one.
PS: as we will see, a solution to the "PHP bug" is also a solution for that context.

Here, to illustrate, are two typical uses of fragments where no a priori knowledge about the namespaces (used by the fragment's elements) exists, only the fact that all of them have been declared in the DOMDocument (no need to redeclare them).

1) an XSLT call to a PHP function that returns a fragment, via XSLTProcessor::registerPHPFunctions();

2) a "generic DOM library" that offers a method for replacing the inner XML contents of a node with new XML content, which can be a fragment. See the function replace_innerXML() below.

function replace_innerXML(DOMNode $e, $innerXML='') {
    if ($e && ($innerXML>'' || $e->nodeValue>'')) {
        $e->nodeValue='';   
        if ($innerXML>'') {
            $tmp = $e->ownerDocument->createDocumentFragment();
            // HERE we need to INJECT namespace declarations into $innerXML
            $tmp->appendXML($innerXML);
            $e->appendChild( $tmp );
        }
        return true;
    }
    return false;
}
// This function is only illustrative; for other purposes
//         see https://stackoverflow.com/q/26029868/287948
// Example of use:
$innerXML='nonoo <xx xx:aa="ww">uuu</xx> nono<a:yy zz:href="...">uu</a:yy>...';
replace_innerXML($someNode,$innerXML);

The algorithm "to INJECT namespace" (see the comment in the function) would be simple if PHP offered the features for it... But, as we insist, PHP has a bug because it offers nothing (!).

An elephantic solution

The only way (as of 2014) to "INJECT namespace" is

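// $namespacesJoinByPipe: the known prefixes joined by "|", e.g. "xx|a|zz"
// $namespacesAssociative: the same prefixes as an array [prefix => namespace URI]
$innerXML = preg_replace_callback(
   "/([<\s])($namespacesJoinByPipe):([^\s\/>]+)/s",
   function ($m) use ($namespacesAssociative) {
     $nsdecl = "xmlns:$m[2]=\"".$namespacesAssociative[$m[2]].'"';
     return ($m[1]=='<')
       ? "<$m[2]:$m[3] $nsdecl "         // tag like "<a:yy" ($m[1] is the "<" itself)
       : " $nsdecl $m[1]$m[2]:$m[3]";   // attribute like " xx:aa"
   },
   $innerXML
);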

... So it is a big elephant for doing such a simple thing: just accepting the pre-existing DOMDocument namespaces.

PHP has a bug because it does not spare us this "big elephant"... What would solve the "PHP bug"?
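
For comparison with the regular-expression approach, here is a minimal sketch of the wrapper idea from ThW's answer applied to replace_innerXML(): the namespaces in scope at the target node are read with the XPath namespace axis and declared on a throwaway wrapper element, so $innerXML itself is never edited. The function name replaceInnerXmlInScope and the wrapper element name are hypothetical, and the sketch assumes $innerXML is otherwise well-formed.

```php
<?php
// Hypothetical helper (not part of PHP's DOM API): replaces the content of
// $e with $innerXML, resolving prefixes against the namespaces in scope at $e.
function replaceInnerXmlInScope(DOMElement $e, string $innerXML): bool
{
    $doc   = $e->ownerDocument;
    $xpath = new DOMXPath($doc);

    // Build one xmlns declaration per namespace in scope at $e.
    $decl = '';
    foreach ($xpath->query('namespace::*', $e) as $ns) {
        $prefix = (string) $ns->localName;
        if ($prefix === 'xml') {
            continue; // built-in prefix, never needs declaring
        }
        // Treat an empty or "xmlns" local name as the default namespace.
        $name  = ($prefix === '' || $prefix === 'xmlns') ? 'xmlns' : 'xmlns:' . $prefix;
        $decl .= ' ' . $name . '="' . htmlspecialchars($ns->nodeValue) . '"';
    }

    // Parse the content inside a wrapper that carries the declarations.
    $tmp = $doc->createDocumentFragment();
    if (!$tmp->appendXML('<wrapper' . $decl . '>' . $innerXML . '</wrapper>')) {
        return false;
    }

    // Drop the old content, then copy the wrapper's children into $e.
    while ($e->firstChild) {
        $e->removeChild($e->firstChild);
    }
    foreach ($tmp->firstChild->childNodes as $child) {
        $e->appendChild($child->cloneNode(true));
    }
    return true;
}

// Example of use:
$dom = new DOMDocument();
$dom->loadXML('<MyRoot xmlns:xlink="http://www.w3.org/1999/xlink"><p>old</p></MyRoot>');
$p = $dom->getElementsByTagName('p')->item(0);
replaceInnerXmlInScope($p, 'see <graphic xlink:href="file123.ext"/> here');
echo $dom->saveXML();
```

The children are cloned rather than moved, following the cloneNode(TRUE) loop in ThW's answer, so the copies do not depend on the temporary wrapper after the function returns. A prefix that is not declared anywhere above $e still fails, which is the "extraterrestrial" case where the fragment has to declare its own namespaces.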

PHP's ideal solution

There are many alternatives for a PHP RFC to solve the problem: a flag createDocumentFragment($importRootNamespaces=false), a reference node createDocumentFragment($refNamespacesNode=NULL), or a DOMDocumentFragment::setAttributeNS() method... All with a default behaviour equivalent to the usual createDocumentFragment() (no parameter).

PS: any of these solutions would also help with other problems, like the first one mentioned, "... when the fragment is an 'extraterrestrial' one with a new namespace ...".

Peter Krauss