33

I am using the following code to parse a string into DOM:

var doc = new DOMParser().parseFromString(string, 'text/xml');

Where string is something like <!DOCTYPE html><html><head></head><body>content</body></html>.

typeof doc gives me object. If I do something like doc.querySelector('body') I get a DOM object back. But if I try to access any properties, like you normally can, it gives me undefined:

doc.querySelector('body').innerHTML; // undefined

The same goes for other properties, e.g. id. Attribute retrieval, on the other hand, works fine: doc.querySelector('body').getAttribute('id');

Is there a magic function to have access to those properties?

Camelid
DADU

3 Answers

56

Your current method fails because HTML properties are not defined for elements of an XML document. If you pass the text/html MIME type instead, it should work:

var string = '<!DOCTYPE html><html><head></head><body>content</body></html>';
var doc = new DOMParser().parseFromString(string, 'text/html');
doc.body.innerHTML; // or doc.querySelector('body').innerHTML
// ^ Returns "content"

The code below enables the text/html MIME type in browsers that do not yet support it natively. It is taken from the Mozilla Developer Network:

/* 
 * DOMParser HTML extension 
 * 2012-02-02 
 * 
 * By Eli Grey, http://eligrey.com 
 * Public domain. 
 * NO WARRANTY EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK. 
 */  

/*! @source https://gist.github.com/1129031 */  
/*global document, DOMParser*/  

(function(DOMParser) {  
    "use strict";  
    var DOMParser_proto = DOMParser.prototype  
      , real_parseFromString = DOMParser_proto.parseFromString;

    // Firefox/Opera/IE throw errors on unsupported types  
    try {  
        // WebKit returns null on unsupported types  
        if ((new DOMParser).parseFromString("", "text/html")) {  
            // text/html parsing is natively supported  
            return;  
        }  
    } catch (ex) {}  

    DOMParser_proto.parseFromString = function(markup, type) {  
        if (/^\s*text\/html\s*(?:;|$)/i.test(type)) {  
            var doc = document.implementation.createHTMLDocument("")
              , doc_elt = doc.documentElement
              , first_elt;

            doc_elt.innerHTML = markup;
            first_elt = doc_elt.firstElementChild;

            if (doc_elt.childElementCount === 1
                && first_elt.localName.toLowerCase() === "html") {  
                doc.replaceChild(first_elt, doc_elt);  
            }  

            return doc;  
        } else {  
            return real_parseFromString.apply(this, arguments);  
        }  
    };  
}(DOMParser));
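
To see which type strings the polyfill above treats as HTML, its regular expression can be tried on its own. This is only a small sketch; isHtmlType is a hypothetical name and is not part of the polyfill:

```javascript
// Hypothetical helper that reuses the regular expression from the polyfill.
function isHtmlType(type) {
  // Accepts "text/html", optionally padded with whitespace and optionally
  // followed by parameters such as "; charset=utf-8". Case-insensitive.
  return /^\s*text\/html\s*(?:;|$)/i.test(type);
}

console.log(isHtmlType('text/html'));                 // true
console.log(isHtmlType(' TEXT/HTML; charset=utf-8')); // true
console.log(isHtmlType('text/xml'));                  // false
console.log(isHtmlType('text/html5'));                // false
```

Native implementations may be stricter about the exact string than this pattern, so passing plain 'text/html' is the safe choice.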
Rob W
  • 3
    PS. For clarification, when you're using `text/xml`, `doc` is an instance of `XMDocument`. Using `text/html`, it's an instance of `HTMLDocument`. – Rob W Feb 12 '12 at 18:03
  • Waaw, quite a useful answer! Couldn't have found that one myself. Just the mime type and enabling that mime type :) – DADU Feb 12 '12 at 18:45
  • 1
    @RobW I assume you mean `XMLDocument`. – devios1 Apr 30 '12 at 21:36
  • Thanks @RobW. This was useful for the reverse process where one was able to use regex to edit a text string to add html and then build a replacement node [avoiding innerHTML](http://stackoverflow.com/a/15535762/1308424) Your solution worked perfectly! – Mike Wolfe Mar 20 '13 at 22:20
  • The solution is not bad, but why are you using the comma operator and not just 3 statements? This option is more "obscure" and does not add any advantage. Furthermore, the use of first_elt creates a global variable in the window scope (which is so bad). – Adrian Maire Mar 30 '13 at 20:16
  • @AdrianMaire All variables are local; the commas are part of the [`var`](https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Statements/var) statement, not the comma operator. Note that I didn't write the code, and the code is not perfect. For instance, in Internet Explorer 9-, the code will fail because `document.documentElement.innerHTML` is read-only. – Rob W Mar 30 '13 at 21:28
  • Very interesting, thank you for the clarification. As far as I know, document.implementation.createHTMLDocument is only available since IE9, so this code does not support any IE? Do you know any equivalent for IE6-8? Anyway, it is still useful for "compliant" browsers like Firefox. – Adrian Maire Mar 30 '13 at 22:47
  • @AdrianMaire It really depends on your purposes. A few months ago, I've written a `toDOM` method for the major 5 browsers, which should *not* load external resources. Initially, I tried to use a hidden iframe. The DOM is parsed well, but external resources are obviously loaded. In the end, I ended up using `document.createElement('html')` plus additional expando methods. And unless a significant part of your clients uses IE6/7, I strongly recommend to drop support for these browsers. Practically no-one uses them any more. Some argue that even IE8 may be ignored... – Rob W Mar 30 '13 at 22:54
  • @RobW : Do you know where I can find a list of browsers which support `DOMParser().parseFromString()` but not with the `text/html` MIME type? – user2284570 Dec 21 '14 at 21:20
  • @user2284570 All modern desktop browsers support this feature, including Chrome 30+, Firefox 12+, IE 10+, Opera 17+ and Safari 7.1+. This information is also available at MDN: https://developer.mozilla.org/en-US/docs/Web/API/DOMParser#Browser_compatibility – Rob W Dec 21 '14 at 21:45
  • @RobW : Currently, I'm concerned about Opera Presto *(seems to work with XML documents)* (Blink versions can be considered as downgrades); and IE8/IE6, since many companies force their users to use the preinstalled browser of XP *(the same way, they were still using 2000 in 2011)*. – user2284570 Dec 21 '14 at 22:41
  • @user2284570 `DOMParser`+`text/html` is not supported by Presto. The polyfill works almost flawlessly though (one notable difference: With the polyfill, if you have `` in the HTML, then the image will be loaded). IE6-8 do not support `createHTMLDocument`, and I don't see why you want to support them, since Windows XP is already end-of-life (so no reason to use IE8 and definitely not IE 6). There is no solid alternative to `DOMParser`+`text/html`. You could use `document.createElement` and assign HTML to it (any external content (``, styles, .) will also be parsed and loaded). – Rob W Dec 21 '14 at 22:44
  • @RobW :`Windows XP is already end-of-life` Yes but some companies have even chosen to not upgrade. It may represent 8% of users *(I can’t check since logging with a proxy can be considered as an indirect law requirement for computers services here)*. – user2284570 Dec 21 '14 at 23:18
  • @devios1 [it used to return a Document instance](https://www.w3.org/TR/DOM-Level-3-Core/core.html#Level-2-Core-DOM-createDocument) though. – Knu Nov 29 '16 at 04:31
3

Try something like this:

const fragment = document.createRange().createContextualFragment(html);

where html is the string you want to convert.
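
A slightly fuller sketch of the same idea follows. The wrapper name htmlToFragment and its optional doc parameter are assumptions made for illustration; only createRange and createContextualFragment come from the line above:

```javascript
// Hypothetical wrapper; `doc` defaults to the page's global `document`.
function htmlToFragment(html, doc = document) {
  // A Range supplies the parsing context for the markup.
  const range = doc.createRange();
  return range.createContextualFragment(html);
}

// Browser usage (skipped where there is no DOM, e.g. in Node.js):
if (typeof document !== 'undefined') {
  const fragment = htmlToFragment('<p id="greeting">Hello</p>');
  console.log(fragment.querySelector('#greeting').textContent); // "Hello"
}
```

The result is a DocumentFragment rather than a full document, so there is no doc.body; appending the fragment to an element moves its children into the page.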

dude
  • Yes, this is the best solution if you want also execute scripts, like: http://stackoverflow.com/a/58862506/890357 – marciowb Nov 14 '19 at 19:27
0

Use element.getAttribute(attributeName) for XML/HTML elements
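
As a rough sketch of how that could be wrapped, the helper below tries the DOM property first and falls back to getAttribute. The name readValue is hypothetical, and the fallback only makes sense when the property and the attribute share a name (id does, while className and class do not):

```javascript
// Hypothetical helper: prefer the DOM property, fall back to the attribute.
function readValue(element, name) {
  const value = element[name];
  if (value !== undefined) {
    return value;
  }
  // getAttribute returns null when the attribute is absent.
  return element.getAttribute(name);
}

// Browser usage (skipped where DOMParser is unavailable, e.g. in Node.js):
if (typeof DOMParser !== 'undefined') {
  const doc = new DOMParser().parseFromString('<root><item id="a1"/></root>', 'text/xml');
  console.log(readValue(doc.querySelector('item'), 'id')); // "a1"
}
```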

FelixSFD