I know this question is old, but I ran into the same issue while playing with a document fragment because I didn't realize that I had to append a div to it and use the div's innerHTML
to load strings of HTML in and get DOM Elements from it. I've got other answers on how to do this sort of thing, better suited for whole documents.
In firefox (23.0.1) it appears that setting the innerHTML property of the document fragment doesn't automatically generate the elements. It is only after appending the fragment to the document that the elements are created.
To create a whole document use the document.implementation
methods if they're supported. I've had success doing this on Firefox, I haven't really tested it out on other browsers though. You can look at HTMLParser.js in the AtropaToolbox for an example of using document.implementation
methods. I've used this bit of script to XMLHttpRequest
pages and manipulate them or extract data from them. Scripts in the page are not executed though, which is what I wanted though it may not be what you want. The reason I went with this rather verbose method instead of trying to use the parsing available from the XMLHttpRequest
object directly was that I ran into quite a bit of trouble with parsing errors at the time and I wanted to specify that the doc should be parsed as HTML 4 Transitional because it seems to take all kinds of slop and produce a DOM.
There is also a DOMParser
available which may be easier for you to use. There is an implementation by Eli Grey on the page at MDN for browsers that don't have the DOMParser
but do support document.implementation.createHTMLDocument
. The specs for DOMParser
specify that scripts in the page are not executed and the contents of noscript tags be rendered.
If you really need scripts enabled in the page you could create an iFrame with 0 height, 0 width, no borders, etc. It would still be in the page but you could hide it pretty well.
There's also the option of using window.open()
with document.write
, DOM methods or whatever you like. Some browsers even let you do data URI's now.
var x = window.open( 'data:text/html;base64,' + btoa('<h1>hi</h1>') );
// wait for the document to load. It only takes a few milliseconds
// but we'll wait for 5 seconds so you can watch the child window
// change.
setTimeout(function () {
console.log(x.document.documentElement.outerHTML);
x.console.log('this is the console in the child window');
x.document.body.innerHTML = 'oh wow';
}, 5000);
So, you do have a few options for creating whole documents offscreen/hidden and manipulating them, all of which support loading the document from strings.
There's also phantomjs, an awesome project producing a headless scriptable web browser based on webkit. You'll have access to the local filesystem and be able to do pretty much whatever you want. I don't really know what you're trying to accomplish with your full page scripting and manipulation.