I'm scraping the source code of a website.
The first console.log prints the complete source code.
The second console.log prints the parsed DOM element, but for some reason its contents differ slightly from the source.
What bugs me is that the <body> tag goes missing, and I have no idea why.
I just realized the <head> tag goes missing as well, so there might be a good reason for it.
TO CLARIFY: The contents of both the <head> and <body> tags remain in the container. Only the tags themselves disappear, not their contents.
I want the whole source code to be parsed into an accessible DOM.
This is the code:
$.ajax({
    url: url,
    dataType: "text",
    success: function(data) {
        console.log("data:", data);
        var htmlDocument = $("<html>").html(data)[0];
        console.log("htmlDocument:", htmlDocument);
    }
});
I am new to JavaScript, so thank you for any help. I am eager to understand the issue, but for now I really just want it to work.
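One direction that may be worth trying, offered as a sketch rather than a confirmed fix: the browser's built-in DOMParser parses a complete HTML string into a separate Document, and that Document keeps its own <html>, <head>, and <body> elements. As far as I understand, the tags vanish in the code above because the string ends up being parsed as a fragment inside another element, where those three tags are not allowed and get dropped. The helper names parseHtmlDocument and loadDocument below are hypothetical, and the second parameter exists only so the helper can be exercised outside a browser.

```javascript
// Hypothetical helper: parse a full HTML source string into a Document.
// ParserCtor defaults to the browser's built-in DOMParser (an assumption:
// this code is meant to run in a browser, where DOMParser is available).
function parseHtmlDocument(source, ParserCtor = globalThis.DOMParser) {
    if (typeof ParserCtor !== "function") {
        throw new Error("DOMParser is not available in this environment");
    }
    // "text/html" asks for the HTML parser, so the result always has
    // documentElement, head, and body.
    return new ParserCtor().parseFromString(source, "text/html");
}

// Hypothetical wrapper around the original $.ajax call (requires jQuery).
// It is only defined here, not called.
function loadDocument(url, onReady) {
    $.ajax({
        url: url,
        dataType: "text",
        success: function(data) {
            onReady(parseHtmlDocument(data));
        }
    });
}
```

In a browser, loadDocument(url, function(doc) { console.log(doc.head, doc.body); }) should then log both elements, and $(doc).find("title") can be used to query the parsed document with jQuery. Scripts inside a document parsed this way are not executed.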