I am having trouble converting a string containing an entire HTML document into a DOM object that I can search with jQuery find(). The input starts with the DOCTYPE declaration and is in the variable data.
This works and finds a dozen tr elements:
var dummy0 = $(data).find('tr');
This does not work, although there definitely is an h1 element:
var dummy1 = $(data).find('h1');
If I do this to inspect the data object:
var dummy2 = $(data);
in the Firefox F12 debugger it appears that dummy2 is an array of the following objects:
#text
title
#text
h1
#text
table
#text
So find('tr') worked because the tr elements are inside the table array element, but find('h1') finds nothing because the h1 is not inside a DOM tree; it is itself one of the array elements of $(data).
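To make that reasoning concrete, here is a minimal sketch in plain JavaScript that models the behaviour without using jQuery itself. The helpers node, findDescendants and filterTopLevel are hypothetical names made up for this illustration, and the tbody level is an assumption about how the table gets parsed. The sketch only shows the difference between searching the descendants of each item in a set (what find() does) and testing the items of the set themselves (what filter() does).

```javascript
// Plain-object stand-ins for DOM nodes. These helpers are hypothetical
// and are NOT part of jQuery; they only model the behaviour described above.
function node(name, children = []) {
  return { name, children };
}

// Top-level set modelled on the list reported for $(data) in the question.
// The tbody level is an assumption about how the table is parsed.
const topLevel = [
  node('#text'),
  node('title'),
  node('#text'),
  node('h1'),
  node('#text'),
  node('table', [node('tbody', [node('tr', [node('td')])])]),
  node('#text'),
];

// Like .find(): looks only at descendants of each item, never at the item itself.
function findDescendants(set, name) {
  const hits = [];
  const walk = (n) => {
    for (const child of n.children) {
      if (child.name === name) hits.push(child);
      walk(child);
    }
  };
  set.forEach(walk);
  return hits;
}

// Like .filter(): tests only the items of the set itself.
function filterTopLevel(set, name) {
  return set.filter((n) => n.name === name);
}

console.log(findDescendants(topLevel, 'tr').length); // 1
console.log(findDescendants(topLevel, 'h1').length); // 0
console.log(filterTopLevel(topLevel, 'h1').length);  // 1
```

If this model is right, then in jQuery terms $(data).filter('h1') should match the top-level h1 where $(data).find('h1') does not, although that still would not explain where head and body went.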
I tried the trick of https://stackoverflow.com/a/11047751/1845672 but that results in exactly the same array instead of a single DOM tree.
I also tried $.parseHTML(data) with the same result.
Can anyone help me understand how this all works? The input is a string with exactly ONE html element, yet it is parsed into an array of several elements. Where are the head and body elements?
Then, because I need the content of the h1, how do I get a DOM object that can be searched with find for all elements, including h1? Or am I forced to forget about DOM trees and just inspect the array element for h1?
Update:
I created a small stand-alone test case:
<!DOCTYPE html>
<html>
<head>
<title>Test</title>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
</head>
<body>
<h1>Test</h1>
<script>
var sHtml = "<!DOCTYPE html>\n<html><head><title>test</title></head>\n<body><h1>Test</h1><table><tr><td>Abc</td></tr></table></body></html>";
var dHtml = $(sHtml);
var h1 = dHtml.find('h1');
var td = dHtml.find('td');
alert('h1: ' + (h1.length == 0 ? 'not found' : h1.text()) + //output: h1: not found
' td: ' + (td.length == 0 ? 'not found' : td.text()) + //output: td: Abc
' dHtml.length: ' + dHtml.length); //output: dHtml.length: 5
</script>
</body>
</html>
It appears that the two #text entries in the array value of dHtml correspond to the two newlines, one after the DOCTYPE, the other after the head tag. I am still wondering why there are three element entries in dHtml rather than one.