I am using DOMParser to create a new Document Object Model from a remote webpage so that I can build an in-browser web scraper. This is the code I have to retrieve the remote webpage:
function getDOM(url) {
    // Variables
    let dom = new Document();
    let xhr = new XMLHttpRequest();
    let cors = "https://cors-anywhere.herokuapp.com/";
    // Create and open the XMLHttpRequest
    xhr.open('GET', cors + url, true);
    // xhr.responseType = 'document'; // This was necessary, but then I changed
    // the way I was accomplishing things and it isn't necessary anymore; I do not
    // know why.
    // This loads the xhr response into a new DOMParser
    xhr.onreadystatechange = function () {
        if (xhr.readyState === 4) {
            dom = new DOMParser().parseFromString(xhr.responseText, 'text/html');
            // This line of code is capable of printing the value of the "productTitle"
            // element to the console, but if I try this same line of code outside of
            // this if statement it returns an error saying "cannot read
            // properties of null"
            console.log(dom.getElementById('productTitle').innerText);
        }
    };
    // This sends the request to the specified url
    xhr.send(null);
    // Returns the parsed document to the product class
    return dom;
}
The issue is that I cannot manipulate or access any of the contents of the "dom" object outside of the if statement that contains "dom = new DOMParser().parseFromString(xhr.responseText, 'text/html')". Shouldn't the parsed document be assigned to the "dom" variable and be accessible in the rest of the function? Inside the if statement, I can successfully access the data through the "dom" object that I assigned the result of DOMParser() to.
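To narrow this down, here is a minimal sketch of the ordering that seems to be involved. Because the request is opened with `true` as the third argument, it is asynchronous, so the callback runs only after the surrounding function has already returned. The names `fakeRequest` and `getValue` are hypothetical, and `setTimeout` is only a stand-in for the XHR, so this illustrates the timing rather than the real request:

```javascript
// Hypothetical stand-in for the XHR: delivers its result later via a callback.
function fakeRequest(onDone) {
  setTimeout(function () {
    onDone("parsed document");
  }, 0);
}

function getValue() {
  let value = "placeholder"; // plays the role of the initial `new Document`
  fakeRequest(function (result) {
    value = result; // runs only after getValue has already returned
  });
  return value; // still the placeholder at this point
}

console.log(getValue()); // prints "placeholder"
```

If this matches what happens in getDOM, the returned "dom" would be the empty initial document, which would explain why getElementById returns null outside the if statement.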
Edit: Although this question (Link to another question that is similar) is indeed trying to solve the same technical problem that I am having, the solutions given there use AJAX, jQuery, and other technologies that I am not using. I am only using plain JavaScript and therefore need a solution that is not given in that question.
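For reference, this is roughly the shape of plain JavaScript solution I am looking for: a sketch that hands the parsed document to a callback once the response has arrived, instead of returning it immediately. It assumes a browser environment with XMLHttpRequest and DOMParser; the `onReady` and `onError` parameters and the example URL in the usage comment are hypothetical, and I have not confirmed that this is the recommended approach:

```javascript
// Sketch: pass the parsed document to a callback instead of returning it.
// Assumes a browser with XMLHttpRequest and DOMParser available.
// `onReady` and `onError` are hypothetical parameter names.
function getDOM(url, onReady, onError) {
  const cors = "https://cors-anywhere.herokuapp.com/";
  const xhr = new XMLHttpRequest();
  xhr.open("GET", cors + url, true);

  // onload fires once, after the whole response has been received.
  xhr.onload = function () {
    if (xhr.status >= 200 && xhr.status < 300) {
      const dom = new DOMParser().parseFromString(xhr.responseText, "text/html");
      onReady(dom);
    } else if (onError) {
      onError(new Error("Request failed with status " + xhr.status));
    }
  };

  // onerror fires on network-level failures (no HTTP response at all).
  xhr.onerror = function () {
    if (onError) {
      onError(new Error("Network error"));
    }
  };

  xhr.send(null);
}

// Usage (hypothetical URL); everything that needs the document goes in the callback:
// getDOM("https://example.com/some-product", function (dom) {
//   const title = dom.getElementById("productTitle");
//   console.log(title ? title.innerText : "productTitle not found");
// });
```

With this shape, the product class would do its work inside the callback rather than using a return value from getDOM.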