QUICK NOTE: I would strongly prefer any solutions to be implemented via pure JavaScript and/or HTML changes. I don't have anything against jQuery or any other library, framework, or third-party tool, but I'm more interested in learning and improving than in applying the quickest fix without understanding what's going on.
I'm working with a page that displays information about chat sessions on a website. The information is shown in a table (name, location of user, date, etc.), and one column of each entry holds a link to another page where the chat transcript can be viewed. I've been asked to create a button that, when clicked, goes through all the chat records on that particular page, collects all of the transcript data, and exports the result to a .csv file. I tried this a few different ways, and the only one that has worked so far is to loop through the table via the class name attached to the link described above, open an invisible iframe for each link, and read the text data from the iframe. It doesn't seem like the most efficient solution, so even though this isn't my question, if anyone has a different way of doing this I'm certainly open to ideas.
The function I'm currently using looks like this:
    async function getFileContents() {
        var viewLinks = document.getElementsByClassName('view-link');
        var output = "";
        for (var i = 0; i < viewLinks.length; i++) {
            await new Promise(function(resolve, reject) {
                var dataWindow = document.createElement("iframe");
                dataWindow.setAttribute("src", viewLinks[i].href);
                dataWindow.setAttribute("base", "target = _parent");
                dataWindow.style.display = "none";
                document.body.appendChild(dataWindow);
                dataWindow.onload = function() {
                    var iframe = dataWindow.contentDocument;
                    var transcriptTextHeader = iframe.querySelector(".transcript-text").textContent;
                    var transcriptText = iframe.querySelector('#transcript').textContent;
                    var formattedText = `${transcriptTextHeader} ${transcriptText} \n`;
                    output += formattedText;
                    resolve(output);
                    document.body.removeChild(dataWindow);
                }
            });
        }
        download(output, "testoutput.csv");
        return output;
    }
Everything works correctly except for one problem that I haven't yet been able to figure out: the header text (retrieved with querySelector(".transcript-text")) comes through fine, but the transcript text itself, retrieved with querySelector('#transcript'), is never pulled. Headers look fine in the output file, but there's no text underneath any of them. I've tried everything I can think of, but nothing seems to access the text at all. For reference, here is a short skeleton of the relevant section of the HTML on the transcript viewing page I'm opening in the iframes.
    <div class = "transcript-data">
        <!-- content here -->
        <div class = "transcript-text">
            <!-- header content is retrieved from here -->
            <pre id = "transcript">
                <!-- transcript text is here but not retrieved properly -->
            </pre>
        </div>
    </div>
My only two thoughts were the following:
The function isn't capable of retrieving multiple elements at once. I updated the selectors to try only the transcript text and it still didn't work, so this doesn't seem to be the case.
The section is not entirely loaded when the retrieval is done. This doesn't seem to be the case either since its parent element is retrieved correctly. Getting output at the wrong time was an issue earlier with this task, which is why I updated it to be asynchronous. The output gathering works correctly in terms of the order in which events occur.
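A minimal sketch of how the second theory could be checked more directly: instead of reading #transcript once in onload, poll it until it holds non-blank text or a timeout passes. The name waitForText and its parameters are hypothetical and not part of the function above, and the sketch assumes the transcript might be filled in by a script after the page's load event, which has not been confirmed.

```javascript
// Hypothetical helper (an assumption, not part of the original function):
// polls getText() until it returns a non-blank string, then resolves with it.
// Rejects if timeoutMs elapses first.
function waitForText(getText, timeoutMs = 5000, intervalMs = 100) {
    return new Promise(function (resolve, reject) {
        var started = Date.now();
        function check() {
            var text = getText();
            if (text && text.trim() !== "") {
                resolve(text);
            } else if (Date.now() - started >= timeoutMs) {
                reject(new Error("Timed out waiting for text"));
            } else {
                setTimeout(check, intervalMs);
            }
        }
        check();
    });
}

// Possible use inside the iframe's onload handler (the handler would need
// to be declared async for this to work):
//   var transcriptText = await waitForText(function () {
//       var pre = dataWindow.contentDocument.querySelector("#transcript");
//       return pre ? pre.textContent : "";
//   });
```

If the promise resolves with text, the element was simply empty at the moment onload fired; if it times out, the cause is more likely something else.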
When I look through the Chrome console and select the DOM elements of each iframe, I can see the text properly; it just isn't being pulled by the code, even though its parent seems to have no problem. If anyone has ideas as to what might be occurring here, any insight would be greatly appreciated.