In my Chrome extension I have a content script that outputs the page's initial source HTML (as seen in View Source) by simply using `document.documentElement.innerHTML`.

But what I need is the generated source (the current DOM after all JavaScript has finished executing, as seen in Inspect Element). I've read various websites and SO questions related to this, but they only discussed it in terms of a request from an outside source, not a Chrome extension. Some of the options I've read about were:

  • Run the URL through a virtual browser hosted on your server to see how a browser would interpret the source, and return the generated source
  • Scrape the page's initial source, somehow listen for and record all JavaScript executions, then execute those commands on the initial source to try to re-create the generated source

Since a Chrome extension's content scripts presumably run in parallel within the open tab's page, is there some simpler or more efficient solution? Can I just wait for all the initial JavaScript to finish executing, then grab the current DOM or source?

NOTE: I don't need to keep track of the DOM after any extra JavaScript commands are executed. I just need one snapshot after the JavaScript that executes on every page load has finished.

I apologize in advance if this sounds naive; I'm new to making Chrome extensions. Any links to good resources, tutorials, or examples would be greatly appreciated.

Thank you for your time.

Devon Bernard
  • When is your content script run/injected (i.e. at what phase of page loading)? What makes you think that `document.documentElement.innerHTML` does not return the "generated source"? Have you encountered any case where it returned something different? – gkalpak Nov 04 '13 at 06:07
  • My content script runs at "document_end" (after all the static HTML has finished loading). As a result, almost any dynamically loaded content loads after my content script is called. I currently make my code work by adding a listener and waiting for the element I'm looking for to be loaded, but this slightly slows down page loading; I'm just wondering if there is a more efficient way. – Devon Bernard Nov 04 '13 at 06:12
  • Also, to your specific point about the generated source: `document_end` was the best manifest `run_at` value I could find for this situation, so I picked that one. Since `document_end` fires between when the static HTML has finished loading and when the slow processes like loading images and JavaScript function calls start, the DOM "snapshot" is taken at the initial condition and not after the proper content is generated. Technically I can use the same call to get the generated content; I just need to wait for that content to load before making the call. – Devon Bernard Nov 04 '13 at 06:18
  • How is "document_end" the "best value" for the `run_at` attribute? The default "document_idle" seems a better fit. In any case, if you want to be able to wait for dynamically, asynchronously loaded content, then there is no other solution than listening to DOM changes. I highly doubt, though, that (if done properly) this would slow down a page's loading process (as you seem to claim). – gkalpak Nov 04 '13 at 06:34
  • I did consider `document_idle`, but depending on the load times of various pages it seemed to execute at various times: sometimes after a particular element was loaded and sometimes before. With my old implementation, I think the slow-down could have been because I forgot to add a `setInterval` to wait a good amount of time between checks. Either way, after reading some more examples, I think http://stackoverflow.com/questions/13917047/how-to-get-a-content-script-to-load-after-a-pages-javascript-has-executed had an efficient solution. – Devon Bernard Nov 04 '13 at 06:44
  • This sure is a viable solution. Nevertheless, I would probably go for "document_start" and use a `MutationObserver` instead of the less efficient `setInterval` (but of course it all depends on your specific requirements). – gkalpak Nov 04 '13 at 07:38
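
The `MutationObserver` approach suggested in the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions, not code from the thread: `waitForElement` is a hypothetical helper name, `'#content'` is a placeholder selector for whatever dynamically loaded element the extension is waiting on, and the `root` and `ObserverCtor` parameters exist only so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper (not from the thread): resolves with the first element
// matching `selector`, either immediately or once a DOM mutation adds it.
// `root` and `ObserverCtor` default to the browser globals; they are
// parameters only so the function can be tested with stand-ins.
function waitForElement(selector, root = document, ObserverCtor = MutationObserver) {
  return new Promise((resolve) => {
    // The element may already exist by the time the content script runs.
    const existing = root.querySelector(selector);
    if (existing) {
      resolve(existing);
      return;
    }

    // Otherwise re-check on every batch of DOM changes instead of polling
    // on a timer with setInterval.
    const observer = new ObserverCtor(() => {
      const found = root.querySelector(selector);
      if (found) {
        observer.disconnect(); // stop observing once the element appears
        resolve(found);
      }
    });
    observer.observe(root, { childList: true, subtree: true });
  });
}

// Content-script usage (skipped when no DOM is available).
// '#content' is an assumed placeholder selector.
if (typeof document !== 'undefined') {
  waitForElement('#content').then(() => {
    // Same call as in the question, made after the awaited element exists.
    const snapshot = document.documentElement.innerHTML;
    console.log('Snapshot length:', snapshot.length);
  });
}
```

This only waits for one known element, which matches the "one snapshot after page load" requirement in the question; it does not detect that all JavaScript on the page has finished.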

0 Answers