I'd like to write a program that lists the URLs of all the files needed to load a given web page (images, CSS files, JavaScript files, etc.). Basically, I want a program that generates the same list of files that the Network tab in Chrome's Developer Tools shows (or Firefox's Firebug plugin).
wget would be the easy answer, but it doesn't seem to execute JavaScript, which can often pull in further dependencies (e.g. by writing an image tag into the document). I'm wondering if Python's webkit module could help. It can fully render a web page, so at some point it must know how to find all the dependencies.
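To make the limitation concrete, here is a minimal sketch of a purely static scan, using only Python's standard library. The names `ResourceCollector` and `static_resources`, the sample page, and the example.com base URL are all made up for illustration, and the tag list is an assumption rather than a complete inventory. It finds the resources written directly into the markup but, like wget, never sees the image that the inline script would add:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed (incomplete) mapping of tag name -> attribute holding a resource URL.
RESOURCE_ATTRS = {"img": "src", "script": "src", "link": "href", "iframe": "src"}


class ResourceCollector(HTMLParser):
    """Collects resource URLs that appear literally in the HTML markup."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attr_name = RESOURCE_ATTRS.get(tag)
        if attr_name is None:
            return
        for name, value in attrs:
            if name == attr_name and value:
                # Resolve relative references against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def static_resources(html, base_url):
    """Return the resource URLs found by a static scan (no JavaScript is run)."""
    parser = ResourceCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.urls


# Hypothetical page: three static resources plus one added by a script.
SAMPLE_HTML = """
<html><head>
<link rel="stylesheet" href="/style.css">
<script src="app.js"></script>
</head><body>
<img src="logo.png">
<script>document.write('<img src="late.png">');</script>
</body></html>
"""

if __name__ == "__main__":
    for url in static_resources(SAMPLE_HTML, "http://example.com/"):
        print(url)
```

Running this prints the stylesheet, the external script, and `logo.png`, but not `late.png`, because the parser treats the contents of the inline `<script>` as opaque text. That missing entry is exactly the kind of dependency I want to capture.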
I came across this method for executing JavaScript from Python, but when I try it on Google.co.uk the result is an HTML page with an empty body: http://blog.motane.lu/2009/06/18/pywebkitgtk-execute-javascript-from-python/
Is there a way to tap into the inner workings of the webkit module to get a list of all the files it used to render a page? Does anyone have any other suggestions?
Cheers!