This is a follow-up to my popular and technically challenging question, "HTML injection into someone else's website?".
To recap: I'd like to demo my technology to customers without actually modifying their live website (e.g. demoing the idea of Stackoverflow financial bounties, without modifying the live site). Essentially, I'm trying to create a server-side version of Greasemonkey.
I've implemented the mirror as follows:
- A request comes into http://myserver.com/forward?uri=[remote]
- My server opens a connection to [remote], pulls down the data and returns its body and headers in response to the request from the first step.
I chose this syntax because I needed to handle requests from multiple domains (meaning, if stackoverflow.com links to meta.stackoverflow.com, I need to handle both domains from the same forwarding server).
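The mapping implied by this syntax can be sketched roughly as follows. This is only an illustration, not the actual server code: the function names `toForwardUrl` and `fromForwardUrl` are hypothetical, and the sketch assumes the remote URL is percent-encoded into the `uri` parameter.

```javascript
// Forwarding endpoint, taken from the URL pattern described above.
const FORWARD_ENDPOINT = "http://myserver.com/forward";

// Hypothetical helper: resolve `link` against the page it appeared on,
// then wrap the absolute URL in the forward?uri=... syntax.
// Links that are not HTTP(S) (mailto:, javascript:, data:) are left alone.
function toForwardUrl(link, pageUrl) {
  let absolute;
  try {
    absolute = new URL(link, pageUrl);
  } catch (e) {
    return link; // not parseable as a URL; leave untouched
  }
  if (absolute.protocol !== "http:" && absolute.protocol !== "https:") {
    return link;
  }
  return FORWARD_ENDPOINT + "?uri=" + encodeURIComponent(absolute.href);
}

// Hypothetical inverse: recover the remote URL from a forwarded request.
function fromForwardUrl(forwardUrl) {
  return new URL(forwardUrl).searchParams.get("uri");
}
```

Because the remote host travels inside the query string, a link from stackoverflow.com to meta.stackoverflow.com needs no special handling: both end up as different `uri` values on the same forwarding server.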
I have managed to rewrite links in the HTML and CSS files so they are relative to my server. The final hurdle is rewriting URLs referenced by JavaScript files.
What is the best way to programmatically rewrite URLs referenced by someone else's JavaScript code? Is this even technically doable?
Discussion
I'll give you an example of the technical hurdle I am facing. Take http://www.honda.com/ for example. They embed a Flash element on the page, but instead of embedding <object> directly, they use JavaScript to document.write() the <object> tag containing the URL.
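To make the hurdle concrete, here is a hypothetical script in the same style. It is not Honda's actual code; the host, path, and function name are invented for illustration. The point is that the full URL never appears in the script source, so a server-side search-and-replace over the JavaScript text has nothing to match.

```javascript
// Hypothetical values: the full URL only exists once these are joined at runtime.
const mediaHost = "http://www.example.com";
const moviePath = "/flash/intro.swf";

// Build the <object> markup as a string, the way a document.write()-based
// embed script might.
function buildObjectTag(host, path) {
  const url = host + path;
  return '<object type="application/x-shockwave-flash" data="' + url + '"></object>';
}

const markup = buildObjectTag(mediaHost, moviePath);

// Only write into the page when running in a browser.
if (typeof document !== "undefined") {
  document.write(markup);
}
```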
First attempt
- Use https://stackoverflow.com/a/14570614/14731 to listen for DOM change events. Instead of trying to rewrite URLs in the JavaScript code, wait for it to modify the DOM and rewrite the URLs on the fly.
- Intercept all XMLHttpRequest requests using https://stackoverflow.com/a/629782/14731
Ideally we want to intercept DOM changes before they render, so the browser does not request URLs before we have a chance to rewrite them.
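A minimal sketch of both steps, assuming the linked answers refer to `MutationObserver` and to overriding `XMLHttpRequest.prototype.open`. The names `rewriteUrl`, `rewriteElement`, `observeDom`, and `patchXhr` are hypothetical, and for simplicity only absolute http(s) URLs are rewritten:

```javascript
const FORWARD_PREFIX = "http://myserver.com/forward?uri=";
const URL_ATTRIBUTES = ["src", "href", "data", "action"];

// Hypothetical rewrite rule: wrap absolute http(s) URLs in the forwarding
// syntax. Relative URLs and already-rewritten URLs are returned unchanged.
function rewriteUrl(url) {
  if (typeof url !== "string" || url.startsWith(FORWARD_PREFIX)) return url;
  if (!/^https?:\/\//i.test(url)) return url;
  return FORWARD_PREFIX + encodeURIComponent(url);
}

// Rewrite every URL-bearing attribute on a single element.
function rewriteElement(el) {
  if (typeof el.getAttribute !== "function") return; // text nodes, comments
  for (const name of URL_ATTRIBUTES) {
    const value = el.getAttribute(name);
    if (value === null) continue;
    const rewritten = rewriteUrl(value);
    if (rewritten !== value) el.setAttribute(name, rewritten);
  }
}

// Watch for added nodes and changed attributes under `root`.
function observeDom(root) {
  const observer = new MutationObserver((mutations) => {
    for (const mutation of mutations) {
      if (mutation.type === "attributes") rewriteElement(mutation.target);
      for (const node of mutation.addedNodes) {
        rewriteElement(node);
        if (typeof node.querySelectorAll === "function") {
          node.querySelectorAll("[src],[href],[data],[action]").forEach(rewriteElement);
        }
      }
    }
  });
  observer.observe(root, {
    childList: true,
    subtree: true,
    attributes: true,
    attributeFilter: URL_ATTRIBUTES,
  });
  return observer;
}

// Wrap open() so every request URL passes through rewriteUrl first.
function patchXhr(XhrClass) {
  const originalOpen = XhrClass.prototype.open;
  XhrClass.prototype.open = function (method, url, ...rest) {
    return originalOpen.call(this, method, rewriteUrl(String(url)), ...rest);
  };
}

// Install only when running in a browser.
if (typeof window !== "undefined" && typeof MutationObserver !== "undefined") {
  observeDom(document.documentElement);
  patchXhr(window.XMLHttpRequest);
}
```

Note that `MutationObserver` callbacks are delivered after the mutation has already been applied, so the browser may have started fetching the original URL by the time the attribute is rewritten. That is exactly the timing problem described above, and this sketch does not solve it.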
Second attempt
A server-side solution will not work. Even if I can rewrite all DOM URLs, I've seen an example where an embedded Flash application references URLs stored in JavaScript variables. There is no programmatic way to detect that these variables represent URLs because the Flash application is opaque.
Next, I plan on trying a client-side solution. I will load the original website in one frame, and manipulate its contents using JavaScript in a second (hidden) frame. I hope to be able to inject new DOM elements (to demo my product) without having to rewrite the existing elements.