I run a backup website, something like the Wayback Machine. When I return a saved HTML page, the documents it links to (images, JavaScript files, CSS files, etc.) are still loaded from the original web server instead of mine. I want to rewrite those links so that they are loaded from my server. I see two approaches:
- Do it server-side using Java or PHP; I can work with either. For instance, in Java I could use jsoup to parse the HTML and replace the links.
- Do it client-side using jQuery.
The second approach means my server does not have to take on the load of parsing the HTML. However, I think that as soon as the page starts loading, the browser will begin downloading the files from the original server, and the user's bandwidth will be wasted.
On the other hand, if I could somehow determine that an image was downloaded successfully from the original server, I could skip the download from my server and let the user keep that copy.
What is your suggestion for this?
Update
I should clarify how relative and absolute links are handled. My service stores links as absolute URLs, but the HTML documents may contain both kinds. What I need to do is:
- Convert http://stackoverflow.com/images/image.png to http://mysite.com/view/content?url=http://stackoverflow.com/images/image.png
- Convert /images/image.png (on http://stackoverflow.com) to http://mysite.com/view/content?url=http://stackoverflow.com/images/image.png
In short, relative links in the HTML should be converted to absolute links and then passed to my website as the url argument.
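
To make the conversion concrete, here is a minimal sketch of the rewriting step in plain Java, using only java.net.URI. The class name LinkRewriter, the method rewrite, and the sample page URL are hypothetical; the proxy prefix is taken from the example URLs above. It only illustrates the URL transformation, not the HTML parsing around it.

```java
import java.net.URI;

final class LinkRewriter {
    // Assumption: proxy endpoint copied from the example URLs in the question.
    private static final String PROXY_PREFIX = "http://mysite.com/view/content?url=";

    /**
     * Resolves a link against the URL of the archived page and wraps the
     * result in the proxy URL. Throws IllegalArgumentException if either
     * string is not a syntactically valid URI (e.g. contains raw spaces).
     */
    static String rewrite(String pageUrl, String link) {
        // URI.resolve leaves the link unchanged when it is already absolute.
        URI absolute = URI.create(pageUrl).resolve(link.trim());
        // Note: the target URL is appended as-is to match the format above.
        // Percent-encoding it would be safer if it can contain '&' or '#'.
        return PROXY_PREFIX + absolute;
    }

    public static void main(String[] args) {
        String page = "http://stackoverflow.com/questions/1"; // hypothetical page URL
        System.out.println(rewrite(page, "/images/image.png"));
        System.out.println(rewrite(page, "http://stackoverflow.com/images/image.png"));
        // Both lines print:
        // http://mysite.com/view/content?url=http://stackoverflow.com/images/image.png
    }
}
```

With the server-side approach, something like this would be applied to each src and href attribute found by the HTML parser. With the client-side approach, the same logic would have to run in the browser before the original requests are issued.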