My JavaScript app retrieves a webpage by XHR then parses it like this:
var el = document.createElement( 'html' );
el.innerHTML = xml;
var links = el.getElementsByTagName( 'a' );
In the process, the links' href attributes get reinterpreted as relative to this document, so I get links like http://localhost:8000/download.zip.
I tried hacking my way around it:
if (link.origin === document.origin) {
link.href = link.href.replace(link.origin, h.url.replace(/\/$/, ''));
}
But that can't distinguish between a page at foo.org/bar (where the link should become foo.org/bar/download.zip) and a page at foo.org/bar.php (where it should become foo.org/download.zip), and I don't really want to go down the rabbit hole of working out exactly what substitutions to perform.
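To make the behaviour I'm after concrete, here is a minimal sketch of the resolution rules using the standard URL constructor. The helper name resolveHref and the example page URLs are made up for illustration, and I'm assuming the directory-style page is fetched with a trailing slash (foo.org/bar/); this is not code from my app.

```javascript
// Hypothetical helper (not part of my app): resolve a raw href value
// against the URL the page was fetched from, using the standard URL API.
function resolveHref(rawHref, pageUrl) {
  return new URL(rawHref, pageUrl).href;
}

// Example page URLs are assumptions for illustration only.
var fromDir = resolveHref('download.zip', 'http://foo.org/bar/');
var fromFile = resolveHref('download.zip', 'http://foo.org/bar.php');

console.log(fromDir);  // http://foo.org/bar/download.zip
console.log(fromFile); // http://foo.org/download.zip
```

In the app, the raw value would have to come from link.getAttribute('href') rather than link.href, since link.href has already been resolved against the current document by the time I read it.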
I tried injecting either a <base href=...> or <xml:base=xxx> into the document, but that didn't work.
What am I missing? This seems like a common enough need.
I'm not using jQuery or anything similar (and can't).