Adding some more information here in case it helps others.
Firstly, the basic technique given in other answers is correct: when you get an HTTP 403 error from a Java program (such as an XML parser) that is attempting to access an HTTP resource, but typing the same URI into your web browser is successful, then you may need to set up request headers that mislead the site into thinking that the request is coming from a browser.
One current example I've found where this is happening is the schema at https://www.musicxml.org/xsd/xml.xsd
If you need just a single file, and you are invoking the parser on that file directly, then you can create an InputSource "by hand" and pass it to the XML parser.
Assuming that what you are doing is parsing XML, you can follow the code suggested by @zsbappa:
URLConnection connection = new URL(uriString).openConnection();
connection.setRequestProperty("User-Agent",
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11");
connection.connect();
InputSource inputSource = new InputSource(connection.getInputStream());
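For anyone who wants that snippet in a reusable form, here is a minimal sketch. The class name `BrowserLikeFetch` and its two methods are my own invention for illustration, not part of any library. It also sets the system ID on the `InputSource`, which the snippet above leaves out; without it the parser has no base URI against which to resolve relative references inside the document.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import org.xml.sax.InputSource;

/** Hypothetical helper (not from any library): fetches a resource with a browser-like User-Agent. */
final class BrowserLikeFetch {

    // The same browser-like string as in the snippet above; other browser-like values may work too.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) "
            + "Chrome/23.0.1271.95 Safari/537.11";

    private BrowserLikeFetch() {}

    /** Creates the connection object and sets the header; nothing is sent over the network yet. */
    static URLConnection open(String uriString) throws IOException {
        URLConnection connection = URI.create(uriString).toURL().openConnection();
        connection.setRequestProperty("User-Agent", USER_AGENT);
        return connection;
    }

    /** Connects (implicitly, via getInputStream) and wraps the response in an InputSource. */
    static InputSource toInputSource(String uriString) throws IOException {
        URLConnection connection = open(uriString);
        InputSource inputSource = new InputSource(connection.getInputStream());
        // Record where the bytes came from so relative references can be resolved.
        inputSource.setSystemId(uriString);
        return inputSource;
    }
}
```

You would then pass `BrowserLikeFetch.toInputSource(uriString)` to whatever parse method you are already calling.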
But if you're reading the file via an XSLT processor such as Saxon, or if the file contains references to other files that the XML parser also needs to read (for example DTDs, external entities, or schema documents), then it's not quite so easy. What you need to do in that case is to configure an EntityResolver on the parser. It will typically look something like this:
xmlReader.setEntityResolver((publicId, systemId) -> {
    // Match both http: and https: URIs (the example schema above is served over https)
    if (systemId.startsWith("http:") || systemId.startsWith("https:")) {
        URLConnection connection = new URL(systemId).openConnection();
        connection.setRequestProperty("User-Agent",
            "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11");
        connection.connect();
        return new InputSource(connection.getInputStream());
    } else {
        // Returning null tells the parser to resolve the entity in its default way
        return null;
    }
});
If you're calling Saxon and Saxon is calling the XML parser, you can supply your EntityResolver to Saxon either as an option on the Transform command line (-er:classname) or as an option on the Saxon Configuration. For example:
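// Pass the class name as a String, since this option names a class for Saxon to instantiate
transformerFactory.setAttribute(
    FeatureKeys.ENTITY_RESOLVER_CLASS, MyEntityResolver.class.getName());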
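The `MyEntityResolver` class named there isn't shown, so here is a minimal sketch of what it might look like. The class body and the `isHttp` helper are my own assumptions. I have made the class public with a public no-argument constructor on the assumption that a resolver supplied by class name has to be instantiated reflectively. It matches both `http:` and `https:` system IDs, and sets the system ID on the returned `InputSource` so that relative references from the fetched file still resolve.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Sketch of a resolver that fetches HTTP(S) entities with a browser-like User-Agent. */
public class MyEntityResolver implements EntityResolver {

    private static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) "
            + "Chrome/23.0.1271.95 Safari/537.11";

    /** Public no-arg constructor so the class can be created from its name. */
    public MyEntityResolver() {}

    /** Hypothetical helper: true for system IDs using the http or https scheme. */
    static boolean isHttp(String systemId) {
        return systemId != null
                && (systemId.startsWith("http:") || systemId.startsWith("https:"));
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        if (!isHttp(systemId)) {
            // Anything else (file:, jar:, ...) is left to the parser's default handling.
            return null;
        }
        URLConnection connection = URI.create(systemId).toURL().openConnection();
        connection.setRequestProperty("User-Agent", USER_AGENT);
        InputSource source = new InputSource(connection.getInputStream());
        // Keep the original URI as the base for any relative references in the entity.
        source.setSystemId(systemId);
        return source;
    }
}
```

The same class also works when you drive the parser yourself: `xmlReader.setEntityResolver(new MyEntityResolver());`.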