I'm writing a web scraper in Java, but I'm behind a proxy server and can't get a connection through it.
This is the connection code:
    public void scrape(String url, String filename) throws Exception {
        this.url = url;
        this.filename = filename;
        System.out.println("Scraping " + url);
        System.out.println("Saving to \"" + this.filename + "\"");
        try {
            makeConnection();
            createStream();
            writeToFile();
            System.out.println("Scrape was successful");
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
        }
    }

    private void makeConnection() throws Exception {
        // Set proxy info
        System.setProperty("java.net.useSystemProxies", "true");
        URL address = new URL(url);
        connection = address.openConnection();
    }
This is the output:
    Scraping http://feeds.bbci.co.uk/news/northern_ireland/rss.xml
    Saving to "../rss/northern_ireland.xml"
    Error: Connection timed out
Is there a better way to configure the proxy?
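One alternative I could try, instead of relying on java.net.useSystemProxies, is passing an explicit java.net.Proxy to openConnection. This is only a minimal sketch under assumptions: the class name ProxyConnectionSketch, the helper names buildProxy and openViaProxy, and the 10-second timeouts are illustrative and not part of my scraper, and the real proxy host and port would have to come from the network administrator.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper class, not part of the scraper shown above.
class ProxyConnectionSketch {

    // Builds an HTTP proxy descriptor. createUnresolved defers the DNS lookup
    // of the proxy host until a connection is actually made.
    public static Proxy buildProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Prepares (but does not yet open) a connection routed through the proxy.
    // The timeout values are an assumption chosen for illustration.
    public static URLConnection openViaProxy(String url, Proxy proxy) throws IOException {
        URLConnection connection = URI.create(url).toURL().openConnection(proxy);
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(10_000);
        return connection;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: ProxyConnectionSketch <proxyHost> <proxyPort> <url>");
            return;
        }
        Proxy proxy = buildProxy(args[0], Integer.parseInt(args[1]));
        URLConnection connection = openViaProxy(args[2], proxy);
        connection.connect(); // the network round trip happens here
        System.out.println("Connected, content type: " + connection.getContentType());
    }
}
```

In makeConnection() this would amount to replacing address.openConnection() with address.openConnection(proxy). A proxy that requires authentication would need separate handling (for example through java.net.Authenticator), which this sketch does not cover.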