0

I have a Java Spark application that retrieves data from a website as follows:

    while (true) {
        try {
            connection = (HttpURLConnection) uRL.openConnection();
            /* optional: the default method is GET */
            connection.setRequestMethod("GET");

            /* add request header */
            connection.setRequestProperty("User-Agent", USER_AGENT);
            /* 0 means no read timeout; set it before the request is sent */
            connection.setReadTimeout(0);
            /* send the request and read the response code */
            connection.getResponseCode();
            /* open a reader on the response body */
            bufferedReader = new BufferedReader(new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8));
            break;
        } catch (Exception e) {
            /* any failure is logged and the request is retried indefinitely */
            LOGGER.error("Error in querying Wikipedia: " + e.getMessage());
            continue;
        }
    }
    response = new StringBuffer();
    while ((inputLine = bufferedReader.readLine()) != null) {
        response.append(inputLine);
        response.append("\n");
    }
    bufferedReader.close();

This code works well on Windows.

However, on a CentOS machine that sits behind an HTTP and HTTPS proxy server, it fails with a connection timeout. I set the system properties for the HTTPS proxy in the application and confirmed that it works for some links. For others it still fails, even though fetching the same URL with wget on the Linux server works.
Link that doesn't work: https://ar.wikipedia.org/w/api.php?action=query&format=xml&titles=%D9%82%D8%B1%D9%89&redirects&prop=pageprops|categories&cllimit=500
Link that works: https://ar.wikipedia.org/w/api.php?action=query&format=xml&list=allpages&apnamespace=14&apfilterredir=nonredirects&aplimit=500
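
For reference, here is a minimal sketch of how the HTTPS proxy properties can be set from code and then checked by asking the JVM's default ProxySelector which proxy it would pick for a given URL. The class name ProxyCheck, the helper proxiesFor, and the address proxy.example.com:3128 are placeholders for illustration, not values from the actual setup.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

final class ProxyCheck {

    /**
     * Sets the JVM-wide HTTPS proxy properties, then asks the default
     * selector which proxies it would try for the target URI.
     */
    static List<Proxy> proxiesFor(String proxyHost, int proxyPort, URI target) {
        System.setProperty("https.proxyHost", proxyHost);
        System.setProperty("https.proxyPort", Integer.toString(proxyPort));
        return ProxySelector.getDefault().select(target);
    }

    public static void main(String[] args) {
        URI target = URI.create(
                "https://ar.wikipedia.org/w/api.php?action=query&format=xml");
        // proxy.example.com:3128 is a placeholder, not a real proxy
        for (Proxy proxy : proxiesFor("proxy.example.com", 3128, target)) {
            System.out.println(target.getHost() + " -> " + proxy);
        }
    }
}
```

Running the same check for both links would show whether the selector treats them differently. Note that URI.create rejects unescaped characters such as |, so a link containing one has to be written with %7C before it can be passed to this check.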

zero323
fattah.safa

2 Answers

1

Java doesn't necessarily respect your system's default proxy settings. Since you are able to fetch the URL with wget on the Linux machine, the most likely explanation is that Java is not using the proxy that you have configured. The following links explain various ways to configure proxies for Java:
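
As one illustration of such configuration, a proxy can also be passed explicitly to openConnection instead of relying on system properties. This is only a sketch under assumptions: the class name ExplicitProxyRequest, the helper prepare, the proxy address, the User-Agent value, and the timeout values are all made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

final class ExplicitProxyRequest {

    /** Prepares (but does not send) a GET request routed through an HTTP proxy. */
    static HttpURLConnection prepare(String address, String proxyHost, int proxyPort) {
        try {
            // The proxy host name is resolved only when the connection is made
            Proxy proxy = new Proxy(Proxy.Type.HTTP,
                    InetSocketAddress.createUnresolved(proxyHost, proxyPort));
            HttpURLConnection connection =
                    (HttpURLConnection) new URL(address).openConnection(proxy);
            connection.setRequestMethod("GET");
            // Placeholder header value for this sketch
            connection.setRequestProperty("User-Agent", "proxy-sketch/0.1");
            // Finite timeouts (assumed values) so a stuck request fails instead of hanging
            connection.setConnectTimeout(10_000);
            connection.setReadTimeout(30_000);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // proxy.example.com:3128 is a placeholder, not a real proxy
        HttpURLConnection connection = prepare(
                "https://ar.wikipedia.org/w/api.php?action=query&format=xml",
                "proxy.example.com", 3128);
        System.out.println(connection.getRequestMethod() + " " + connection.getURL());
    }
}
```

Nothing is sent until getResponseCode() or getInputStream() is called on the returned connection, so the settings can be inspected or adjusted first.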

Community
Stephen C
  • Thanks for your comment. I already mentioned that it works for some links as I already set the Proxy inside the application. – fattah.safa Apr 29 '17 at 15:05
  • I'm afraid that what you wrote in your question is unclear. In particular, you DO NOT say that the proxy is working in your Java application for some URLs and not others. What is the difference between the URLs? – Stephen C Apr 29 '17 at 15:23
  • Actually I mentioned that. Anyway, I just updated the question. – fattah.safa Apr 29 '17 at 15:41
  • I will repeat what I said. Your question was unclear. The title says that the code works on Windows and fails on Linux. Then you said *"I set the system Properties for the HTTPs Proxy for the application and make sure it works"*. Obviously, since your title says that "it doesn't work on Linux", that is a contradiction. Unclear. Your update now makes it clearer, but I don't have time to help you right now. – Stephen C Apr 29 '17 at 16:15
-1

I'm using Ubuntu, and this worked for me:

    // Needs imports from java.io and java.net, plus java.nio.charset.StandardCharsets.
    try {
        URL obj = new URL(url);
        HttpURLConnection con = (HttpURLConnection) obj.openConnection();
        con.setRequestMethod("GET");

        // Send the request and read the status code
        int responseCode = con.getResponseCode();
        System.out.println("Response code: " + responseCode);

        StringBuilder response = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
        }
        System.out.println(response.toString());
    } catch (IOException e) {
        // MalformedURLException and ProtocolException are subclasses of IOException
        e.printStackTrace();
    }