I all but copied the following code from here. I get a java.net.SocketException on line 10 (the `con.getInputStream()` call) saying "Connection Reset".

import java.net.*;
import java.io.*;
import org.apache.commons.io.*;

public class HelloWorld {
    public static void main(String[] x) {
        try {
            URL url = new URL("http://money.cnn.com/2013/06/07/technology/security/page-zuckerberg-spying/index.html");
            URLConnection con = url.openConnection();
            InputStream in = con.getInputStream();
            String encoding = con.getContentEncoding();
            encoding = encoding == null ? "UTF-8" : encoding;
            String body = IOUtils.toString(in, encoding);
            System.out.print(body);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

I'm worried this may not actually be an issue with the code itself but rather some permission I need to give Java. Is there something wrong with my code, or is this an environment issue?

Jake
  • Which OS? Which environment? (hint: in corporate environments, it is very likely that you won't be allowed direct connections to web servers, and that you'll have to go through a proxy) – fge Jun 11 '13 at 15:09
  • I agree @fge; you may also try modifying the User-Agent, as some proxies or IDS systems block requests that appear to come from a bot. Try adding something like `System.setProperty("http.agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36");` before opening the connection (see the sketch after these comments). – gma Jun 11 '13 at 15:15
  • I'm on Ubuntu 12. I'm using direct connections (unless Python is doing something behind the scenes for me) in a number of other projects. I've tried a number of sites and nothing seems to work. – Jake Jun 11 '13 at 15:19
  • I tried the System.setProperty suggestion. I placed that exact call just before I declared the URL variable. It seems not to have changed anything. As another note, I have an actual bot (Apache Nutch) written in Java crawling Wikipedia articles as we speak. – Jake Jun 11 '13 at 15:22
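
For reference, a minimal sketch combining the two suggestions from the comments: the agent string is the one gma quoted, and the proxy host and port are placeholders that would need to match a real corporate proxy, if there is one. Note that `http.agent` is typically read only once, so it has to be set before the first HTTP connection the JVM opens.

import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public class AgentAndProxyCheck {
    public static void main(String[] args) throws Exception {
        // JVM-wide default User-Agent (the string gma suggested); set it
        // before the first HTTP connection is opened.
        System.setProperty("http.agent",
                "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36");

        // If a corporate proxy is required (fge's point), the standard
        // properties are http.proxyHost / http.proxyPort. Placeholder values:
        // System.setProperty("http.proxyHost", "proxy.example.com");
        // System.setProperty("http.proxyPort", "8080");

        URL url = new URL("http://money.cnn.com/2013/06/07/technology/security/page-zuckerberg-spying/index.html");
        URLConnection con = url.openConnection();
        // Alternatively, override the header on this one connection only.
        con.setRequestProperty("User-Agent", "Mozilla/5.0");

        try (InputStream in = con.getInputStream()) {
            System.out.println("Connected; content type: " + con.getContentType());
        }
    }
}

If this still fails with "Connection Reset", the User-Agent is probably not the cause.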

1 Answer


I used your code with a small modification, because I don't have IOUtils at hand, and it works as it should. There is no need to set the agent, and no special privileges either, since I ran it as a normal user.

    try {
        URL url = new URL("http://money.cnn.com/2013/06/07/technology/security/page-zuckerberg-spying/index.html");
        URLConnection con = url.openConnection();
        InputStream in = con.getInputStream();
        // Read the body with a plain BufferedReader instead of IOUtils.
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();
        while (line != null) {
            sb.append(line); // note: line separators are dropped here
            line = br.readLine();
        }
        System.out.print(sb.toString());
    } catch (Exception e) {
        e.printStackTrace();
    }
INeedMySpace
  • This code doesn't change anything for me, however. As soon as I get to `con.getInputStream()` I get the exception. This seems to confirm that a configuration difference between our systems is causing this. – Jake Jun 11 '13 at 15:55
  • Could you run `wget http://money.cnn.com/2013/06/07/technology/security/page-zuckerberg-spying/index.html` in a console? What is in index.html? – INeedMySpace Jun 11 '13 at 15:58
  • It downloads it perfectly without issue – Jake Jun 11 '13 at 18:38
  • I would advise two more actions: check that the libs are up to date and that you have no old libraries on the CLASSPATH, and use a network analyser like tcpdump or Wireshark (Wireshark is better) to see what happens at the network layer. You can also run a web server locally and try local connections (and see how the packets differ). – INeedMySpace Jun 12 '13 at 13:57
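
As a rough sketch of that last suggestion (running a web server locally and trying a local connection), the JDK's built-in com.sun.net.httpserver can serve a test page on an arbitrary port such as 8000. The client side below uses the same URLConnection calls as the question; the lambda handler assumes Java 8 or later.

import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLConnection;

public class LocalConnectionTest {
    public static void main(String[] args) throws Exception {
        // Tiny local HTTP server that answers every request with a fixed body.
        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/", exchange -> {
            byte[] body = "hello from localhost".getBytes("UTF-8");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();

        // Same client-side calls as in the question, but against localhost.
        URL url = new URL("http://localhost:8000/");
        URLConnection con = url.openConnection();
        BufferedReader br = new BufferedReader(new InputStreamReader(con.getInputStream()));
        String line;
        while ((line = br.readLine()) != null) {
            System.out.println(line);
        }
        br.close();

        server.stop(0);
    }
}

If this prints the body while the CNN URL still resets, the problem is most likely on the network path (proxy, firewall, IDS) rather than in the Java code.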