I am trying to run a crawler program from my office. It is a very basic one that is available on the internet, and it works fine on my home PC. However, when I run the same program on my office PC, I get a "connect timed out" error. I suspected a proxy problem, but accessing a site from Eclipse's internal browser worked fine.

 Document doc = Jsoup.connect("http://flipkart.com/").timeout(0).get(); 

Here is my stack trace:

Exception in thread "main" java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:449)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:434)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:181)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:170)
at org.syntel.crawler.Crawler.processPage(Crawler.java:44)
at org.syntel.crawler.Crawler.main(Crawler.java:20)

How can I fix this problem?

Jeffrey Bosboom
KRam1802
  • Try setting a user agent. If you are using a proxy, check this: http://stackoverflow.com/questions/7482748/how-to-add-proxy-support-to-jsoup-html-parser – Alkis Kalogeris Mar 17 '15 at 08:22
  • @alkis, you answer many questions in comments. Why not post them as real answers? – Jonas Czech Mar 26 '15 at 20:27
  • Hi @JonasCz. These kinds of questions have been answered before on SO. Some of them I've answered myself (the user agent part). I wasn't sure, though, that this was the case with this one. If I had been sure, I would have marked it as a duplicate instead of leaving a comment. – Alkis Kalogeris Mar 27 '15 at 05:26
  • I have moved the solution to an answer. You can accept that answer now by checking the box provided. – Brian Tompsett - 汤莱恩 Apr 30 '15 at 20:28

2 Answers

@alkis made the suggestion:

Try setting a user agent. If you are using a proxy, check this other question: How to add proxy support to Jsoup (HTML parser)?
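
One way to apply the proxy part of this suggestion is through the standard JVM proxy properties. The stack trace in the question goes through HttpURLConnection, which reads `http.proxyHost` and `http.proxyPort`, so a sketch like the following may be enough. The class name `ProxyConfig`, the host `proxy.example.com`, port 8080, and the `Mozilla/5.0` user agent string are placeholders, not values from the question. The jsoup call appears only as a comment so that the block compiles without jsoup on the classpath.

```java
final class ProxyConfig {
    // Hypothetical values: replace with your office proxy's host and port.
    static final String PROXY_HOST = "proxy.example.com";
    static final int PROXY_PORT = 8080;

    /**
     * Sets the JVM-wide proxy properties that HttpURLConnection consults
     * for http:// and https:// URLs.
     */
    static void applySystemProxy(String host, int port) {
        String portText = Integer.toString(port);
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", portText);
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", portText);
    }

    public static void main(String[] args) {
        applySystemProxy(PROXY_HOST, PROXY_PORT);
        System.out.println("http.proxyHost = " + System.getProperty("http.proxyHost"));
        System.out.println("http.proxyPort = " + System.getProperty("http.proxyPort"));

        // With jsoup on the classpath, the request from the question could then
        // be issued with an explicit user agent (the string is an assumption):
        // Document doc = Jsoup.connect("http://flipkart.com/")
        //         .userAgent("Mozilla/5.0")
        //         .get();
    }
}
```

The properties must be set before the first connection is opened. The same values can also be passed on the command line as `-Dhttp.proxyHost=... -Dhttp.proxyPort=...` instead of being set in code.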

Community
Brian Tompsett - 汤莱恩
  • ([Answered in comments and converted to a community wiki.](http://meta.stackoverflow.com/questions/251597/question-with-no-answers-but-issue-solved-in-the-comments)) – Brian Tompsett - 汤莱恩 Apr 30 '15 at 20:27
  • This is not a high quality answer. Please refrain from resurrecting low-quality questions to repost comments as low-quality answers. – nobody May 03 '15 at 13:05

Try using:

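import java.net.InetSocketAddress;
import java.net.Proxy;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

System.out.println("Testing JSOUP\n--------------");
// Replace the placeholder host and port with your office proxy's values.
Proxy proxy = new Proxy(
        Proxy.Type.HTTP,
        InetSocketAddress.createUnresolved("www.yourPROXY.com", 80)
);
// Route this request through the proxy (needs a jsoup version that provides Connection.proxy).
Document doc = Jsoup.connect("http://en.wikipedia.org/").proxy(proxy).get();
Elements newsHeadlines = doc.select("#mp-itn b a");
System.out.println(newsHeadlines.html());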
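
Before passing a proxy to jsoup, it can help to check whether the target site and the proxy are reachable at the TCP level at all. The stack trace in the question fails in `Socket.connect`, so a plain socket probe reproduces the same step without jsoup. This is only a diagnostic sketch: the class name `ConnectProbe`, the proxy host `proxy.example.com`, port 8080, and the 3-second timeout are assumptions, not values from the question.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class ConnectProbe {
    /**
     * Returns true if a TCP connection to host:port can be opened within
     * timeoutMillis, and false on timeout, refusal, or DNS failure.
     * A timeout of 0 would wait indefinitely, so pass a positive value.
     */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Direct connection to the site from the question.
        System.out.println("direct to flipkart.com:80 reachable: "
                + canConnect("flipkart.com", 80, 3000));
        // Hypothetical proxy address: replace with your office proxy.
        System.out.println("proxy.example.com:8080 reachable: "
                + canConnect("proxy.example.com", 8080, 3000));
    }
}
```

If the direct probe fails while the proxy probe succeeds, the office network most likely only allows outbound HTTP through the proxy, which would be consistent with the Eclipse internal browser working while the crawler times out.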
Om Sao