
I'm trying to connect to a URL using JSoup. It works fine for one URL, but for another I get:

java.net.SocketException: Connection reset

Here is the code that works:

    Document doc = Jsoup.connect("https://finance.yahoo.com/quote/IBM/key-statistics?p=IBM").get();

Here is the code that generates the error:

    Document doc = Jsoup.connect("https://www.nasdaq.com/dividend-stocks/dividend-calendar.aspx?date=2018-Aug-17").get();

If you paste those URLs into a browser, both work fine. Any suggestions?

user3217883
  • Isn't this due to an issue at the server end? – Hovercraft Full Of Eels Aug 04 '18 at 21:07
  • Probably. Is there nothing I can do about it? Why would it work from a browser but not from code? – user3217883 Aug 04 '18 at 21:09
  • Some sites don't want to support non-browser visitors, so they check for headers like user-agent. I didn't test it, but I would probably start from something like [JSoup UserAgent, how to set it right?](https://stackoverflow.com/q/6581655) (though I would probably use my own browser's header, which can be found at places like http://www.whatsmyua.info/) – Pshemo Aug 04 '18 at 21:10
  • I'm voting to close this question as off-topic because this is a server-side error, and without the source code or at least stack traces from the server it is unanswerable. –  Aug 04 '18 at 21:13
  • Pshemo, my code now looks like this, but I get the same error: `Document doc = Jsoup.connect(address).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0").get();` – user3217883 Aug 04 '18 at 21:15
  • Thank you, Pshemo. This user agent string worked: `Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36` – user3217883 Aug 04 '18 at 21:20
  • Note that if you use a program to access a website that tries to protect itself against such access, you might violate its ToS. That can have consequences for you on that site, such as an account or IP ban. Also note that there are many other simple techniques for telling a real browser apart from a program imitating one. – Zabuzard Aug 04 '18 at 22:13

1 Answer


Thanks to Pshemo's comment on the question, here is the answer:

    // 'date' is a String such as "2018-Aug-17"
    String address = "https://www.nasdaq.com/dividend-stocks/dividend-calendar.aspx?date=" + date;
    // Note: a browser-like user agent is required for this site; without it the
    // request fails with "java.net.SocketException: Connection reset".
    String usrAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36";
    Document doc = Jsoup.connect(address).userAgent(usrAgent).get();
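
The snippet above leaves `date` undefined. As a minimal sketch, assuming it is meant to follow the `2018-Aug-17` form seen in the question's URL, the address could be built from a `LocalDate` as shown here. The class name `DividendCalendarUrl` and the method `buildAddress` are hypothetical names introduced only for illustration, and the code uses the standard library alone, so it makes no network request.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper (not part of the original answer): builds the request
// address and holds the user agent string that worked for the asker.
final class DividendCalendarUrl {
    // Base address taken from the answer above.
    static final String BASE =
            "https://www.nasdaq.com/dividend-stocks/dividend-calendar.aspx?date=";

    // User agent string reported as working in the comments.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36";

    // Assumption: the site expects dates like "2018-Aug-17", so the month
    // abbreviation is forced to English regardless of the default locale.
    private static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MMM-dd", Locale.ENGLISH);

    // Appends the formatted date to the base address.
    static String buildAddress(LocalDate date) {
        return BASE + DATE_FORMAT.format(date);
    }

    public static void main(String[] args) {
        // Prints the address for the date used in the question.
        System.out.println(buildAddress(LocalDate.of(2018, 8, 17)));
    }
}
```

The resulting string can then be passed to `Jsoup.connect(address).userAgent(DividendCalendarUrl.USER_AGENT).get()` exactly as in the answer's snippet.
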
user3217883