My system is Debian 9.

java --version

java 9.0.4
Java(TM) SE Runtime Environment (build 9.0.4+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)

I run my program (which does a lot of crawling in parallel), and after a while I get:

java.lang.StackOverflowError
        at java.base/java.net.SocketInputStream.socketRead0(Native Method)
        at java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.base/java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.base/java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:746)
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.doTunneling(HttpURLConnection.java:2074)
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:854)
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:859)
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:859)
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:859)
        at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
        at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:859)

... the stack trace goes on with lots of repeated frames and no reference to my code.

After that:

Starting coordinated shutdown from JVM shutdown hook

And that is the end.

I am using the latest jsoup, but I think this bug is related to the JDK, and I don't know how to deal with it.

jsoup code in Scala (may or may not be relevant):

    val con = Jsoup.connect(url)

    con.headers(headers.asJava)
    con.userAgent(agent)
    con.followRedirects(true)
    con.validateTLSCertificates(false)
    con.ignoreHttpErrors(true)
    con.maxBodySize(1024 * 1024 * 3)
    con.cookies(lastCookies.asJava)
    con.referrer(referrer)
    _setup.map(_.proxy.toProxy).foreach(con.proxy)
    con.timeout(connectionTimeout.toMillis.toInt)

    val r = con.execute()

    lastCookies = r.cookies().asScala.toMap[String, String]

    val parsed = r.parse()

Any suggestions are welcome.

Michal
  • ..and your code throwing the above exception? – Naman Feb 12 '18 at 11:51
  • .. it might be your code that is causing the problem.. recursion? memory leak? I'm done guessing. – msfoster Feb 12 '18 at 11:53
  • Please provide your code. I assume the problem is there :) – tmucha Feb 12 '18 at 12:13
  • Can you identify if this happens with a specific HTTP site? Also, when you say "lots of repeating", do you mean lots of parseXXX methods? – Alan Bateman Feb 12 '18 at 12:31
  • @nullpointer I can't tell which code, since the stack trace does not contain any reference to my code – Michal Feb 12 '18 at 14:10
  • @msfoster yes, I also have to guess :/ I don't really use recursion, and can a memory leak cause a stack overflow (?) – Michal Feb 12 '18 at 14:11
  • @tmucha I can't give you 10k lines of code, and I don't see from the stack trace where the problem may be – Michal Feb 12 '18 at 14:12
  • @AlanBateman I don't know if there is some specific site, maybe. There are lots of calls like: ``` con = Jsoup.connect(url) ... r = con.execute() ... r.parse() ``` – Michal Feb 12 '18 at 14:13
  • I just found this: https://sourceforge.net/p/htmlunit/bugs/1002/ I also use con.followRedirects(true). I need to figure out whether jsoup can be convinced to follow only a few redirects, and whether this is really the problem here – Michal Feb 12 '18 at 14:18
  • Disabling redirects did not help: after 2 hours of parsing a million web sites it crashed in the same fashion – Michal Feb 12 '18 at 23:25
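
A rough sketch of how the "specific site" question from the comments could be checked: wrap each fetch so that a `StackOverflowError` is caught per URL and the offending URL is recorded. This is plain Java and not the asker's code. The class `GuardedFetch`, the method `findOverflowingUrls`, the `example` URLs and the stand-in fetch are all hypothetical names for illustration; in the real crawler the fetch would be the jsoup call. Note that `StackOverflowError` is an `Error`, not an `Exception`, so a `catch (Exception e)` block would not see it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class GuardedFetch {

    // Runs the given fetch for every URL and returns the URLs for which
    // the fetch died with a StackOverflowError.
    static List<String> findOverflowingUrls(List<String> urls, Consumer<String> fetch) {
        List<String> failed = new ArrayList<>();
        for (String url : urls) {
            try {
                fetch.accept(url);
            } catch (StackOverflowError e) {
                // By the time we get here the stack has unwound, so it is
                // safe to record the URL and carry on with the next one.
                failed.add(url);
            }
        }
        return failed;
    }

    // Stand-in for a fetch that recurses without end (hypothetical).
    static void recurseForever() {
        recurseForever();
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://ok.example", "http://bad.example");
        List<String> failed = findOverflowingUrls(urls, url -> {
            // Assumption for the demo: only the "bad" URL triggers the overflow.
            if (url.contains("bad")) {
                recurseForever();
            }
        });
        System.out.println("URLs that overflowed the stack: " + failed);
    }
}
```

In a parallel crawler the error is thrown on whichever worker thread runs the request, so the guard would have to sit inside the task that each worker executes.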

1 Answer

StackOverflowError means you have exhausted the call stack. "Stack" here is the well-known data structure used in computer programming (in many languages, not just Java).

Each time you call a method, a frame is pushed onto the call stack, and each time a called method returns, its frame is popped off again. The stack has a finite size (albeit usually a very large one), so calls that are nested deeply enough can fill it up entirely.
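
To make that concrete, here is a minimal sketch (the class and method names are made up for illustration) that recurses with no exit condition, counts how many nested calls fit on the stack, and catches the resulting `StackOverflowError`. The depth reached depends on the JVM, the platform and the thread's stack size, so no particular number should be expected.

```java
class StackDepthDemo {

    // Number of nested calls made so far.
    private static int depth = 0;

    // Every call pushes one more frame; there is no exit condition,
    // so the stack eventually fills up.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack overflows and returns the depth reached.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames have been popped again by the time we get here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed after " + measureDepth() + " nested calls");
    }
}
```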

So it looks like (and I am guessing, since you didn't provide your code) you are making lots of nested method calls, none of which ever returns.

This other Stack Overflow question, from almost ten years ago, may help you.

Abra
  • Hi, thx, my code is quite complicated and from the stack trace I can't tell which part is problematic. I can tell that I don't use infinite recursion, so I am not sure what to fix :/ Will see. – Michal Feb 12 '18 at 14:07