This is my first post here, so don't hesitate to tell me if I'm doing something wrong.
I wrote a few lines of code in Java 11 to get information from a webstore with JSOUP (1.14.2). Since the webstore has multiple pages of data, I'm using a loop to get all the URLs I want.
Here's a simplified example of what I'm doing:
    Document doc = null;
    for (int i = 1; i < 36; i++) {
        String url = "https://www.play-in.com/rachat/hotlist/magic?p=" + i;
        try {
            doc = Jsoup.connect(url).get();
        } catch (Exception e) {
            logger.info("Failed to get data from page " + i + " : " + e);
        }
        // here I'm parsing the HTML to return an array of objects
    }
When I run the program, I get:
[main] INFO service.MagicBazarReader - Failed to get data from page 2 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[main] INFO service.MagicBazarReader - Failed to get data from page 3 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[main] INFO service.MagicBazarReader - Failed to get data from page 4 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[main] INFO service.MagicBazarReader - Failed to get data from page 5 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[main] INFO service.MagicBazarReader - Failed to get data from page 6 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[...]
[main] INFO ISmellProfits - Number of result after HTML parsing : 24
And so on.
So the first get() is always a success and I can manipulate the result, but then there seems to be an issue when calling Jsoup.connect() multiple times.
Since I'm calling an HTTPS URL, my first thought was a certificate issue, and I tried the solution from "How to connect via HTTPS using Jsoup?", but it didn't help. And if a certificate was really needed, I shouldn't be able to access the URL the first time, but I might be wrong here since I don't know much about this domain.
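To look further into the TLS side, one thing I could check is which protocol versions and cipher suites my JVM enables by default, since "Received fatal alert" means the alert was sent by the server during the handshake. This is only a minimal sketch using the standard javax.net.ssl API; the class name TlsInfoSketch and its helper methods are made up for illustration, and it only shows the JVM defaults, not what the site actually accepts.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

// Hypothetical helper class, not part of JSOUP or of my project.
class TlsInfoSketch {

    // Default TLS parameters the JVM uses for outgoing connections.
    static SSLParameters defaults() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    // Protocol versions enabled by default (for example TLSv1.2).
    static List<String> enabledProtocols() {
        return Arrays.asList(defaults().getProtocols());
    }

    // Cipher suites enabled by default.
    static List<String> enabledCipherSuites() {
        return Arrays.asList(defaults().getCipherSuites());
    }

    public static void main(String[] args) {
        System.out.println("Protocols: " + enabledProtocols());
        enabledCipherSuites().forEach(suite -> System.out.println("Cipher suite: " + suite));
    }
}
```

Running the scraper with the JVM option -Djavax.net.debug=ssl:handshake should also print the details of each handshake, which might show where the successful and the failing requests differ.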
My second thought was to use a parallel stream:
    List<String> links = new ArrayList<>();
    for (int i = 1; i < 36; i++) {
        String url = "https://www.play-in.com/rachat/hotlist/magic?p=" + i;
        links.add(url);
    }
    links.parallelStream().forEach(link -> {
        try {
            Document doc = Jsoup.connect(link).get();
            // here I'm parsing the HTML to return an array of objects
        } catch (Exception e) {
            // the page number is everything after the last '=' in the URL
            logger.info("Failed to get data from page " + link.substring(link.lastIndexOf('=') + 1) + " : " + e);
        }
    });
I get better results, but it's still not perfect:
[ForkJoinPool.commonPool-worker-17] INFO service.MagicBazarReader - Failed to get data from page 12 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[ForkJoinPool.commonPool-worker-23] INFO service.MagicBazarReader - Failed to get data from page 30 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[ForkJoinPool.commonPool-worker-9] INFO service.MagicBazarReader - Failed to get data from page 5 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[ForkJoinPool.commonPool-worker-27] INFO service.MagicBazarReader - Failed to get data from page 35 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[main] INFO service.MagicBazarReader - Failed to get data from page 22 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[ForkJoinPool.commonPool-worker-3] INFO service.MagicBazarReader - Failed to get data from page 1 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[ForkJoinPool.commonPool-worker-7] INFO service.MagicBazarReader - Failed to get data from page 2 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[ForkJoinPool.commonPool-worker-21] INFO service.MagicBazarReader - Failed to get data from page 17 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[ForkJoinPool.commonPool-worker-5] INFO service.MagicBazarReader - Failed to get data from page 31 : javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
[...]
[main] INFO ISmellProfits - Number of result after HTML parsing : 286
So I'm getting a lot more results after HTML parsing, but they are not consistent: I get a different number on every run, and I'm still getting SSLHandshakeException.
I'm running out of ideas, so I'm asking if someone knows what is causing the exception to be thrown.
I'm new to JSOUP, so I still don't know it well.
I think it could be that JSOUP can only have one connection open at a time, and the loop is opening a new one before the first one is closed.
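One way I could test this idea is to fetch the pages one at a time and retry a failed page after a short pause. If the failures disappear, timing is probably involved; if they don't, the cause is likely elsewhere. This is only a sketch under that assumption: RetrySketch and withRetry are names I made up, and the Callable stands in for the real Jsoup.connect(url).get() call so the example runs without any network access.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper class, not part of JSOUP.
class RetrySketch {

    // Calls the action up to maxAttempts times, sleeping pauseMillis after each failure.
    // Throws IllegalStateException (wrapping the last error) if no attempt succeeds.
    static <T> T withRetry(Callable<T> action, int maxAttempts, long pauseMillis) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag and stop retrying
                    break;
                }
            }
        }
        throw new IllegalStateException("No attempt succeeded", last);
    }

    public static void main(String[] args) {
        // Stand-in for Jsoup.connect(url).get(): fails twice, then succeeds.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated handshake_failure");
            }
            return "page content";
        }, 5, 100);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In my loop, the call would presumably become something like withRetry(() -> Jsoup.connect(url).get(), 3, 500), where the attempt count and the pause are arbitrary values I would have to tune.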
Thanks for reading.