I have a set of pages with URLs like this:
www.foo1.bar
www.foo2.bar
www.foo3.bar
.
.
www.foo100.bar
I am using the jsoup library and connecting to all of the pages at the same time, one Thread per page:
    Thread matchThread = new Thread(task);
    matchThread.start();
Each task connects to its page like this and parses the HTML:
    Jsoup.connect("www.fooX.bar").timeout(0).get();
I am getting tons of these exceptions:
java.net.ConnectException: Connection timed out: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:158)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:388)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:523)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:227)
    at sun.net.www.http.HttpClient.New(HttpClient.java:300)
    at sun.net.www.http.HttpClient.New(HttpClient.java:317)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:404)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:391)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:157)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:146)
Does jsoup allow only one thread at a time, or am I doing something wrong? Any suggestions on how to connect to my pages faster? Going through them one by one takes ages.
EDIT:
All 700 threads use the method below, so maybe that is the problem. Can this method handle that many threads, or does it behave like a singleton?
    private static Document connect(String url) {
        Document doc = null;
        try {
            doc = Jsoup.connect(url).timeout(0).get();
        } catch (IOException e) {
            System.out.println(url);
        }
        return doc;
    }
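On the singleton question: a static method that keeps all of its state in local variables, as connect does above, can generally be called from many threads at once, because every call gets its own stack frame. As far as I can tell, Jsoup.connect(url) also builds a fresh Connection object on each call. The sketch below only illustrates the general point and makes no network calls; StaticMethodDemo, describe and runConcurrently are hypothetical names, and describe is a stand-in for the real connect method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StaticMethodDemo {

    // Hypothetical stand-in for connect(url): no shared fields, only locals.
    static String describe(String url) {
        StringBuilder sb = new StringBuilder(); // local, so private to this call
        sb.append("fetched:").append(url);
        return sb.toString();
    }

    // Calls describe() for `count` different URLs from a pool of `threads` threads.
    static List<String> runConcurrently(int count, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 1; i <= count; i++) {
                final String url = "www.foo" + i + ".bar";
                Callable<String> task = () -> describe(url);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until that call has finished
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> results = runConcurrently(700, 16);
        System.out.println(results.size() + " results, last = "
                + results.get(results.size() - 1));
    }
}
```

If every result matches its own URL, the static method itself is not the bottleneck, which would point at the number of simultaneous connections rather than at the method being shared.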
EDIT: the whole worker code:
    public class MatchWorker implements Callable<Match> {
        private Element element;

        public MatchWorker(Element element) {
            this.element = element;
        }

        @Override
        public Match call() throws Exception {
            Match match = null;
            Util.connectAndDoStuff();
            return match;
        }
    }
This is how I process all 700 elements:
    Collection<Match> matches = new ArrayList<Match>();
    Collection<Future<Match>> results = new ArrayList<Future<Match>>();
    for (Element element : elements) {
        MatchWorker matchWorker = new MatchWorker(element);
        FutureTask<Match> task = new FutureTask<Match>(matchWorker);
        results.add(task);
        Thread matchThread = new Thread(task);
        matchThread.start();
    }
    for (Future<Match> match : results) {
        try {
            matches.add(match.get());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
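For comparison, the loop above starts one thread per element, so all 700 connections are attempted at once. A bounded variant might look like the following sketch, which pushes the same kind of work through a fixed-size thread pool so that only a few connections are open at a time. BoundedFetch, Fetcher and fetchAll are hypothetical names, the pool size is an arbitrary guess that would need tuning, and the fetcher is a placeholder for the real jsoup call so the example runs without a network.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedFetch {

    // Hypothetical callback; in real code this is where the jsoup call would go.
    public interface Fetcher<T> {
        T fetch(String url) throws Exception;
    }

    // Runs `fetcher` over every URL using at most `poolSize` threads at a time.
    // Results come back in the same order as `urls`; a failed fetch yields null.
    public static <T> List<T> fetchAll(List<String> urls, int poolSize, Fetcher<T> fetcher)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<T>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> fetcher.fetch(url));
            }
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has completed or failed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add(null); // the fetch threw; keep the slot so order is preserved
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> urls = Arrays.asList("www.foo1.bar", "www.foo2.bar", "www.foo3.bar");
        // Placeholder fetcher: returns the URL length instead of downloading anything.
        List<Integer> lengths = fetchAll(urls, 2, url -> url.length());
        System.out.println(lengths);
    }
}
```

With a pool, the extra tasks wait in the executor's queue instead of each opening a socket immediately, which may be enough to avoid the connect timeouts if they are caused by too many simultaneous connections.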