I have to open simultaneous TCP socket connections to multiple machines every x seconds, in order to fetch a status update packet from each.
I use a Callable class: for each machine, a task submitted to a thread pool connects, sends a query packet, and receives a reply, which is returned through a Future to the main thread that created all the Callable objects.
My socket connection class is:
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.InetAddress;
    import java.net.InetSocketAddress;
    import java.net.Socket;
    import java.net.SocketAddress;
    import java.util.concurrent.Callable;

    public class ClientConnect implements Callable<String> {
        String hostipp, hostnamee;

        ClientConnect(String hostname, String hostip) {
            hostnamee = hostname;
            hostipp = hostip;
        }

        @Override
        public String call() throws Exception {
            return GetData();
        }

        // Connects to the host, sends the query and returns "hostname|value",
        // or "hostname|-1" on any failure.
        private String GetData() {
            Socket so = new Socket();
            try {
                // getByName throws UnknownHostException (an IOException) if resolution fails
                SocketAddress sa = new InetSocketAddress(InetAddress.getByName(hostipp), 2223);
                so.connect(sa, 10000); // connect timeout: 10 seconds
                PrintWriter out = new PrintWriter(so.getOutputStream(), true);
                out.println("\1IDC_UPDATE\1");
                BufferedReader in = new BufferedReader(new InputStreamReader(so.getInputStream()));
                String line = in.readLine();
                if (line == null) {
                    return hostnamee + "|-1"; // peer closed the connection without replying
                }
                String[] response = line.split("\1");
                if (response.length < 3) {
                    return hostnamee + "|-1"; // reply does not contain the expected field
                }
                Integer.parseInt(response[2]); // validate that the field is numeric
                return hostnamee + "|" + response[2];
            } catch (NumberFormatException e) {
                System.out.println("Number format exception");
                return hostnamee + "|-1";
            } catch (IOException e) {
                return hostnamee + "|-1";
            } finally {
                // Closing the socket also closes its input and output streams
                try { so.close(); } catch (IOException ignored) { }
            }
        }
    }
And this is how I create the thread pool in my main class:
    private void StartThreadPool()
    {
        ExecutorService pool = Executors.newFixedThreadPool(30);
        List<Future<String>> list = new ArrayList<Future<String>>();
        try {
            // Submit one task per (hostname, hostip) entry
            for (Map.Entry<String, String> entry : pc_nameip.entrySet())
            {
                Callable<String> worker = new ClientConnect(entry.getKey(), entry.getValue());
                list.add(pool.submit(worker));
            }
            // Collect the results in submission order
            for (Future<String> future : list) {
                try {
                    String threadresult = future.get();
                    //........ PROCESS DATA HERE!..........//
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the interrupt flag and stop waiting
                    return;
                } catch (ExecutionException e) {
                    e.printStackTrace();
                }
            }
        } finally {
            // Release the pool's threads; without this, every call leaves idle threads behind
            pool.shutdown();
        }
    }
The pc_nameip map contains (hostname, hostip) entries, and for every entry I create a ClientConnect Callable object.
My problem is that when my list of machines contains, say, 10 PCs (most of which are not alive), I get a lot of timeout exceptions on the PCs that are alive, even though my timeout limit is set to 10 seconds.
If I force the list to contain a single working PC, I have no problem. The timeouts are pretty random, and I have no clue what is causing them.
All machines are on a local network. The remote servers were also written by me (in C/C++) and have been working in another setup for more than 2 years without any problems.
Am I missing something, or could it be an OS network restriction? I am testing this code on Windows XP SP3. Thanks in advance!
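For reference, one way to keep a single slow or dead host from stalling a whole polling round is to give the round an overall deadline. The following is only a minimal sketch, not a diagnosis of the timeouts described above. The class `BoundedRound`, the method `runRound`, and the fallback string are hypothetical names introduced for illustration. It relies on `ExecutorService.invokeAll` with a timeout, which cancels every task that has not finished when the deadline passes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original code.
final class BoundedRound {
    private BoundedRound() { }

    // Runs all tasks on the given pool and waits at most timeoutMs for the
    // whole round. Results come back in the same order as the tasks; a task
    // that failed or was cancelled at the deadline yields the fallback value.
    static List<String> runRound(ExecutorService pool,
                                 List<Callable<String>> tasks,
                                 long timeoutMs,
                                 String fallback) {
        List<String> results = new ArrayList<String>();
        try {
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS);
            for (Future<String> future : futures) {
                try {
                    results.add(future.get()); // every future is already done or cancelled
                } catch (CancellationException | ExecutionException e) {
                    results.add(fallback);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return results;
    }
}
```

Note that cancelling a task only interrupts its worker thread. A blocking call on a plain `java.net.Socket` is not woken by an interrupt, so the worker may stay busy until the socket's own timeout expires; the deadline bounds how long the caller waits, not how long the worker runs.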
UPDATE:
After creating two new server machines and keeping one that was getting a lot of timeouts, I have the following results:
For 100 thread runs over 20 minutes:
NEW_SERVER1 : 99 successful connections / 1 timeout
NEW_SERVER2 : 94 successful connections / 6 timeouts
OLD_SERVER : 57 successful connections / 43 timeouts
Other info:
- I experienced a JRE crash (EXCEPTION_ACCESS_VIOLATION (0xc0000005)) once and had to restart the application.
- I noticed that while the app was running, my network connection was struggling as I browsed the internet. I have no idea if this is expected, but I think having at most 15 threads is not that many.
So, first of all, my old servers had some kind of problem. I have no idea what it was, since my new servers were created from the same OS image.
Secondly, although the timeout percentage has dropped dramatically, I still think it is unusual to get even one timeout in a small LAN like ours. But this could be a problem in the server application.
Finally, my view is that, apart from the old server's problem (I still cannot believe I lost so much time on that!), there must be either a server app bug or a JDK-related bug (since I experienced that JRE crash).
P.S. I use Eclipse as my IDE and my JRE is the latest.
If any of the above rings a bell, please comment. Thank you.
-----EDIT-----
Could it be that PrintWriter and/or BufferedReader are not actually thread-safe?
----NEW EDIT 09 Sep 2013----
After re-reading all the comments, and thanks to @Gray and his comment:
When you run multiple servers does the first couple work and the rest of them timeout? Might be interesting to put a small sleep in your fork loop (like 10 or 100ms) to see if it works that way.
I rearranged the tree list of hosts/IPs and got some really strange results. It seems that if an alive host is placed at the top of the tree list, and is therefore the first to start a socket connection, it has no problem connecting and receiving packets without any delay or timeout.
On the contrary, if an alive host is placed at the bottom of the list, with several dead hosts before it, it takes too long to connect, and with my previous timeout of 10 seconds it failed to connect. But after changing the timeout to 60 seconds (thanks to @EJP) I realised that no timeouts occur!
It just takes too long to connect (more than 20 seconds on some occasions). Something is blocking new socket connections, and it isn't that the hosts or the network are too busy to respond.
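To put numbers on "takes too long to connect", the connect step can be timed on its own, separately from sending the query and reading the reply. This is a minimal sketch under stated assumptions: the class `ConnectTimer` and the method `timeConnect` are hypothetical names, and the measured time also includes name resolution if a host name rather than an IP address is passed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of the original code.
final class ConnectTimer {
    private ConnectTimer() { }

    // Returns how many milliseconds the TCP connect took, or -1 if the
    // connection could not be established within timeoutMs.
    static long timeConnect(String host, int port, int timeoutMs) {
        long start = System.nanoTime();
        try (Socket so = new Socket()) {
            so.connect(new InetSocketAddress(host, port), timeoutMs);
            return (System.nanoTime() - start) / 1_000_000L;
        } catch (IOException e) {
            // Covers refused connections, unresolved hosts and connect timeouts
            return -1L;
        }
    }
}
```

Logging this value per host, together with the host's position in the list, would show whether the delay grows with the number of dead hosts submitted before it.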
I have some debug data here, if you would like to take a look : http://pastebin.com/2m8jDwKL
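The experiment suggested in @Gray's comment, a small sleep between task submissions, could be sketched as follows. The class `StaggeredSubmit`, the method `submitAll`, and the delay parameter are hypothetical names introduced for illustration; the 10 to 100 ms range comes from the comment itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical helper, not part of the original code.
final class StaggeredSubmit {
    private StaggeredSubmit() { }

    // Submits the tasks one by one, pausing delayMs after each submission so
    // that the connection attempts do not all start at the same instant.
    // If the calling thread is interrupted, the remaining tasks are not submitted.
    static <T> List<Future<T>> submitAll(ExecutorService pool,
                                         List<Callable<T>> tasks,
                                         long delayMs) {
        List<Future<T>> futures = new ArrayList<Future<T>>();
        for (Callable<T> task : tasks) {
            futures.add(pool.submit(task));
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                break;
            }
        }
        return futures;
    }
}
```

The returned futures are in submission order, so they can be consumed with the same `future.get()` loop as before.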