
Disclosure: the code I'm working on is for university coursework.

Background: The task I'm trying to complete is to report on the effect of different threading techniques. To do this I have written several classes which respond to a request from a client using Java Sockets. The idea is to flood the server with requests and report on how different threading strategies cope with this. Each client will make 100 requests, and in each iteration we're increasing the number of clients by 50 until something breaks.
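
For context, the load driver follows roughly this shape (a simplified sketch, not the actual coursework code; RequestingClient is a placeholder Runnable that makes the 100 requests):

public class LoadDriver {
    public static void main(String[] args) throws InterruptedException {
        // Each wave adds 50 more clients; every client makes 100 requests.
        for (int clients = 50; ; clients += 50) {
            Thread[] threads = new Thread[clients];
            for (int i = 0; i < clients; i++) {
                threads[i] = new Thread(new RequestingClient());
                threads[i].start();
            }
            for (Thread t : threads) {
                t.join(); // wait for the whole wave before ramping up further
            }
            System.out.println("Survived " + clients + " clients");
        }
    }
}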

Problem: repeatably and consistently, the following exception occurs:

Caused by: java.net.NoRouteToHostException: Cannot assign requested address
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)

This happens in several scenarios, including when both the client and server are running on localhost. Connections are made successfully for a while; the exception is thrown soon after trying to connect around 150 clients.

My first thought was that it could be Linux's limit on open file descriptors (1024), but I don't think that's the cause. I also checked that any and all socket connections are closed properly (i.e. within a correct finally block).
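
The per-request code follows roughly this shape (a simplified sketch; the host, port and payload here are placeholders, not my actual code):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class SingleRequest {
    static void makeRequest(String host, int port) {
        // Connect, exchange one request/response, and always close the
        // socket in a finally block so the descriptor is released even
        // if an exception is thrown mid-request.
        Socket socket = null;
        try {
            socket = new Socket(host, port);
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream()));
            out.println("request");        // placeholder payload
            String reply = in.readLine();  // placeholder response handling
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (socket != null) {
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do if close() itself fails
                }
            }
        }
    }
}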

I'm hesitant to post the code because I'm not sure which parts would be the most relevant, and don't want to have a huge listing of code in the question.

Has anyone come across this before? How can I avoid the NoRouteToHostException?


EDIT (further questions are italicised)

Some good answers so far point to either The Ephemeral Port Range or RFC 2780, both of which suggest that I have too many connections open. In both cases, the number of connections needed to reach the limit suggests that at some point I'm not closing connections.

Having debugged both client and server, I have observed both reach the close() call on their java.net.Socket instance. This suggests that connections are being closed (at least in the non-exceptional case). Is this a correct assumption?

Also, is there an OS-level wait required before ports become available again? It would be possible to run the program separately for each batch of 50+ clients if it only required a short pause (or, optimistically, running a command) before the next attempt.


EDIT v2.0

Having taken the good answers provided, I modified my code to call setReuseAddress(true) on every Socket connection made by the client. This did not have the desired effect, and I am still limited to 250-300 clients. After the program terminates, running netstat -a shows that there are a lot of socket connections in the TIME_WAIT state.

My assumption was that if a socket was in the TIME_WAIT state and had been set with the SO_REUSEADDR option, any new socket attempting to use that port would be able to; however, I am still receiving the NoRouteToHostException.

Is this correct? Is there anything else which can be done to solve this problem?

Grundlefleck
  • @Grundlefleck If you are not doing it already, try calling the default Socket() ctor so you get back an unconnected socket. Then setReuseAddress(true). Then connect(). You want to tell the stack to reuse the address before it tries to bind. – Duck Oct 16 '09 at 18:42
  • I was previously calling the constructor which connects (Socket(host,port)) and when I saw your comment a light bulb went on, and I slapped myself for being an idiot... but it didn't work, and the problem still remains :-( – Grundlefleck Oct 19 '09 at 20:42

6 Answers


Have you tried setting:

echo "1" >/proc/sys/net/ipv4/tcp_tw_reuse

and/or

echo "1" >/proc/sys/net/ipv4/tcp_tw_recycle

These settings may make Linux re-use the TIME_WAIT sockets. Unfortunately I can't find any definitive documentation.

atomice
  • Sorry, I have to say it: I LOVE YOU! – tsykora Nov 01 '13 at 11:16
  • @atomice Dude, thanks a lot. I have been facing this problem for a week. You finally saved me. – Jemish Patel Aug 11 '14 at 14:32
  • This solved a nasty issue that was bugging us for months. – Andy D Jul 31 '15 at 13:53
  • 2
    Obligatory warning that enabling these will cause problems if some of the connections go over load balancers. See https://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html In particular, _When the remote host is in fact a NAT device, the condition on timestamps will forbid allof the hosts except one behind the NAT device to connect during one minute because they do not share the same timestamp clock. In doubt, this is far better to disable this option since it leads to difficult to detect and difficult to diagnose problems._ – dskrvk Aug 02 '16 at 17:57
  • Does this change require a restart? – Maralc Feb 12 '18 at 11:15
  • You are a genius. How did you find this solution? It felt like magic. I was running 0.1 million concurrent users and getting this error, but after this change it works fine. – dpk Nov 07 '18 at 11:51
  • You saved the day!! – Rishi Sep 03 '19 at 11:49

This may help:

The Ephemeral Port Range

Another important ramification of the ephemeral port range is that it limits the maximum number of connections from one machine to a specific service on a remote machine! The TCP/IP protocol uses the connection's 4-tuple to distinguish between connections, so if the ephemeral port range is only 4000 ports wide, that means that there can only be 4000 unique connections from a client machine to a remote service at one time.

So maybe you run out of available ports. To get the number of available ports, see

$ cat /proc/sys/net/ipv4/ip_local_port_range 
32768   61000

The output is from my Ubuntu system, where I'd have 28,232 ports for client connections. Hence, your test would fail as soon as you have 280+ clients.

sfussenegger

Cannot assign requested address is the error string for the EADDRNOTAVAIL error.

I suspect you are running out of source ports. There are 16,384 ports in the dynamic range available for use as a source port (see RFC 2780). 150 clients * 100 connections = 15,000 ports, so you are probably hitting this limit.

atomice

If you're running out of source ports but aren't actually maintaining that many open connections, set the SO_REUSEADDR socket option. This will enable you to reuse local ports that are still in the TIME_WAIT state.
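
In Java that means setting the option on an unconnected socket, before connect() triggers the implicit bind. A minimal sketch (host and port are placeholders):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ReuseAddrClient {
    static Socket open(String host, int port) throws IOException {
        Socket socket = new Socket();   // no-arg constructor: not yet connected
        socket.setReuseAddress(true);   // request SO_REUSEADDR before binding
        socket.connect(new InetSocketAddress(host, port));
        return socket;
    }
}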

David Joyner
  • Apologies for keeping it at a Java level. When I instantiate the socket, I also call mySocket.setReuseAddress(true); which does not solve the problem. Do ports have to be bound in order to use this? – Grundlefleck Oct 15 '09 at 14:34
  • 1
    No. If you don't explicity bind() the stack will do it for you during connect(). – David Joyner Oct 15 '09 at 14:50

If you are closing 500 connections per second you will run out of sockets. If you are connecting to the same locations (web servers) that support keepalive, you can implement connection pools so you don't close and reopen sockets.

This will save CPU too.
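
A very small client-side pool can look roughly like this sketch (host, port and pool size are placeholders; it assumes the server keeps connections open):

import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SocketPool {
    private final BlockingQueue<Socket> idle;

    // Open the connections once up front and hand them out on demand.
    public SocketPool(String host, int port, int size) throws IOException {
        idle = new ArrayBlockingQueue<Socket>(size);
        for (int i = 0; i < size; i++) {
            idle.add(new Socket(host, port));
        }
    }

    public Socket borrow() throws InterruptedException {
        return idle.take();   // blocks until a connection is free
    }

    public void giveBack(Socket socket) {
        idle.offer(socket);   // return it instead of closing
    }
}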

Use of tcp_tw_recycle and tcp_tw_reuse can result in packets arriving from a previous connection, which is why there is a one-minute wait for the packets to clear.

user190941

For any other Java users that stumble across this question, I would recommend using connection pooling so connections are reused properly.

Bryan Larson
  • c3p0 connection pooling is for JDBC connection pooling, which is not the type of connections that the question is about. – Bernie Dec 21 '15 at 00:15