
From the man page:

SO_REUSEADDR Specifies that the rules used in validating addresses supplied to bind() should allow reuse of local addresses, if this is supported by the protocol. This option takes an int value. This is a Boolean option.

When should I use it? What does "reuse of local addresses" give?

Nawaz
Ray Templeton

3 Answers


TCP's primary design goal is to allow reliable data communication in the face of packet loss, packet reordering, and — key, here — packet duplication.

It's fairly obvious how a TCP/IP network stack deals with all this while the connection is up, but there's an edge case that happens just after the connection closes. What happens if a packet sent right at the end of the conversation is duplicated and delayed, such that the 4-way shutdown packets get to the receiver before the delayed packet? The stack dutifully closes down its connection. Then later, the delayed duplicate packet shows up. What should the stack do?

More importantly, what should it do if a program with open sockets on a given IP address + TCP port combo closes its sockets, and then a brief time later, a program comes along and wants to listen on that same IP address and TCP port number? (Typical case: A program is killed and is quickly restarted.)

There are a couple of choices:

  1. Disallow reuse of that IP/port combo for at least 2 times the maximum time a packet could be in flight. In TCP, this is usually called the 2×MSL delay. You sometimes also see 2×RTT, which is roughly equivalent.

    This is the default behavior of all common TCP/IP stacks. 2×MSL is typically between 30 and 120 seconds, and it shows up in netstat output as the TIME_WAIT period. After that time, the stack assumes that any rogue packets have been dropped en route due to expired TTLs, so that socket leaves the TIME_WAIT state, allowing that IP/port combo to be reused.

  2. Allow the new program to re-bind to that IP/port combo. In stacks with BSD sockets interfaces — essentially all Unixes and Unix-like systems, plus Windows via Winsock — you have to ask for this behavior by setting the SO_REUSEADDR option via setsockopt() before you call bind().
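As a minimal sketch of option 2, here is how the call order looks through Python's `socket` module, which mirrors the BSD sockets calls named above. The helper name `make_listener` and its defaults are made up for this illustration; the only point is that `setsockopt()` runs before `bind()`.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=5):
    """Create a listening TCP socket with SO_REUSEADDR set before bind().

    make_listener is a hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The option takes an int used as a Boolean. It has to be set
        # before bind(), because bind() is where the address is validated.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))  # port 0 lets the OS pick a free port
        sock.listen(backlog)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    listener = make_listener()
    print("listening on", listener.getsockname())
    listener.close()
```

Setting the option after `bind()` would be too late to influence the address check for that socket.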

SO_REUSEADDR is most commonly set in network server programs, since a common usage pattern is to make a configuration change, then be required to restart that program to make the change take effect. Without SO_REUSEADDR, the bind() call in the restarted program's new instance will fail if there were connections open to the previous instance when you killed it. Those connections will hold the TCP port in the TIME_WAIT state for 30-120 seconds, so you fall into case 1 above.
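The restart scenario can be sketched on the loopback interface, assuming a Unix-like stack. The function names below are invented for the example. The "old" server closes an accepted connection first, which leaves the server side of that connection in the closing states (ending in TIME_WAIT), and then a "new" server binds the same port straight away. Both instances set SO_REUSEADDR, as a restarted server normally would. Per the text above, the second `bind()` would be expected to fail without the option, but this sketch does not try to demonstrate that.

```python
import socket


def open_listener(port=0):
    """Hypothetical helper: listener on 127.0.0.1 with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def restart_demo():
    """Bind a port, serve one connection, then bind the same port again.

    Returns (first_port, second_port); they are equal if the rebind worked.
    """
    first = open_listener()
    port = first.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    conn, _peer = first.accept()

    # The server side closes first, so it is the server's end of the
    # connection that lingers in the stack after the program lets go of it.
    conn.close()
    client.close()
    first.close()  # the "old instance" is now gone

    # The "new instance" asks for the same port immediately.
    second = open_listener(port)
    rebound_port = second.getsockname()[1]
    second.close()
    return port, rebound_port


if __name__ == "__main__":
    print(restart_demo())
```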

The risk in setting SO_REUSEADDR is that it creates an ambiguity: the metadata in a TCP packet's headers isn't unique enough for the stack to reliably tell whether a packet is stale, and so should be dropped because it was intended for a now-dead listener, or whether it should be delivered to the new listener's socket.

If you don't see that that is true, here's all the listening machine's TCP/IP stack has to work with per-connection to make that decision:

  1. Local IP: Not unique per-conn. In fact, our problem definition here says we're reusing the local IP, on purpose.

  2. Local TCP port: Ditto.

  3. Remote IP: The machine causing the ambiguity could re-connect, so that doesn't help disambiguate the packet's proper destination.

  4. Remote port: In well-behaved network stacks, the remote port of an outgoing connection isn't reused quickly, but it's only 16 bits, so you've got 30-120 seconds to force the stack to get through a few tens of thousands of choices and reuse the port. Computers could do work that fast back in the 1960s.

    If your answer to that is that the remote stack should do something like TIME_WAIT on its side to disallow ephemeral TCP port reuse, that solution assumes that the remote host is benign. A malicious actor is free to reuse that remote port.

    I suppose the listener's stack could choose to strictly disallow connections from the TCP 4-tuple only, so that during the TIME_WAIT state a given remote host is prevented from reconnecting with the same remote ephemeral port, but I'm not aware of any TCP stack with that particular refinement.

  5. Local and remote TCP sequence numbers: These are also not sufficiently unique that a new remote program couldn't come up with the same values.
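To make the list above concrete, the following sketch reads the address portion of that metadata (items 1 to 4) from a live loopback connection. The `FourTuple` type and both function names are made up for the example, and the sequence numbers from item 5 are not included because the portable sockets API does not expose them.

```python
import socket
from typing import NamedTuple


class FourTuple(NamedTuple):
    """Hypothetical record of the addressing fields that identify a connection."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


def four_tuple(conn):
    """Read the local and remote IP/port of a connected TCP socket."""
    local_ip, local_port = conn.getsockname()[:2]
    remote_ip, remote_port = conn.getpeername()[:2]
    return FourTuple(local_ip, local_port, remote_ip, remote_port)


def demo():
    """Return the 4-tuple as seen from the server side and the client side."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, _peer = listener.accept()
    try:
        return four_tuple(conn), four_tuple(client)
    finally:
        conn.close()
        client.close()
        listener.close()


if __name__ == "__main__":
    server_side, client_side = demo()
    print("server sees:", server_side)
    print("client sees:", client_side)
```

The two views are mirror images of each other. A later connection that happens to reuse the same remote port would produce an identical `FourTuple` on the server side, which is the ambiguity described above.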

If we were re-designing TCP today, I think we'd integrate TLS or something like it as a non-optional feature, one effect of which is to make this sort of inadvertent and malicious connection hijacking impossible. That requires adding large fields (128 bits and up), which wasn't at all practical back in 1981, when the document for the current version of TCP (RFC 793) was published.

Without such hardening, the ambiguity created by allowing re-binding during TIME_WAIT means that either a) stale data intended for the old listener can be misdelivered to a socket belonging to the new listener, thereby either breaking the listener's protocol or incorrectly injecting stale data into the connection; or b) new data for the new listener's socket can be mistakenly assigned to the old listener's socket and thus inadvertently dropped.

The safe thing to do is wait out the TIME_WAIT period.

Ultimately, it comes down to a choice of costs: wait out the TIME_WAIT period or take on the risk of unwanted data loss or inadvertent data injection.

Many server programs take this risk, deciding that it's better to get the server back up immediately so as to not miss any more incoming connections than necessary.

This is not a universal choice. Many programs — even server programs requiring a restart to apply a settings change — choose instead to leave SO_REUSEADDR alone. The programmer may know these risks and is choosing to leave the default alone, or they may be ignorant of the issues but are getting the benefit of a wise default.

Some network programs offer the user a choice among the configuration options, fobbing the responsibility off on the end user or sysadmin.

Warren Young
  • Great and very helpful answer. BTW: note that the following quote from [man 7 ip](http://linux.die.net/man/7/ip) isn't very helpful without the above explanation: *A TCP local socket address that has been bound is unavailable for some time after closing, unless the `SO_REUSEADDR` flag has been set. **Care should be taken when using this flag as it makes TCP less reliable.*** – patryk.beza Aug 06 '16 at 08:52
  • **Important correction**: *SO_REUSEADDR* on Windows/WinSock does not do what most people think it does – actually it does something quite horrible. Quoting [MSDN](https://learn.microsoft.com/de-de/windows/win32/winsock/using-so-reuseaddr-and-so-exclusiveaddruse#using-so_reuseaddr): *The SO_REUSEADDR socket option allows a socket to forcibly bind to a port in use by another socket. […]* – ntninja Oct 17 '19 at 19:58
  • *Once the second socket has successfully bound, the behaviour for all sockets bound to that port is indeterminate. For example, if all of the sockets on the same port provide TCP service, any incoming TCP connection requests over the port cannot be guaranteed to be handled by the correct socket; the behaviour is **non-deterministic**.* The MSDN page then goes further to explain that you should only ever use this on Multicast sockets (if at all) and that all other applications should use *SO_EXCLUSIVEADDRUSE* instead to reliably prevent malicious use of *SO_REUSEADDR* by other applications. – ntninja Oct 17 '19 at 20:03
  • To get the Unix *SO_REUSEADDR* behaviour on Windows call `setsockopt(sock, SOL_SOCKET, SO_DONTLINGER, &"\x00\x00\x00\x00", 4);` on the socket instead. – ntninja Oct 17 '19 at 20:06
  • For **UDP** sockets, *SO_REUSEADDR* is used instead for multicast. Basically, multiple sockets can bind to the same port, and they all receive the incoming datagrams. – leoll2 Oct 09 '20 at 10:27
  • Your `TIME_WAIT` refinement is a brilliant idea. – armani Mar 07 '21 at 11:22

SO_REUSEADDR allows your server to bind to an address which is in a TIME_WAIT state.

This socket option tells the kernel that even if this port is busy (in the TIME_WAIT state), go ahead and reuse it anyway. If it is busy, but with another state, you will still get an address already in use error. It is useful if your server has been shut down, and then restarted right away while sockets are still active on its port.
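The "busy, but with another state" case can be sketched as follows, assuming a Unix-like stack (a comment elsewhere on this page notes that Windows behaves differently). The helper names are invented for the example. Both sockets set SO_REUSEADDR, but because the first one is actively listening on the port, the second `bind()` to the same address and port is still expected to be refused.

```python
import errno
import socket


def bind_with_reuse(port):
    """Hypothetical helper: bind 127.0.0.1:port with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(("127.0.0.1", port))
    except OSError:
        sock.close()
        raise
    return sock


def second_bind_errno():
    """Return the errno from binding a port that already has a live listener.

    Returns None if the second bind unexpectedly succeeds.
    """
    first = bind_with_reuse(0)
    first.listen(1)  # the port is busy in the LISTEN state, not TIME_WAIT
    port = first.getsockname()[1]
    try:
        second = bind_with_reuse(port)
    except OSError as exc:
        return exc.errno
    else:
        second.close()
        return None
    finally:
        first.close()


if __name__ == "__main__":
    err = second_bind_errno()
    print("EADDRINUSE" if err == errno.EADDRINUSE else err)
```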

From unixguide.net

William Briand
  • Thanks a lot, I found this also but still didn't understand... – Ray Templeton Jul 12 '10 at 15:55
  • Let's say you open a TCP connection. After transmitting data, you close the socket. But in fact, it will be set in TIME_WAIT state (TIME_WAIT == "it's possible that some data have not been delivered yet, or something, so we wait as a cautious TCP implementation :) ") for a while. You just can't open another connection to the same IP/port, except by using REUSEADDR. – William Briand Jul 12 '10 at 16:02
  • The penny dropped. Thank you very much! – Ray Templeton Jul 12 '10 at 16:05
  • The problem is that it is a TCP **connection** that is in a `TIME_WAIT` state. A connection is identified by two addresses: the remote port/ip and local port/ip. Note that when a TCP server isn't restarted (uses the same socket all the time) client connections are coming and going and cycling through `TIME_WAIT` states all the time. New clients are not blocked from connecting because previous clients are in `TIME_WAIT`. So why would that be the case just because the server was restarted and re-created the socket? The documented explanation holds no water. – Kaz Apr 30 '19 at 19:53
  • @Kaz It is a TCP *port* that is in a TIME_WAIT state, as shown by the fact that it only occurs at one end. See the state diagram in RFC 793. The other end can reuse its port immediately, and (as you say yourself) create another connection to the same target, if it's the client, even re-using the same source port. – user207421 May 02 '19 at 03:19

When you create a socket, you don't really own it. The OS (TCP stack) creates it for you and gives you a handle (file descriptor) to access it. When your socket is closed, it takes time for the OS to "fully close it" while it goes through several states. As EJP mentioned in the comments, the longest delay is usually from the TIME_WAIT state. This extra delay is required to handle edge cases at the very end of the termination sequence and make sure the last termination acknowledgement either got through or had the other side reset itself because of a timeout. Here you can find some extra considerations about this state. The main considerations are as follows:

Remember that TCP guarantees all data transmitted will be delivered, if at all possible. When you close a socket, the server goes into a TIME_WAIT state, just to be really really sure that all the data has gone through. When a socket is closed, both sides agree by sending messages to each other that they will send no more data. This, it seemed to me was good enough, and after the handshaking is done, the socket should be closed. The problem is two-fold. First, there is no way to be sure that the last ack was communicated successfully. Second, there may be "wandering duplicates" left on the net that must be dealt with if they are delivered.

If you try to create multiple sockets with the same ip:port pair really quick, you get the "address already in use" error because the earlier socket will not have been fully released. Using SO_REUSEADDR will get rid of this error as it will override checks for any previous instance.

Eric
  • I think this phrase was most important and no one mentioned it before - "If you try to create multiple sockets with the same ip:port pair really quick, you get the "address already in use" error." Worth an up vote :) – ultimate cause Jan 17 '13 at 21:21
  • It doesn't 'take time for the OS to "notice" it'. There is a *defined TCP state* called TIME_WAIT in which the specification of the TCP protocol *requires* the operating system to maintain the port prior to finally releasing it. This state post-dates the closure of the associated socket. – user207421 Dec 09 '15 at 00:11
  • @user207421 _This state post-dates the closure of the associated socket_ - is this true with `SO_LINGER` enabled as well, i.e. does lingering in the foreground (blocking `close()`) not include `TIME_WAIT`? I know an `SO_LINGER` of zero will avoid `TIME_WAIT` altogether (since it avoids termination sequence entirely and sends `RST`). – haelix Oct 08 '18 at 00:08
  • Using SO_REUSEADDR once bit me hard on iOS - the server socket was seemingly open but all clients (localhost) failed to connect to it with ECONNRESET. It turned out that if the app on iOS is shut down unexpectedly, the server socket can linger for long time and be still alive, and with SO_REUSEADDR your new server sockets will be in some kind of a "ghost" state and drop client connections. When I disabled SO_REUSEADDR, I immediately noticed the issue because listen() failed, and I could work it around by using a loop with listen() until the ghost socket was gone and the new one was created. – JustAMartin Jan 16 '19 at 09:44
  • @haelix It makes no difference. Linger time does not include TIME_WAIT time. Linger time precedes the actual close by the OS, and TIME_WAIT time follows it. The only way you can make a difference is by setting SO_LINGER to reset the connection, which bypasses all the post-close states. – user207421 Mar 04 '19 at 04:39
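
The `SO_LINGER` setting discussed in the last few comments (linger enabled with a zero timeout, so that `close()` resets the connection) can be sketched like this. The function names are invented for the example, and the packing assumes the Unix-like `struct linger` layout of two C ints; Winsock uses a different layout, so treat this as an illustration rather than portable code.

```python
import socket
import struct

# Assumed layout: struct linger { int l_onoff; int l_linger; } (Unix-like).
_LINGER_FORMAT = "ii"


def enable_abortive_close(sock):
    """Turn lingering on with a zero timeout, so close() resets the connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(_LINGER_FORMAT, 1, 0))


def read_linger(sock):
    """Return (enabled, timeout_seconds) as currently set on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(_LINGER_FORMAT))
    onoff, seconds = struct.unpack(_LINGER_FORMAT, raw)
    return bool(onoff), seconds


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_abortive_close(s)
    print(read_linger(s))
    s.close()
```

As the comments point out, this trades away the normal shutdown sequence entirely, so it is a different tool from SO_REUSEADDR rather than a substitute for it.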