5

I have an asynchronous application with several threads performing operations over sockets, where operations are scheduled first and then executed asynchronously.

I'm trying to avoid the situation where, after a read operation on a socket has been scheduled, the socket gets closed and reopened (possibly by another peer in another operation) before the scheduled operation starts executing. The read would then use the right file descriptor number but talk to the wrong peer.

The problem arises because the sequence (accept(); close(); accept()) returns the same fd from both accept() calls, which can lead to the situation above.

I can't see a way of avoiding it.

Any hints?

Arkaitz Jimenez
  • 22,500
  • 11
  • 75
  • 105
  • Why is the socket that belongs to one thread being closed by another thread? – Omry Yadan Oct 01 '09 at 14:51
  • That's the nature of an asynchronous system: one operation reads, another writes, another may close, and all of them may execute in different threads. – Arkaitz Jimenez Oct 01 '09 at 14:53
  • 2
    Rethink your approach. Typically there is a single thread that accepts connections and spawns threads to handle them. Those threads are then responsible for closing the connection, and no other thread gets to even access that connection. – Omry Yadan Oct 01 '09 at 14:58
  • 1
    I don't think Omry is really wrong - his point is that multiple threads accessing the same socket invite a world of pain. Having all access to 1 socket handled by 1 thread does not mean you cannot have 1 thread handling multiple sockets. – Steg Oct 01 '09 at 16:03
  • 1
    Well, with NIO (non-blocking I/O, select/poll and friends) you don't even need more than one thread to deal with the sockets anyway. – Omry Yadan Oct 01 '09 at 16:06

6 Answers

3

Ok, found the answer.

The best way here is to call accept(), which returns the lowest available fd, then duplicate it to a number you control, e.g. dup2(6, 1000), and close(6). You now have control over the fd range you use.

The next accept() will again return 6 or similar; we then dup2(6, 999), keep decreasing like that, and reset the counter if it gets too low.

Since accepting is always done in the same thread, and dup2() and close() aren't expensive compared to the accept() that already happens there, this is perfect for my needs.
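A minimal sketch of that idea (the fd range, the helper name, and the single accepting thread are assumptions for illustration, not code from the answer):

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical sketch: the single accepting thread moves every new
 * connection into a private, descending fd range, so the low fds the
 * kernel keeps reusing never reach the rest of the application. */
#define FD_RANGE_TOP    1000
#define FD_RANGE_BOTTOM  100

static int next_private_fd = FD_RANGE_TOP;

int accept_into_private_range(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);  /* kernel returns the lowest free fd */
    if (fd < 0)
        return -1;

    if (next_private_fd <= FD_RANGE_BOTTOM)
        next_private_fd = FD_RANGE_TOP;      /* wrap the private range around */

    int private_fd = next_private_fd--;
    if (dup2(fd, private_fd) < 0) {          /* move the connection into our range */
        close(fd);
        return -1;
    }
    close(fd);                               /* release the low, reusable fd */
    return private_fd;
}
```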

Arkaitz Jimenez
  • 22,500
  • 11
  • 75
  • 105
2

How do you manage the sockets? It sounds like you have multiple threads, any of which can:

  1. accept an incoming connection
  2. close an existing connection
  3. make a new outgoing connection

It sounds like you need a way to mediate access to the various sockets floating around. Have you considered associating each socket with a mutex that prevents closing the socket while it's still in use, or maybe putting each socket descriptor in a struct with an atomic reference count that prevents other threads from closing it until all threads are done using it?
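A rough sketch of the reference-counting variant, assuming C11 atomics and that the thread requesting the close holds a reference of its own (all names here are illustrative):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* Illustrative sketch: each descriptor lives in a struct with an atomic
 * reference count; close() only happens when the last user lets go. */
struct ref_socket {
    int         fd;
    atomic_int  refs;             /* threads currently using the socket */
    atomic_bool close_requested;  /* someone asked for the socket to go away */
};

void socket_acquire(struct ref_socket *s)
{
    atomic_fetch_add(&s->refs, 1);
}

void socket_release(struct ref_socket *s)
{
    /* Last reference gone and a close was requested: only now is it safe
     * to close, and only now may the kernel reuse the fd number. */
    if (atomic_fetch_sub(&s->refs, 1) == 1 &&
        atomic_load(&s->close_requested))
        close(s->fd);
}

void socket_request_close(struct ref_socket *s)
{
    /* Assumes the caller holds a reference, so no extra locking is needed. */
    atomic_store(&s->close_requested, true);
    socket_release(s);
}
```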

Robert S. Barnes
  • 39,711
  • 30
  • 131
  • 179
  • I could do it that way, but adding mutexes to this almost lock-free system is something I try to avoid. The best thing would be to guarantee that accept() won't hand out fds sequentially, but I can't see that happening. – Arkaitz Jimenez Oct 01 '09 at 15:17
  • @Arkaitz - then the second solution I mentioned might be good for you. Keep the socket in a struct with a reference counter and just don't let any threads close a socket with a reference count above zero. No lock, and the socket doesn't get closed until all the threads are done with it. – Robert S. Barnes Oct 01 '09 at 17:46
1

A socket is identified by a 5-tuple {local-addr, local-port, remote-addr, remote-port, proto}, so if you are able to use these properties instead of the fd for event/handler routing you can avoid the fd clash.

Another option would be to serialize all close()/accept() operations (priorities?) so that they cannot interleave.
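A small sketch of what routing by the connection tuple instead of by fd could look like (the struct, the function name, and the assumption of a TCP/IPv4 socket are all illustrative):

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Illustrative sketch: build a routing key from the connection tuple
 * instead of using the (reusable) file descriptor number. */
struct conn_key {
    struct in_addr local_addr, remote_addr;
    in_port_t      local_port, remote_port;
    int            proto;
};

int conn_key_from_fd(int fd, struct conn_key *key)
{
    struct sockaddr_in local, remote;
    socklen_t llen = sizeof local, rlen = sizeof remote;

    if (getsockname(fd, (struct sockaddr *)&local, &llen) < 0 ||
        getpeername(fd, (struct sockaddr *)&remote, &rlen) < 0)
        return -1;

    memset(key, 0, sizeof *key);
    key->local_addr  = local.sin_addr;
    key->local_port  = local.sin_port;
    key->remote_addr = remote.sin_addr;
    key->remote_port = remote.sin_port;
    key->proto       = IPPROTO_TCP;   /* assuming a TCP socket here */
    return 0;
}
```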

catwalk
  • 6,340
  • 25
  • 16
  • 1
    I'd still need mutexes to protect those structs, which would slow the system down further; since this is an edge case I'm trying to avoid locking there. – Arkaitz Jimenez Oct 01 '09 at 15:18
1

Great question! I didn't even realize that such a problem could occur.

The only answer that I can think of is that you mustn't use close() to signal that a socket is terminated. One solution is to use shutdown() to terminate the connection. You could then close() the socket safely by employing reference counting.

TrayMan
  • 7,180
  • 3
  • 24
  • 33
  • The `close` function decrements the socket's ref count in the underlying OS; `shutdown` forces a close and sends an EOF / FIN to the peer. See this SO post for more details: http://stackoverflow.com/questions/409783/socket-shutdown-vs-socket-close/598759#598759 – Robert S. Barnes Oct 01 '09 at 17:50
  • As mentioned in the link you posted, shutdown() does not close the socket, it merely terminates communication. shutdown() would cause any pending (or future) I/O operation to fail, but would not deallocate the FD, so the described problem would not occur. When an operation fails, the thread doing the operation decrements a reference count (which had been incremented when the thread received the socket); if it reaches zero, the thread closes the socket. Obviously this refcount would be something implemented in the program, not the OS refcount. – TrayMan Oct 02 '09 at 04:45
  • You don't even need to worry about a reference count, just close the socket if the read operation fails. – atomice Oct 02 '09 at 15:40
  • @TrayMan - The OP said that part of the problem is that pending operations need to complete before the connection is torn down. Your solution is incomplete because it doesn't address this part of the problem. – Robert S. Barnes Oct 12 '09 at 11:09
  • That's not how I read it. But to do that, you just need to drop the shutdown part from my solution. The last thread operating on the socket will then close it. – TrayMan Oct 13 '09 at 18:50
1

I would still be careful about using dup2() to a well-known fd value. Remember that dup2() closes the target descriptor before duplicating; that could conflict with some unrelated thread doing unrelated I/O if you ever reach 1000 open files.

If I were you, given the constraints you're insisting upon, I would use dup() (not dup2()) inside of a mutex. (Maybe per-fd mutexes if you're that concerned about it.)

asveikau
  • 39,039
  • 2
  • 53
  • 68
  • The thing is, accept() and dup2() and the rest always execute on the same thread. So I could go from MAX_FD down to 10 or so and then start again; the possibility of dup2()ing an fd to MAX_FD just after MAX_FD was closed because we wrapped around isn't really worth taking into account. – Arkaitz Jimenez Oct 02 '09 at 06:38
  • But can you prove that you're not doing unrelated file opens? Or maybe some library is doing them on your behalf? And can you prove that the count of naturally-occurring FDs won't eventually reach 1000? Maybe you can guarantee these things, but this sort of thing would make me nervous with your approach. – asveikau Oct 05 '09 at 18:46
0

Keep a count of the pending operations (reads, writes, etc.) for each socket, and also track whether there is a pending close request on the socket. Where you would previously have called close(), first check whether there are any pending operations. If there are, call shutdown() instead, and only call close() once the pending-operation count reaches 0.
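A sketch of that decision logic, assuming C11 atomics and an application-level bookkeeping struct (the names are made up for illustration):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative sketch: defer close() until no operations are pending;
 * shutdown() makes the in-flight operations fail without freeing the
 * fd number, so the kernel cannot hand it to a new connection yet. */
struct tracked_socket {
    int         fd;
    atomic_int  pending_ops;    /* scheduled reads/writes not yet finished */
    atomic_bool close_pending;  /* close requested while operations were in flight */
};

void request_close(struct tracked_socket *s)
{
    if (atomic_load(&s->pending_ops) == 0) {
        close(s->fd);                     /* nothing in flight: close immediately */
    } else {
        atomic_store(&s->close_pending, true);
        shutdown(s->fd, SHUT_RDWR);       /* fail the pending operations */
    }
}

void operation_finished(struct tracked_socket *s)
{
    if (atomic_fetch_sub(&s->pending_ops, 1) == 1 &&
        atomic_load(&s->close_pending))
        close(s->fd);                     /* last pending operation done */
}
```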

atomice
  • 3,062
  • 17
  • 23