How to be sure if a file descriptor has already been closed?

Question

Before voting to close, please read on. I know that there are similar questions (:

Here's my situation: I have a multithreaded application with, let's say, 10 threads. All of them read from the same file descriptor (it's actually a socket). In a very rare situation, when a critical error occurs, the socket should be shut down by one of the threads, and any of these threads can do this. If closing the socket fails, _Exit( FAILURE ) is executed (I know this sounds like an awful design or a problem in the code, but it actually isn't; it's caused by a bug in a closed-source 3rd-party lib).

And here's the problem: it's possible for all of them to try to shut down the socket at the same time. One closes it, but the others cannot (shutdown returns -1, as the socket is already closed), so the bad _Exit( FAILURE ) is executed and that ruins everything.

Obviously, I need an additional check for whether the socket is already closed (it's possible for all threads to fail to shut down the socket for some reason, and then at least one must execute _Exit; that's why checking the return code of shutdown is not enough).

Well, I found this question and it looks like that's exactly what I'm trying to do. But I know that any kind of system call takes time (of course), and it's OS-dependent when exactly the socket will be closed.

And here's the question: how can I tell the difference between a socket that is already closed and one that cannot be closed for some reason? If one thread has closed the socket and another thread tries to shut it down at the same time and fails, will a check with fcntl reliably tell me that the socket was already closed?

I also saw other answers like "you can use select or poll", but those are still system calls and I don't know if they would be a better choice. I also don't know exactly how to use them, but that's not a big deal, I guess.

Thanks!

I can also check the errno set by shutdown, but what does "connected" mean? And what is the difference between "not connected" and "not a valid descriptor"?

ENOTCONN
    The specified socket is not connected.

Also, what bothers me is that the FD I'm trying to close could be invalid, as I take it from /proc/net/tcp mapped with /proc/PID/fd, and I don't know if all the files will look the way they look on my OS (the OS will for sure be RHEL4 or RHEL5, if that matters).

Doh! It's damn long, but I can't explain it any shorter.

score 4 · Accepted Answer · answered Apr 18 '11 at 18:05

I assume you're saying it's possible for your application to reasonably continue after shutting down the socket?

It seems that a better approach would be to have a mediator thread that gets notified of socket close requests from any of the worker threads, notifies the other threads that the socket is now dead, and takes care of closing the socket itself. This eliminates the worries about the reason for failure because it's all handled in a single thread.
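A minimal sketch of the mediator idea, assuming POSIX threads. All names here (`sock_mediator`, `mediator_request_close`, `mediator_socket_alive`, `mediator_run`) are made up for illustration and are not from the asker's code. Workers only ever ask for the close; the mediator thread is the single place that calls shutdown() and close(), so there is exactly one return code to judge.

```c
#include <pthread.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical shared state: one socket, one mediator, many workers. */
struct sock_mediator {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    int  fd;               /* -1 once the mediator has closed it */
    bool close_requested;  /* set by any worker on a critical error */
    int  close_result;     /* return value of close(), for later inspection */
};

void mediator_init(struct sock_mediator *m, int fd)
{
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->wake, NULL);
    m->fd = fd;
    m->close_requested = false;
    m->close_result = 0;
}

/* Called by any worker; harmless if called repeatedly or concurrently. */
void mediator_request_close(struct sock_mediator *m)
{
    pthread_mutex_lock(&m->lock);
    m->close_requested = true;
    pthread_cond_signal(&m->wake);
    pthread_mutex_unlock(&m->lock);
}

/* Workers check this before touching the socket. */
bool mediator_socket_alive(struct sock_mediator *m)
{
    pthread_mutex_lock(&m->lock);
    bool alive = (m->fd != -1) && !m->close_requested;
    pthread_mutex_unlock(&m->lock);
    return alive;
}

/* Mediator thread body: the only code that shuts down and closes the fd. */
void *mediator_run(void *arg)
{
    struct sock_mediator *m = arg;

    pthread_mutex_lock(&m->lock);
    while (!m->close_requested)
        pthread_cond_wait(&m->wake, &m->lock);
    shutdown(m->fd, SHUT_RDWR);      /* result ignored in this sketch */
    m->close_result = close(m->fd);  /* the one result worth judging */
    m->fd = -1;
    pthread_mutex_unlock(&m->lock);
    return NULL;
}
```

A worker that hits the critical error would call `mediator_request_close(&m)` and carry on; whether to call _Exit is then decided once, from `close_result`. Note that a worker can still be inside a read when the close happens, so workers should treat a failing read as "socket is dead" rather than as a new critical error.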

Mark B

Yes, the application can continue working, after shutting down the socket (too long for explanation). Hm, the idea with mediator thread sounds good to me. But it's gonna be rather hard to implement and support in the current architecture of the app. Anyway, I'll think about that, thanks! Other ideas are welcome, btw (: – Kiril Kirov Apr 18 '11 at 18:18
This turned out to be the best solution, really. 10x – Kiril Kirov Apr 19 '11 at 12:03

score 3 · Answer 2 · answered Apr 18 '11 at 23:24

Whenever you have a resource that's being used by more than one thread and which could be deallocated by one of them, you must protect all access with locks. Otherwise you will have dangerous race conditions. I would use a read-write lock on the int containing the file descriptor. Any thread wanting to use the fd should hold a read lock for the duration it uses it, and any thread wanting to change the fd variable (e.g. close it and replace it with -1 to prevent further use) should hold a write lock.

Basically this is the same as use of dynamically allocated memory and free.

R.. GitHub STOP HELPING ICE

Yep, I know this, but this is a basic answer about thread synchronization in general. Suppose I do this locking; what about other processes that can change these files? `/proc/` is a system folder and I can't lock anything inside it, can I? This makes parsing the `tcp` file (or whatever) dangerous and could cause a race condition. Right? So I can't avoid race conditions, I can only reduce their probability, right? – Kiril Kirov Apr 19 '11 at 06:58
Parsing `tcp` (under `/proc`) for any purpose but informing the user sounds really dubious... – R.. GitHub STOP HELPING ICE Apr 19 '11 at 11:19
This is the correct answer. You have a shared resource, shared by multiple threads. You cannot allow one thread to deallocate the resource while another thread is using, might be using, or might try in the future to use, that resource. How you solve that is up to you. You can use a counter protected by a mutex. A thread must bump the counter to use the resource. A thread can only free the resource when the counter is zero. You can call 'shutdown' while another thread is or might be using the handle because that doesn't deallocate it. But 'close' does. So you must protect it. – David Schwartz Aug 14 '11 at 19:42

score 1 · Answer 3 · answered Apr 18 '11 at 17:57

Checking the errno is by far your best option. From shutdown(2) I can see:

EBADF s is not a valid descriptor
ENOTCONN The specified socket is not connected
ENOTSOCK s is a file, not a socket.

The way I see it: EBADF means the descriptor has already been closed (or was never valid), and ENOTCONN means there is no connection behind this descriptor (no three-way handshake and all that jazz).

Best way to find out: do a perror(3) after the call to shutdown fails and see what it says.
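Something along these lines could turn the errno check into code. This is only a sketch: the enum and the `try_shutdown` helper are hypothetical names, and the mapping covers just the three errors quoted above.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical classification of what shutdown() reported. */
enum shutdown_result {
    RES_OK,             /* shutdown() succeeded */
    RES_BAD_FD,         /* EBADF: not a valid (open) descriptor */
    RES_NOT_CONNECTED,  /* ENOTCONN: a socket, but with no connection */
    RES_NOT_SOCKET,     /* ENOTSOCK: an open fd that is not a socket */
    RES_OTHER           /* anything else: treat as a real failure */
};

enum shutdown_result try_shutdown(int fd)
{
    if (shutdown(fd, SHUT_RDWR) == 0)
        return RES_OK;

    switch (errno) {
    case EBADF:    return RES_BAD_FD;
    case ENOTCONN: return RES_NOT_CONNECTED;
    case ENOTSOCK: return RES_NOT_SOCKET;
    default:       return RES_OTHER;
    }
}
```

Keep in mind that `RES_BAD_FD` only says the number is not an open descriptor right now. In a multithreaded process the number can be handed out again by a later open or socket call, so on its own this check cannot prove that "my" socket was the one that got closed.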

Cheers

nc3b

I thought it would be my best option, but do you think that I can rely on the format of `/proc/net/tcp` and `/proc/PID/fd`? The good thing is that the only possible OSes are RHEL{4, 5}. – Kiril Kirov Apr 18 '11 at 18:02
@Kiril Kirov It's not a good idea. Suppose you have each thread check `/proc/PID/fd`. So one thread checks, sees it's there and then the scheduler stops it. Then another thread sees the `fd` is there and closes it. Now the first thread is resumed and it attempts to close it. Voila, **race condition**. – nc3b Apr 18 '11 at 18:07
@nc3b - argh, I haven't thought about that.. Thanks, I'll try to figure something out. – Kiril Kirov Apr 18 '11 at 18:20
If there's a possibility of other file descriptors being opened, this approach is not reliable; you could end up probing the wrong fd. – R.. GitHub STOP HELPING ICE Apr 18 '11 at 23:20
@nc3b - you could check http://stackoverflow.com/questions/5713451/is-it-safe-to-parse-a-proc-file if you're interested – Kiril Kirov Apr 19 '11 at 12:02

score 0 · Answer 4 · answered Nov 25 '12 at 13:22

The correct answer was the one by R.., which dealt with the threading problems. As for the question 'how to find out the status of an fd?': use fstat().

Per POSIX: it returns -1 on any problem with a file descriptor or socket, and errno is set to EBADF for a file descriptor or socket that is already closed. The stat family of calls is specifically meant to test files (stat), links (lstat), and file descriptors (fstat).
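As a small sketch of that check (the helper name `fd_is_open` is my own, not a standard function):

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>

/* True if fd currently refers to an open descriptor in this process. */
bool fd_is_open(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == 0)
        return true;
    /* Only EBADF means "not open"; any other error implies the fd exists. */
    return errno != EBADF;
}
```

This answers "is this number open right now?" and nothing more. Between the check and whatever you do next, another thread may close the descriptor or open a new one that reuses the same number, so in the multithreaded case it still needs the locking described in R..'s answer.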

Other calls like close and shutdown also do return errors on already closed fd's. But their purpose is something else, and testing for a connected socket is a side effect.

Posted because the other syscalls mentioned are not meant for testing file descriptors, which is part of the original question as I read it.

score 0 · Answer 5 · answered Apr 18 '11 at 18:27

Have you thought of protecting the closing of the socket with a mutex? That way you could make sure that only one thread tries to close the socket at all. You would have to make some system calls, namely pthread_mutex_init on initialization and pthread_mutex_trylock before actually closing the socket. However, these calls should be optimized for fast returns.

Another approach that avoids system calls would be to implement a mutex yourself, so that only one of the threads actually gets to close the socket. Of course, you would have to adapt one of the existing mutual-exclusion algorithms so that the later threads do not wait but simply continue execution.
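One way to get the "later threads do not wait" behaviour without writing a mutual-exclusion algorithm by hand is a C11 atomic flag. To be clear, this swaps in `atomic_flag` from `<stdatomic.h>` for the hand-rolled mutex described above, and it assumes a C11 compiler (which may not be what ships with older RHEL releases). The names `close_claimed`, `claim_close` and `close_once` are invented for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* One flag per socket; a single global is enough for this sketch. */
static atomic_flag close_claimed = ATOMIC_FLAG_INIT;

/* Returns true for exactly one caller: the first to set the flag. */
bool claim_close(void)
{
    return !atomic_flag_test_and_set(&close_claimed);
}

/* Returns 1 if this thread closed fd, 0 if another thread had already
 * claimed the close, and -1 if this thread's close() call failed. */
int close_once(int fd)
{
    if (!claim_close())
        return 0;                  /* lost the race: do not wait, just go on */
    return (close(fd) == 0) ? 1 : -1;
}
```

Threads that get 0 back know the close is someone else's job and can skip the _Exit path; only the thread that sees -1 has a real failure to report. This does not stop other threads from still using the descriptor while it is being closed, so it covers the "who closes" part only.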

Sure, but that would be part of the solution @Mark B suggests: a mediator thread. 10x anyway – Kiril Kirov Apr 18 '11 at 19:12
