How to cleanly interrupt a thread blocking on a recv call?

Question

I have a multithreaded server written in C, with each client thread looking something like this:

ssize_t n;
struct request request;

// Main loop: receive requests from the client and send responses.
while(running && (n = recv(sockfd, &request, sizeof(request), 0)) == sizeof(request)) {
    // Process request and send response.
}
if(n == -1)
    perror("Error receiving request from client");
else if(n != sizeof(request))
    fprintf(stderr, "Error receiving request from client: Incomplete data\n");

// Clean-up code.

At some point, a client may meet certain criteria that require it to be disconnected. If the client is regularly sending requests, this is fine because it can be informed of the disconnection in the responses. However, sometimes clients take a long time to send a request, so the client threads end up blocking in the recv call, and the client does not get disconnected until the next request/response.

Is there a clean way to disconnect the client from another thread while the client thread is blocking in the recv call? I tried close(sockfd) but that causes the error Error receiving request from client: Bad file descriptor to occur, which really isn't accurate.

Alternatively, is there a better way for me to be handling errors here?

Possible duplicate of http://stackoverflow.com/questions/6910335/interrupting-syscalls-in-threads-on-linux — Drew McGowen, Jul 23 '13 at 22:25
@DrewMcGowen Thanks, I had a look for similar threads but I managed to miss that one. Still not sure it's quite the same question though, as completely killing the thread isn't really what I would consider clean. — DanielGibbs, Jul 23 '13 at 22:37
It does mention that it uses any installed signal handlers, so you could perhaps use that to clean up — Drew McGowen, Jul 23 '13 at 22:38
@DanielGibbs: "kill" is an unfortunate name for a function whose purpose is to send signals. Perhaps in an alternate reality, `pthread_kill()` is called `pthread_send_signal()`. But yes, you can use a signal to cleanly interrupt a `recv()`. — Dietrich Epp, Jul 23 '13 at 22:47
You close the fd, the recv call returns with an error, 'Bad file descriptor' that fairly accurately describes what has happened, (the fd has been closed from another thread). What more could you possibly want? — Martin James, Jul 23 '13 at 23:07
@MartinJames I'd like to be able to distinguish between planned disconnection, and an actual error (where I did not expect the socket to be closed). — DanielGibbs, Jul 23 '13 at 23:09
Planned disconnection does not generate that particular error/exception. The peer closing the socket causes the recv() call to return having read 0 bytes. Stop worrying too much and just close the soddin' fd :) — Martin James, Jul 23 '13 at 23:12
@MartinJames Yes it does. If I call `close(sockfd)` from another thread, `recv` returns -1 and prints the message `Error receiving request from client: Bad file descriptor`. — DanielGibbs, Jul 23 '13 at 23:14
@MartinJames: That solution is absolutely wrong. It has extremely dangerous race conditions. Suppose `recv` got interrupted by a signal and the `close` happens while the signal handler is running. After the signal handler returns, `recv` gets restarted on the same fd number. If you're lucky, the fd is invalid and it returns with `EBADF`. But if you're very unlucky, another thread opened a new fd, got the same fd number, and you just stole that other thread's input. — R.. GitHub STOP HELPING ICE, Jul 23 '13 at 23:16
@DanielGibbs yes, OK. Apart from closing the fd from another thread, what else generates that error/exception? — Martin James, Jul 23 '13 at 23:16
A safe replacement for the `close` approach, however, would be to use `shutdown` instead. Basically, it half-closes or fully-closes the TCP connection but leaves the file descriptor around. — R.. GitHub STOP HELPING ICE, Jul 23 '13 at 23:17
@R.. don't allow the server<>client threads to handle signals. Have you no other process threads to handle the signals? I've been using the absolutely wrong solution for 35 years. No problems yet. — Martin James, Jul 23 '13 at 23:20
@MartinJames: Syscall restarting can also happen if the process is suspended with `SIGSTOP`, and possibly in other ways, such as execution under a debugger or strace. Normally this restarting should be transparent to the application, but if you do bad things like closing a file descriptor while using it, the restarting might have observable (and very undesirable) effects. — R.. GitHub STOP HELPING ICE, Jul 24 '13 at 03:54

Duck · Accepted Answer · 2013-07-23T23:03:15.483

14

So you have at least these possibilities:

(1) pthread_kill will blow the thread out of recv with errno == EINTR and you can clean up and exit the thread on your own. Some people think this is nasty. Depends, really.

(2) Make your client socket(s) non-blocking and use select to wait for input for a specific period of time before checking whether a switch shared between the threads has been set to indicate they should shut down.

(3) In combination with (2), have each thread share a pipe with the master thread. Add it to the select. If it becomes readable and contains a shutdown request, the thread shuts itself down.

(4) Look into the pthread_cancel mechanism if none of the above (or variations thereof) meet your needs.
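As a rough illustration of option (3), here is a minimal sketch of waiting on both the client socket and the read end of a pipe shared with the master thread. The names `wait_for_request`, `demo_wait_for_request`, and `stop_fd` are hypothetical and not from the question; the demo uses an `AF_UNIX` socketpair as a stand-in for a real client connection, and for simplicity it leaves the socket in blocking mode rather than making it non-blocking as option (2) suggests.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: block until the client socket has data or the
 * master thread has written to the stop pipe.
 * Returns 1 if sockfd is readable, 0 if a stop was requested, -1 on error. */
static int wait_for_request(int sockfd, int stop_fd)
{
    /* select() can only watch descriptors below FD_SETSIZE. */
    assert(sockfd >= 0 && sockfd < FD_SETSIZE);
    assert(stop_fd >= 0 && stop_fd < FD_SETSIZE);

    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(sockfd, &readfds);
        FD_SET(stop_fd, &readfds);
        int maxfd = sockfd > stop_fd ? sockfd : stop_fd;

        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: wait again */
            return -1;
        }
        if (FD_ISSET(stop_fd, &readfds))
            return 0;               /* the stop request takes priority */
        if (FD_ISSET(sockfd, &readfds))
            return 1;
    }
}

/* Small self-check exercising both outcomes; returns 0 on success. */
static int demo_wait_for_request(void)
{
    int sv[2], stop_pipe[2];
    char byte = 'x';

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (pipe(stop_pipe) == -1)
        return -1;

    /* Case 1: the "client" sends data, so the socket is reported readable. */
    if (write(sv[1], &byte, 1) != 1)
        return -1;
    int on_data = wait_for_request(sv[0], stop_pipe[0]);
    if (read(sv[0], &byte, 1) != 1)     /* drain the pending byte */
        return -1;

    /* Case 2: the "master" writes a stop request to the pipe. */
    if (write(stop_pipe[1], &byte, 1) != 1)
        return -1;
    int on_stop = wait_for_request(sv[0], stop_pipe[0]);

    close(sv[0]);
    close(sv[1]);
    close(stop_pipe[0]);
    close(stop_pipe[1]);
    return (on_data == 1 && on_stop == 0) ? 0 : 1;
}
```

In a client thread, the idea would be to call `wait_for_request` before each `recv` and leave the loop when it returns 0. Note that with a blocking socket, a `recv` for a full `struct request` could still block if only part of the request has arrived.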

edited Jul 23 '13 at 23:03

answered Jul 23 '13 at 22:43

Duck

Right, so if I call `pthread_kill` with say `SIGUSR1`, and then check if `errno` is `EINTR`, then that should work? Will I need to add a signal handler to the thread for `SIGUSR`? If so, what should it do? – DanielGibbs Jul 23 '13 at 22:52
The signal handler sets a switch. When you get EINTR you check the switch and if it on then you clean up and get out. I answered a question somewhat similar to this the other day which more or less covers the basics http://stackoverflow.com/a/17607149/63743 – Duck Jul 23 '13 at 22:59
`pthread_kill` will only cause `EINTR` if you install the signal handler without the `SA_RESTART` option. This approach also has race conditions that result in the signal getting lost, but you may not care about that, and it's not "library-safe" in the sense that you have to modify the program's global state (signal disposition) to use it. `pthread_cancel` is really the correct solution. – R.. GitHub STOP HELPING ICE Jul 23 '13 at 23:09
Ah yes, I see. How would I go about setting a switch on a per-thread basis? – DanielGibbs Jul 23 '13 at 23:10
You could make the switch thread-local, but really, this is getting into the realm of bad hacks. `pthread_cancel` does exactly what you want; just setup the right cleanup handlers. The only time `pthread_cancel` would not be the right solution is if you needed to break out of the `recv` but keep the thread running. – R.. GitHub STOP HELPING ICE Jul 23 '13 at 23:14
The signal handler runs in the context of the current thread so you can get around the global but R is 100% correct that this becomes ugly pretty quick. I was hoping you would choose one of the other options. I am not as adamant as R that `pthread_cancel` is always the best choice but this is definitely the worst in terms of magnifying complexity and side effects. – Duck Jul 23 '13 at 23:36
Sorry, I forgot to answer your question. The easiest (if not completely foolproof) way is to not set a per-thread switch. If you are sure nothing else is going to send SIGUSR1 then your main thread is in control. It does a pthread_kill immediately followed by pthread_join, thereby making the series of shutdowns synchronous. Each worker thread has to reset the global switch off before exiting so it can be used by the next. Again, not a perfect solution. – Duck Jul 24 '13 at 00:14
@Duck: If the switch is a global variable, you risk having the wrong thread interpret it if it somehow wakes up from `recv` by some means other than being the target of `pthread_kill`. I think that's a very fragile design. If you're going to go with the switch, just put `_Thread_local` (or `__thread` on gcc) on it and be safe. – R.. GitHub STOP HELPING ICE Jul 24 '13 at 03:56
@DanielGibbs - R is correct. There is that small window of opportunity with the global and what he suggests is safer. – Duck Jul 24 '13 at 04:44
What does "clean up" mean in the first comment? – Bionix1441 Dec 02 '16 at 16:39
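To make "clean up" concrete for the `pthread_cancel` route discussed in these comments, here is a minimal sketch using a cancellation cleanup handler. The names `struct client`, `cleanup_client`, `client_thread`, and `demo_cancel` are hypothetical, and the `AF_UNIX` socketpair is only a stand-in for a real client connection. It relies on `recv` being a cancellation point, so a thread blocked there acts on the cancel request and runs the handlers it has pushed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-client state. */
struct client {
    int sockfd;
    int cleaned_up;     /* set by the cleanup handler */
};

/* Runs when the thread is cancelled, or when it is popped with a
 * non-zero argument on the normal exit path. */
static void cleanup_client(void *arg)
{
    struct client *c = arg;
    close(c->sockfd);
    c->cleaned_up = 1;
}

static void *client_thread(void *arg)
{
    struct client *c = arg;
    char buf[64];

    assert(c != NULL);
    pthread_cleanup_push(cleanup_client, c);
    /* recv() is a cancellation point: a pending or incoming cancel
     * request takes effect here instead of leaving the thread blocked. */
    while (recv(c->sockfd, buf, sizeof buf, 0) > 0) {
        /* A real server would process the request here. */
    }
    pthread_cleanup_pop(1);     /* also clean up when the loop ends normally */
    return NULL;
}

/* Small self-check; returns 0 if the thread was cancelled and cleaned up. */
static int demo_cancel(void)
{
    int sv[2];
    pthread_t tid;
    void *ret = NULL;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    struct client c = { .sockfd = sv[0], .cleaned_up = 0 };

    if (pthread_create(&tid, NULL, client_thread, &c) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)       /* planned disconnection */
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;

    close(sv[1]);
    return (ret == PTHREAD_CANCELED && c.cleaned_up) ? 0 : 1;
}
```

Because the thread exits through cancellation rather than through an error return from `recv`, the planned disconnection never reaches the `perror` branch in the question's loop.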

score 4 · Answer 2 · answered Jul 24 '13 at 00:34

Shut down the socket for input from another thread. That will cause the reading thread to receive an end of stream (EOS), which should cause it to close the socket and terminate if it is correctly written.
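A minimal sketch of this approach, assuming the literal `shutdown(sockfd, SHUT_RD)` call is what is meant. The names `struct client`, `client_thread`, and `demo_shutdown` are hypothetical, and an `AF_UNIX` socketpair stands in for the real client connection. On Linux I would expect the blocked `recv` to return 0; whether an already-blocked call is woken may differ on other platforms.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-client state. */
struct client {
    int sockfd;
    ssize_t result;     /* what recv() returned in the client thread */
};

static void *client_thread(void *arg)
{
    struct client *c = arg;
    char buf[64];

    assert(c != NULL);
    /* Blocks until data arrives, the stream ends, or an error occurs. */
    c->result = recv(c->sockfd, buf, sizeof buf, 0);
    return NULL;
}

/* Small self-check; returns the value recv() produced in the thread,
 * or -1 if the setup itself failed. */
static int demo_shutdown(void)
{
    int sv[2];
    pthread_t tid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    struct client c = { .sockfd = sv[0], .result = -2 };

    if (pthread_create(&tid, NULL, client_thread, &c) != 0)
        return -1;

    /* Planned disconnection: stop further receives on the socket but
     * keep the descriptor itself valid, so there is no fd-reuse race. */
    if (shutdown(sv[0], SHUT_RD) == -1)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    /* Only now, with no thread using it, is the descriptor closed. */
    close(sv[0]);
    close(sv[1]);
    return (int)c.result;
}
```

With this pattern the client thread sees an ordinary end of stream (a return value of 0) instead of `EBADF`, which makes it easier to tell a planned disconnection apart from a genuine error.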

user207421

Note: I think when @EJP says 'shutdown' he means the literal call to `shutdown` i.e. a call to `close` will NOT interrupt the thread. – chacham15 Dec 17 '14 at 01:06
@chacham15 'shutdown the socket for input' only has one meaning. I didn't use the word 'close' anywhere. You're not really helping. – user207421 Dec 17 '14 at 01:44
It is a natural assumption to think that close would cause a shutdown. Therefore, I wrote my comment to clarify for other people who might misunderstand as I did. No need to be hostile. – chacham15 Dec 17 '14 at 02:48
@chacham15 `close()` *does* cause a shutdown, not that it's relevant. Your point escapes me. – user207421 Aug 18 '16 at 02:01
`shutdown()` is not guaranteed to unblock a blocking socket call that is already in progress. Sometimes it does, sometimes it doesn't. It depends on the platform. – Remy Lebeau Aug 02 '17 at 23:43
Correct answer, might need more context from this answer https://stackoverflow.com/a/62356967/70405 . Don't mind the boost involvement, solution is at socket API level. – Alex Jun 13 '20 at 09:35
@RemyLebeau So please provide a platform where it doesn't. – user207421 Jun 13 '20 at 09:41

numzero · Answer 3 · 2019-01-30T22:46:35.137

To interrupt the thread, make the socket non-blocking (set O_NONBLOCK using fcntl) and then signal the thread with pthread_kill. This way, recv will fail with either EINTR if it was sleeping, or EAGAIN or EWOULDBLOCK if it wasn’t (also maybe if SA_RESTART is in effect, didn’t check). Note that the socket doesn’t need to, and actually should not, be non-blocking before that. (And of course the signal needs to be handled; empty handler is sufficient).

To be sure to catch the stop-signal but not anything else, use a flag; there are things that may go wrong. For example, recv may fail with EINTR on some spurious signal. Or it may succeed if there was some data available, effectively ignoring the stop request.

And what not to do:

Don’t use pthread_kill alone or with any plain check. The signal may arrive right before the recv syscall is issued, too early to interrupt it but after all the checks.
Don’t close the socket. That may not even work, and as @R.. pointed out, is dangerous as the socket file descriptor may be reused between close and recv (unless you’re sure nothing opens file descriptors).
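The recipe above (flag, then `O_NONBLOCK`, then `pthread_kill`) could be sketched roughly as follows. The names `struct client`, `wakeup_handler`, `request_stop`, and `demo_stop_by_signal` are hypothetical, `SIGUSR1` is an arbitrary choice of signal, and the `AF_UNIX` socketpair is a stand-in for a real client connection. The handler is installed without `SA_RESTART` so that a sleeping `recv` fails with `EINTR`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-client state. */
struct client {
    int sockfd;
    atomic_int stop;        /* set by the controlling thread */
    int stopped_on_flag;    /* set by the client thread when it saw the flag */
};

/* Empty handler: its only job is to make a sleeping recv() return EINTR. */
static void wakeup_handler(int signo) { (void)signo; }

static void *client_thread(void *arg)
{
    struct client *c = arg;
    char buf[64];

    assert(c != NULL);
    for (;;) {
        ssize_t n = recv(c->sockfd, buf, sizeof buf, 0);
        if (atomic_load(&c->stop)) {    /* checked after every return from recv */
            c->stopped_on_flag = 1;
            break;
        }
        if (n > 0)
            continue;                   /* a real server would process the request */
        if (n == -1 && errno == EINTR)
            continue;                   /* unrelated signal: retry */
        break;                          /* end of stream or a genuine error */
    }
    return NULL;
}

/* Order matters: flag first, then non-blocking mode, then the signal. */
static int request_stop(struct client *c, pthread_t tid)
{
    atomic_store(&c->stop, 1);
    int flags = fcntl(c->sockfd, F_GETFL, 0);
    if (flags == -1 || fcntl(c->sockfd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return pthread_kill(tid, SIGUSR1) == 0 ? 0 : -1;
}

/* Small self-check; returns 0 if the thread stopped because of the flag. */
static int demo_stop_by_signal(void)
{
    struct sigaction sa;
    int sv[2];
    pthread_t tid;

    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                    /* deliberately no SA_RESTART */
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    struct client c = { .sockfd = sv[0], .stopped_on_flag = 0 };
    atomic_init(&c.stop, 0);

    if (pthread_create(&tid, NULL, client_thread, &c) != 0)
        return -1;
    if (request_stop(&c, tid) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    close(sv[0]);
    close(sv[1]);
    return c.stopped_on_flag ? 0 : 1;
}
```

If the signal arrives before the thread has entered `recv`, the call then runs on a non-blocking socket and fails with `EAGAIN` or `EWOULDBLOCK` instead of sleeping, and the flag check still ends the loop.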

How could it *not* have been sleeping if it was blocked? And if it wasn't blocked there is no question to answer. You will only get EAGAIN/EWOULDBLOCK if there was a read timeout in effect and it expired. — user207421, Mar 03 '19 at 02:53
@user207421 There is no way to make sure it entered the blocking syscall already. It might be preempted right before issuing the call, for an arbitrarily long time. — numzero, Mar 10 '19 at 00:25
