15

I am looking for a way to guarantee that any time a thread locks a specific resource, it is forced to release that resource after a specific period of time (if it has not already released it). Envision a connection where you need to limit the amount of time any specific thread can own that connection for.

I envision this is how it could be used:

{
    std::lock_guard<std::TimeLimitedMutex> lock(this->myTimeLimitedMutex, timeout);
    try {
        // perform some operation with the resource that myTimeLimitedMutex guards. 
    }
    catch (const MutexTimeoutException& ex) {
        // perform cleanup
    }
}

I see that there is a timed_mutex that lets the program time out if a lock cannot be acquired. I need the timeout to occur after the lock is acquired.

There are already some situations where you get a resource that can be taken away unexpectedly. For instance, a TCP socket: once a socket connection is made, code on each side needs to handle the case where the other side drops the connection.

I am looking for a pattern that handles types of resources that normally time out on their own, but that need to be reset when they don't. This does not have to handle every type of resource.

Jay Elston
  • 5
    AFAIK only the opposite is provided. I believe you need to write your own. – NathanOliver Feb 08 '19 at 18:43
  • 4
    The tricky part will be deciding how the thread that currently owns the lock will be notified or otherwise realize that its lock now belongs to someone else. – François Andrieux Feb 08 '19 at 18:53
  • 4
    Isn't that a bit against the principle of "owning" a resource? Also, implementing a timeout when you have a lock might get awful in some cases performance-wise. Imagine a timeout of 500ms, but it would have taken 505ms to complete all the work. What happens then? – AlexG Feb 08 '19 at 18:54
  • 1
    I wonder how it could be implemented: if the thread is preempted by the OS in the critical section, then the mutex expires. What happens when the OS restarts the thread? – Oliv Feb 08 '19 at 18:54
  • On recent Linux kernels there is a way to execute code and restart the thread at a different rip address if the thread is preempted in the critical section. But I think only Linux implements it. – Oliv Feb 08 '19 at 18:57
  • 4
    This sounds tricky. It may be that you will have to put regular checks in the worker thread whether or not to terminate. I mean what if you only partially modified the state of the resource leaving it in an unpredictable condition for the preempting thread to take over from? – Galik Feb 08 '19 at 18:59
  • 2
    It seems a very bad idea, I don't think that any sane system supports something like this. It generates **a lot** of problems. It has design questions: in what granularity should the timeout be checked? What if the thread is currently sleeping? How to maintain consistency, if suddenly the mutex is stolen? Etc., etc. – geza Feb 08 '19 at 19:05
  • I think the usual way to handle this sort of problem is simply to write your thread's code in such a way that it will be unlikely to hold a lock for any significant amount of time -- i.e. only do O(1), non-blocking operations while holding the lock. (Implementing that may require redesigning your data structures so you can use critical-section-minimizing techniques like double-buffering, pointer-swapping, etc) – Jeremy Friesner Feb 08 '19 at 19:06
  • 1
    Actually it could be implemented using transactional memory. You loop over an atomic block and check at the end of each block if you still hold the mutex before starting a new atomic block. But it would not behave exactly as you expect. – Oliv Feb 08 '19 at 19:28
  • 4
    The thread that owns the lock periodically checks to see how long it has held the lock, and if it exceeds the threshold it relinquishes the lock and does whatever cleanup required. The concept is similar to cooperative multitasking, in contrast to the much more prevalent preemptive multitasking. Note: there's a reason preemptive multitasking is more prevalent, even though it is less efficient than cooperative multitasking. – Eljay Feb 08 '19 at 20:03
  • 1
    @Eljay the main reason why preemptive multitasking is prevalent is the same as why we have law enforcement instead of people just being kind to each other - human beings, as well as programs, are not very cooperative in general. In a more gentle world, all multitasking would be cooperative. – SergeyA Feb 08 '19 at 20:24
  • Is this perhaps an XY problem? What scenario do you find yourself in that makes you want to do this? (may deserve a different question) – Mr.Mindor Feb 08 '19 at 21:54
  • @Mr.Mindor Agreed. It sounds like what he really needs is watchdogs on the processes which can lock. Process hangs, watchdog kills process, resource is freed, everyone's happy. – Graham Feb 09 '19 at 00:29
  • @Eljay's suggestion has the advantage that a possible rollback could be implemented more reliably than in the case of a signal caught at a random moment. – max630 Feb 09 '19 at 01:54
  • A feature of real-time programing is that it requires tasks to be designed with hard limits in mind, and the assignment of tasks to computers/processors to be likewise mindful. Presumably they approach the problem from that end for a reason. – dmckee --- ex-moderator kitten Feb 09 '19 at 05:02
  • 1
    @JayElston as a comment on how the question can be improved, can you, please, add an example of a usage that you have in mind? What kind of resources were you thinking of holding for a limited time? If you add that information to your question, you might attract answers which would better address the question of how one can accomplish it. – Dmitry Rubanovich Feb 10 '19 at 06:36
  • I know I am getting a lot of comments about how this "can never work". I agree that this pattern is not for every mutex in general. But there are certain types of resources where the amount of time you get with it is limited. I have edited the question a bit. – Jay Elston Feb 10 '19 at 17:25
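The cooperative check suggested in the comments (the owner periodically checks how long it has held the lock and gives it up itself) could be sketched roughly as follows. This is only an illustration under stated assumptions: `DeadlineLock` and its `check()` method are hypothetical names, `MutexTimeoutException` is borrowed from the question, and nothing here preempts the owning thread. If the owner never calls `check()`, the lock is held as long as with a plain `std::lock_guard`.

```cpp
#include <chrono>
#include <mutex>
#include <stdexcept>

// Hypothetical exception type, named after the one used in the question.
struct MutexTimeoutException : std::runtime_error {
    MutexTimeoutException() : std::runtime_error("lock held past its deadline") {}
};

// Holds an ordinary mutex and remembers when the holder should give it up.
// Nothing is preempted: the owning thread has to call check() itself.
class DeadlineLock {
public:
    DeadlineLock(std::mutex& m, std::chrono::milliseconds budget)
        : lock_(m), deadline_(std::chrono::steady_clock::now() + budget) {}

    // Throws once the time budget is used up. The mutex stays locked while
    // the exception propagates and is released when this object is destroyed,
    // so a catch block inside the same scope still runs under the lock.
    void check() const {
        if (std::chrono::steady_clock::now() >= deadline_) {
            throw MutexTimeoutException();
        }
    }

private:
    std::lock_guard<std::mutex> lock_;
    std::chrono::steady_clock::time_point deadline_;
};
```

The owner would construct a `DeadlineLock`, call `check()` between units of work, and restore the resource to a consistent state in its `catch` block before the guard goes out of scope.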

5 Answers

33

This can't work, and it never will. It goes against every concept of ownership and atomic transactions. When a thread acquires the lock and performs two operations in a row, it expects them to become visible to the outside world atomically. In this scenario the transaction could easily be torn: the first part would be performed, but the second would not.

What's worse, since the lock would be forcibly removed, the partly executed transaction would become visible to the outside world before the interrupted thread had any chance to roll back.

This idea runs contrary to every school of multi-threaded thinking.

SergeyA
  • 5
    Some mechanisms for updating shared resources, such as compare-and-swap, can handle "rollbacks" without the interrupted thread having to do anything. Using locks for arbitration may offer better performance than having threads attempt updates which end up failing, but forcibly stealing an object from the thread that's updating it would merely hurt performance, not correctness. – supercat Feb 08 '19 at 23:39
  • Sergey -- this pattern does not have to work for every resource, and it should be able to work for certain types of resources. That is all that is needed for any design pattern -- that it work in the types of situations it is applicable for. – Jay Elston Feb 10 '19 at 17:30
16

I support SergeyA's answer. Releasing a locked mutex after a timeout is a bad idea and cannot work. Mutex stands for mutual exclusion, and this is a rock-hard contract which cannot be violated.

But you can do what you want:

Problem: You want to guarantee that your threads do not hold the mutex longer than a certain time T.

Solution: Never lock the mutex for longer than time T. Instead, write your code so that the mutex is locked only for the absolutely necessary operations. It is always possible to give such a time T (modulo the uncertainties and limits imposed by a multitasking, multiuser operating system, of course).

To achieve that (examples):

  • Never do file I/O inside a locked section.
  • Never call a system call while a mutex is locked.
  • Avoid sorting a list while a mutex is locked (*).
  • Avoid doing a slow operation on each element of a list while a mutex is locked (*).
  • Avoid memory allocation/deallocation while a mutex is locked (*).

There are exceptions to these rules, but the general guideline is:

  • Make your code slightly less optimal (e.g. do some redundant copying inside the critical section) to make the critical section as short as possible. This is good multithreading programming.

(*) These are just examples of operations where it is tempting to lock the entire list, do the work, and then unlock the list. Instead, it is advisable to take a local copy of the list and clear the original while the mutex is locked, ideally with the swap() operation offered by most STL containers, and then do the slow operation on the local copy outside the critical section. This is not always possible, but it is always worth considering. Sorting has quadratic complexity in the worst case for algorithms such as quicksort, and it usually needs random access to the entire list. It is useful to sort (a copy of) the list outside the critical section and later check whether elements need to be added or removed. Memory allocation also has considerable complexity behind it, so large numbers of allocations/deallocations should be avoided inside the critical section.
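A minimal sketch of the swap() idea, assuming a simple queue of integers. The names `WorkQueue`, `drain` and `drain_sorted` are made up for illustration and are not part of any library.

```cpp
#include <algorithm>
#include <mutex>
#include <vector>

// Hypothetical shared queue guarded by a mutex.
class WorkQueue {
public:
    void push(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(value);
    }

    // Take everything that is queued. The critical section is a single
    // constant-time swap, no matter how many elements are queued.
    std::vector<int> drain() {
        std::vector<int> local;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            local.swap(items_);  // items_ is now empty
        }
        return local;
    }

private:
    std::mutex mutex_;
    std::vector<int> items_;
};

// The slow part (sorting) runs on the private copy with no lock held.
std::vector<int> drain_sorted(WorkQueue& queue) {
    std::vector<int> local = queue.drain();
    std::sort(local.begin(), local.end());
    return local;
}
```

Other threads can keep calling `push()` while the sort runs; anything they add simply shows up in the next `drain()`.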

Johannes Overmann
  • I'm surprised by the "never sort a list" example. Why would that have any less of a time guarantee than any list operation/loop? – Mees de Vries Feb 09 '19 at 18:16
  • 1
    It is the weakest example in the list, I admit. Sorting has quadratic complexity in the worst case for useful algorithms like quicksort. And it usually requires you to lock the entire list, since quicksort, for example, does random accesses. It is advisable to do that outside of the critical section. Of course, if you know in advance that you will have just a limited number of elements, like 100 integer-like things, then it is OK to do that in the critical section. I reworded that less harshly from _never_ to _avoid_, since I agree it sounded too harsh. – Johannes Overmann Feb 09 '19 at 20:20
  • I did not specifically say this mechanism had to use the POSIX mutex -- only that it be a mutex-like mechanism. – Jay Elston May 17 '20 at 19:15
  • @Jay: My answer is not about posix mutexes. It is valid for all mutexes regardless of their implementation. A mutex is a concept and a contract between code pieces and programmers. Mutexes are very different from TCP sockets and other shared resources. Mutexes only exist because they enforce a hard rule. Without a hard rule (i.e. with timeouts) one does not need a mutex. – Johannes Overmann Jan 06 '21 at 21:06
5

You can't do that with only C++.

If you are using a POSIX system, it can be done. You'll have to trigger a SIGALRM signal that is unmasked only for the thread that will time out. In the signal handler, you'll have to set a flag and use longjmp to return to the thread code. In the thread code, setjmp returns a nonzero value only if the signal was triggered, so at that point you can throw the Timeout exception.

Please see this answer for how to do that.

Also, on Linux, it seems you can throw directly from the signal handler (so no longjmp/setjmp here).

BTW, if I were you, I would code the opposite. Think about it: you want to tell a thread "hey, you're taking too long, so let's throw away all the (long) work you've done so far so I can make progress". Ideally, you should have your long thread be more cooperative, doing something like "I've done A of an ABCD task, let's release the mutex so others can progress on A. Then let's check if I can take it again to do B, and so on." You probably want to be more fine-grained (have more mutexes on smaller objects, but make sure you always lock them in the same order) or use RW locks (so that other threads can use the objects if you're not modifying them), etc.
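The cooperative "do A, release, do B, release" idea could look roughly like this. It is only a sketch: `run_in_steps` is a hypothetical helper, and it assumes every step leaves the shared state consistent before it returns, because other threads may take the mutex between any two steps.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Runs each step under the lock, but releases the lock between steps so
// that other threads get a chance to acquire it.
void run_in_steps(std::mutex& m, const std::vector<std::function<void()>>& steps) {
    for (const auto& step : steps) {
        {
            std::lock_guard<std::mutex> lock(m);
            step();  // shared state must be consistent when this returns
        }
        // Only a hint to the scheduler; threads waiting on m may lock it now.
        std::this_thread::yield();
    }
}
```

The longest time any other thread has to wait is then bounded by the slowest single step rather than by the whole ABCD task.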

xryl669
1

Such an approach cannot be enforced because the holder of the mutex needs the opportunity to clean up anything which is left in an invalid state part way through the transaction. This can take an unknown arbitrary amount of time.

The typical approach is to release the lock when doing long tasks, and re-acquire it as needed. You have to manage this yourself, as everyone will have a slightly different approach.
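One way to release the lock around a long task is `std::unique_lock`, which can be unlocked and relocked explicitly. A minimal sketch, where the `Shared` struct and the `recompute` function are invented for illustration:

```cpp
#include <mutex>
#include <numeric>
#include <vector>

// Hypothetical shared state: an input list and a result derived from it.
struct Shared {
    std::mutex mutex;
    std::vector<int> input;
    long long result = 0;
};

void recompute(Shared& s) {
    std::unique_lock<std::mutex> lock(s.mutex);
    std::vector<int> snapshot = s.input;  // short: copy what the task needs
    lock.unlock();                        // the long task runs without the lock

    long long sum = std::accumulate(snapshot.begin(), snapshot.end(), 0LL);

    lock.lock();                          // short: publish the result
    s.result = sum;
}  // unique_lock releases the mutex here
```

Note that `input` may have changed while the lock was released; code that cares has to re-check after re-acquiring, which is the part everyone ends up handling differently.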

The only situation I know of where this sort of thing is accepted practice is at the kernel level, especially with respect to microcontrollers (which either have no kernel, or are all kernel, depending on who you ask). You can set an interrupt which modifies the call stack, so that when it is triggered it unwinds the particular operations you are interested in.

Cort Ammon
  • This pattern would let the application clean things up in the exception handler. The mutex would be locked until the lock_guard gets deleted. – Jay Elston Feb 10 '19 at 17:33
  • @jayelston if someone wraps the try/catch in a while loop, they can hold the resource indefinitely. This was actually a problem in C# with domains. You can forcefully terminate a domain. When you do so, it repeatedly starts throwing exceptions on all threads, trying to break free of such loops. It's actually dependent on a race case to guarantee the threads stop eventually. – Cort Ammon Feb 10 '19 at 19:16
1

"Condition" variables can have timeouts. This allows you to wait until a thread voluntarily releases a resource (with notify_one() or notify_all()), but the wait itself will time out after a specified fixed amount of time.

Examples in the Boost documentation for "conditions" might make this more clear.

If you want to force a release, though, you have to write the code that forces it. This could be dangerous. Code written in C++ can be doing some pretty close-to-the-metal stuff. The resource could be accessing real hardware and waiting on it to finish something. It may not be physically possible to end whatever the program is stuck on.

However, if it is possible, then you can handle it in the thread in which the wait() times out.
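A small sketch of the waiting side, using `std::condition_variable::wait_for` with a predicate. The `Resource` class and its `acquire_for`/`release` methods are made-up names. Note that the timeout only bounds how long a waiter waits; it does not take anything away from the current owner.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical resource flag guarded by a mutex and a condition variable.
class Resource {
public:
    // Wait up to `timeout` for the resource to become free, then claim it.
    // Returns false if the owner did not release it in time.
    bool acquire_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!cv_.wait_for(lock, timeout, [this] { return !busy_; })) {
            return false;  // timed out; the caller decides how to react
        }
        busy_ = true;
        return true;
    }

    // Called voluntarily by the owner when it is done.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            busy_ = false;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool busy_ = false;
};
```

When `acquire_for` returns false, the waiting thread is the place to decide whether to retry, give up, or attempt whatever forced reset the resource supports.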

Dmitry Rubanovich
  • Nope. `notify_all()` and `notify_one()` do not release any mutex. And condition variables have no timeout for releasing the mutex. – Johannes Overmann Feb 09 '19 at 20:22
  • 1
    @JohannesOvermann no, there is no timeout for releasing a mutex, the timeout is for waiting on a predicate with wait_for and wait_until. But notify does allow a waiting thread to gain a mutex. I did say that what the OP is asking for (precisely) is not possible. But the effect they want is accomplished with a combination of a wait_*(with a timeout) and a notify. Rather than be pedantic, I tried to be informative. – Dmitry Rubanovich Feb 10 '19 at 06:17