I'm trying to implement semaphore with mutex to learn more about concurrent primitives and patterns, and how to write correct concurrent programs.
Here is the resource I found: http://webhome.csc.uvic.ca/~mcheng/460/notes/gensem.pdf
Solution #1 is labeled "incorrect" on page 9. I implemented the algorithm here. Running this program several times could result in frozen threads. I did some analysis and realized that the following sequence of operations on the d
mutex could happen:
// d was initialized as locked
d.unlock(); // d is unlocked
d.unlock(); // d is still unlocked (unaffected)
d.lock(); // d is locked
d.lock(); // d is still locked (deadlock)
This will cause the last d.lock()
to deadlock.
Solution #2 addressed this issue by securing the transition from the signaling thread to the awaking thread by reusing mutex m
. I implemented this version here.
In this solution, after d.unlock()
, m
remains locked, which means subsequent
post()
operations will be blocked until m
is unlocked. Then d.lock()
is
called following m.unlock()
, making sure d
is in locked state before
allowing subsequent post()
operations to run.
Although I understand how this solution address the issue, I'm having difficulty to argue its correctness in terms of other potential issues. Is there any general rules and guidelines we can follow to ensure, argue, or even prove the correctness of programs like this?
I want to ask this question because of the next 2 solutions from the note. I implemented solution #3 and solution #4, but both are having frozen threads when tested several times. I'm not sure if it's my implementation's problem or if the solution itself is incorrect.
I'd be appreciated if you can just analyze these solutions' correctness, but I'd be more than ever like to learn a way to reason about any algorithms of this kind, and verify their correctness.