Here's some code to generate a deadlock:

import java.util.*;
import java.util.concurrent.CyclicBarrier;

public final class Main {
  
  public static void main(String [] args) throws Exception {
    final Object lockA = new Object();
    final Object lockB = new Object();
    final CyclicBarrier barrier = new CyclicBarrier(2);

    Thread thread1 = new Thread(() -> {
      synchronized(lockA) {
        try {
          barrier.await();
        } catch (Exception e) { // don't do this in production code :)
        }
        synchronized(lockB) {
          System.out.println("Got A then B");
        }
      }
    });
    thread1.start();
    Thread thread2 = new Thread(() -> {
      synchronized(lockB) {
        try {
          barrier.await();
        } catch (Exception e) { // don't do this in production code :)
        }
        synchronized(lockA) {
          System.out.println("Got B then A");
        }
      }
    });
    thread2.start();
    
    thread1.join();
    thread2.join();
  }
  
}

Sending SIGQUIT to the Java process makes the JVM print a thread dump, which identifies the threads involved in the deadlock:

kill -s QUIT $java_pid
less out.txt
...
Found one Java-level deadlock:
=============================
"Thread-0":
  waiting to lock monitor 0x00007fc8b1f1cb00 (object 0x000000070fe1a5b8, a java.lang.Object),
  which is held by "Thread-1"

"Thread-1":
  waiting to lock monitor 0x00007fc8b5306eb0 (object 0x000000070fe1a5a8, a java.lang.Object),
  which is held by "Thread-0"

Java stack information for the threads listed above:
===================================================
"Thread-0":
        at Main.lambda$main$0(Main.java:18)
        - waiting to lock <0x000000070fe1a5b8> (a java.lang.Object)
        - locked <0x000000070fe1a5a8> (a java.lang.Object)
        at Main$$Lambda$1/0x0000000800c00a08.run(Unknown Source)
        at java.lang.Thread.run(java.base@16.0.1/Thread.java:831)
"Thread-1":
        at Main.lambda$main$1(Main.java:30)
        - waiting to lock <0x000000070fe1a5a8> (a java.lang.Object)
        - locked <0x000000070fe1a5b8> (a java.lang.Object)
        at Main$$Lambda$2/0x0000000800c00c30.run(Unknown Source)
        at java.lang.Thread.run(java.base@16.0.1/Thread.java:831)

Found 1 deadlock.
...

Given this information, how can I kill or interrupt one of the threads so that the application can continue? I understand that the state could be corrupted, but I would prefer that the application makes progress. For instance, when MySQL detects a deadlock, it kills one of the queries.

I understand there are ways to kill threads from within the code, as discussed in a similar question: How to kill deadlocked threads in Java?. However, it's too late for that here, because the application is already deployed. My only option is the CLI until we can get a new version of the application deployed. I don't have JMX enabled, though I might be able to enable it if it's the only way.
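
For context on why interrupting is unlikely to help here: `Thread.interrupt()` only sets a flag, and a thread that is blocked entering a `synchronized` block does not react to it. A minimal sketch that shows this behavior (the class and method names are made up for illustration and are not part of the application above):

```java
public final class InterruptBlockedDemo {

  // Interrupts a thread while it is blocked on a monitor and reports its state afterwards.
  static Thread.State stateAfterInterrupt() throws InterruptedException {
    final Object lock = new Object();
    Thread blocked = new Thread(() -> {
      synchronized (lock) {
        // Only reached once the caller releases the monitor.
      }
    });
    blocked.setDaemon(true);

    synchronized (lock) {
      blocked.start();
      // Wait until the thread is actually stuck on the monitor.
      while (blocked.getState() != Thread.State.BLOCKED) {
        Thread.sleep(10);
      }
      blocked.interrupt();  // sets the interrupt flag, nothing more
      Thread.sleep(100);    // give the interrupt time to have an effect, if it had any
      return blocked.getState();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("State after interrupt: " + stateAfterInterrupt());
  }
}
```

`java.util.concurrent.locks.ReentrantLock.lockInterruptibly()` does respond to interrupts, but moving to it requires a code change, which is exactly what I can't ship right now.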

joseph
  • Your only viable option is to kill and restart the web container. Not even `Thread.stop()` will reliably unjam a deadlocked thread. – Stephen C Sep 13 '21 at 01:01
  • That was the answer I was hoping not to hear. However, if that's the case, I guess I could continuously restart JVM when there's a deadlock until a fix is deployed. – joseph Sep 13 '21 at 01:04
  • I guess you could :-) See also https://stackoverflow.com/questions/18908012 – Stephen C Sep 13 '21 at 01:07
  • (FWIW - the reason that MySQL can handle this is ... transaction rollback. But Java is not transactional at the JVM level. For good reasons.) – Stephen C Sep 13 '21 at 01:11
  • Ah right. If I were to kill one of the threads, the state spread over multiple fields it was in the middle of modifying would be incomplete, and not undone. – joseph Sep 13 '21 at 01:12
  • Partly. And another thing is that stopping a thread means that it won't send any of the notifies (etc) that it should have sent. And that leaves other threads stuck waiting for things that won't ever happen. Even though you have (in a sense) broken the deadlock. This kind of thing is part of why `Thread.stop()` was deprecated ... in the 1990's. – Stephen C Sep 13 '21 at 01:14
  • Anyway, the Big Red Restart Button is the only practical solution. Tell 'em you need the fix deployed "soonest". – Stephen C Sep 13 '21 at 01:16
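
The "restart on deadlock" idea from the comments could be automated if a small code change can ship ahead of the real fix. The sketch below is one possible approach, not something taken from the application above: it polls `ThreadMXBean.findDeadlockedThreads()` and halts the JVM when a deadlock shows up, on the assumption that an external supervisor (systemd, a container orchestrator, a wrapper script) restarts the process. The class name, method names, poll interval, and exit code are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public final class DeadlockWatchdog {

  private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

  // Names of the threads currently stuck in a deadlock cycle; empty if there is none.
  static List<String> deadlockedThreadNames() {
    List<String> names = new ArrayList<>();
    long[] ids = THREADS.findDeadlockedThreads(); // null when no deadlock exists
    if (ids == null) {
      return names;
    }
    for (ThreadInfo info : THREADS.getThreadInfo(ids)) {
      if (info != null) { // null if the thread died in the meantime
        names.add(info.getThreadName());
      }
    }
    return names;
  }

  // Starts a daemon thread that polls for deadlocks and halts the JVM when it finds one.
  static Thread startWatchdog(long pollMillis, int exitCode) {
    Thread watchdog = new Thread(() -> {
      while (true) {
        List<String> names = deadlockedThreadNames();
        if (!names.isEmpty()) {
          System.err.println("Deadlock detected among " + names + ", halting");
          // halt() skips shutdown hooks, which could themselves block on the held locks.
          Runtime.getRuntime().halt(exitCode);
        }
        try {
          Thread.sleep(pollMillis);
        } catch (InterruptedException e) {
          return; // treat an interrupt as a request to stop watching
        }
      }
    }, "deadlock-watchdog");
    watchdog.setDaemon(true);
    watchdog.start();
    return watchdog;
  }

  public static void main(String[] args) {
    System.out.println("Deadlocked threads right now: " + deadlockedThreadNames());
  }
}
```

Calling something like `DeadlockWatchdog.startWatchdog(5_000, 1)` early in startup would be enough to arm it. This does nothing to repair the corrupted-state problem described above. It only replaces a manual restart with an automatic one.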
