8

The service I'm working on uses a Future to run multiple tasks in parallel; each task can take up to a minute to complete. However, it seems the external lib is buggy, since in some occasions (2% of the time) it doesn't return. In those cases I would like to give a 2-minute wait time, and if it hasn't returned, I would like to kill the future and re-schedule again later (it will succeed eventually).

How do I kill the Future?

  private void run() {
    ExecutorService queue = Executors.newFixedThreadPool(1);

    Future<Integer> f = queue.submit(new MyTask());
    Thread.sleep(500);

    try {
      Integer r = f.get(120, TimeUnit.SECONDS);
    } catch (InterruptedException | ExecutionException | TimeoutException e) {
      e.printStackTrace();
      f.cancel(true);
    }

    // Bad future still running here and I need it dead.

  }

  private class MyTask implements Callable<Integer> {
    private ExternalLibrary extlib = new ExternalLibrary();

    @Override
    public Integer call() throws Exception {
      // step 1 - do a few things

      // step 2 - process data
      Integer val = this.extlib.doSomething(); // here's the problem!

      // step 3 - do other things

      return val;
    }

  }

I can see the external lib running and consuming CPU (for 24 hours)... doing nothing. It's a simple task that should never take more than 60 seconds to complete its work.

So far, I'm killing the whole JVM once a day to get rid of this issue, but I'm sure there must be a better way. I wonder how app servers (Tomcat, JBoss, Weblogic, etc.) do it with rogue processes.

Joe DiNottra
  • 836
  • 8
  • 26
  • 1
    Cancellation of futures is cooperative, and given your task is not actually cancellable, cancellation does nothing. – Mark Rotteveel Mar 12 '22 at 16:46
  • Since the external library is not cooperating, what options do I have? – Joe DiNottra Mar 12 '22 at 16:48
  • 4
    Given the task is even ignoring interruption, it's either a native library (interruption is only within Java), or doing something that doesn't check for interruption, or it was written to actively ignore interruptions (i.e. continue on as if nothing happened). I don't think you can do anything in this case beyond to switching to another library or filing a bug report with whoever maintains this library. – Mark Rotteveel Mar 12 '22 at 16:52
  • You might be looking for timeouts. Have a look at https://stackoverflow.com/questions/71322315/java-8-mulithreading-how-to-achieve-parallelism-alongwith-timeout-for-individua – Sachin Mar 12 '22 at 17:23
  • @Sachin That wouldn't stop the external library from doing work, it will complete the future exceptionally, but the library will continue doing whatever it's doing. – Mark Rotteveel Mar 12 '22 at 17:29
  • @MarkRotteveel `Thread.stop()` is doing the trick! I read it shouldn't be used, but it seems it's the best solution [so far] to this problem. – Joe DiNottra Mar 12 '22 at 19:14
  • @JoeDiNottra you should try Thread.interrupt() first; if that doesn't work then Thread.stop() can be used as a last resort – Charlie Mar 12 '22 at 19:24
  • 1
    @Charlie The OP is using interruption, that is what the `true` in `cancel(true)` does. – Mark Rotteveel Mar 13 '22 at 09:09
  • 1
    @JoeDiNottra You really shouldn't use `Thread.stop`, it can cause deadlocks and other problems. – Mark Rotteveel Mar 13 '22 at 09:09

6 Answers6

4

Even if you could kill the future hanging in the buggy library, this does likely not solve your problem. The library might still have acquired some resource which will not be properly clean up. This might be memory allocations, open file handles or even monitors leaving some internal data structures in an inconsistent state. Eventually you will likely be back at the point where you have to restart your JVM.

There's basically two options: Fix or isolate it.

  1. Fix: try to get the library fixed. If this is not possible,
  2. isolate: isolate the library into a external service your application depends on. E.g. implement a REST API for calling the library and wrap everything up into a Docker image. Automate restarting of the Docker container as needed.
michid
  • 10,536
  • 3
  • 32
  • 59
  • 1
    Option #2 sounds Interesting. Maybe I could even run a separate JVM just for that process. It could be overkill, but could add stability. – Joe DiNottra Mar 19 '22 at 01:58
2

As others have mentioned, stopping a Future is cooperative, meaning, the thread running async must respond to cancellation from the waiting thread. If the async task isn't cooperative simply invoking shutdown or shutdownNow won't be enough as the underlying TPE will just interrupt the threads.

If you have no control over extlib, and extlib is not cooperative, I see two options

  1. You can stop the thread currently running. This can cause issues if the thread being stopped currently is holding a lock or some other resource. It can lead to interesting bugs that could be hard to dissect.
  2. This could take some more work, but you could run the async task as a separate process entirely. The TPE can still run the process and, on interruption, can destroy the process. This obviously has more interesting issues like how to load the process with required input.
John Vint
  • 39,695
  • 7
  • 78
  • 108
1

If I understand your requirement correctly & based on your requirement (i.e. 1 thread), you can look for shutting down executorservice in 2 phases, code is available in java doc of executorservice:

try {
      Integer r = f.get(120, TimeUnit.SECONDS);
    } catch (InterruptedException | ExecutionException | TimeoutException e) {
      e.printStackTrace();
      //f.cancel(true);  you can omit this call if you wish. 
      shutdownAndAwaitTermination(queue);
    } ... //remaining method code


void shutdownAndAwaitTermination(ExecutorService pool) {
   pool.shutdown(); // Disable new tasks from being submitted
   try {
     // Wait a while for existing tasks to terminate
     if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
       pool.shutdownNow(); // Cancel currently executing tasks
       // Wait a while for tasks to respond to being cancelled
       if (!pool.awaitTermination(60, TimeUnit.SECONDS))
           System.err.println("Pool did not terminate");
     }
   } catch (InterruptedException ie) {
     // (Re-)Cancel if current thread also interrupted
     pool.shutdownNow();
     // Preserve interrupt status
     Thread.currentThread().interrupt();
   }
 }

Please read documentation about shutdown() , shutdownNow() how they behaves because it clearly mentions there is no 100% guarantee that tasks / executorservice will get stopped if its running.

Ashish Patil
  • 4,428
  • 1
  • 15
  • 36
  • I just tried this option and unfortunately `ExecutorService.shutdown()` and `ExecutorService.shutdownNow()` don't seem to have any effect on the running task. It keeps on running forever. Thank you anyway. – Joe DiNottra Mar 12 '22 at 22:30
  • It seems to me that your call to the external library is starting its own Thread pool. Therefore, you don't have control over it. Could you share the name of that extLib and what you are actually attempting to achieve? – 3Fish Mar 17 '22 at 08:21
  • It doesn't need to start its own thread pool if nothing that process does responds to interruption. But that is a good point, if it does start its own thread pool the OP is pretty much out of luck. – John Vint Mar 18 '22 at 19:16
1

Unfortunately if the external library is not co-operating to thread interrupts, there is nothing you can do to kill the Thread running the task managed by the ExecutorService.

An alternative that I can think of is to run the offending code as a separate process. Using ProcessBuilder and Process, your task can effectively control (or) even kill the offending process after a timeout (https://docs.oracle.com/javase/9/docs/api/java/lang/Process.html#destroyForcibly--).

Also see https://docs.oracle.com/javase/9/docs/api/java/lang/ProcessBuilder.html

0

@joe That is correct. Unless you have control over the thread and inside the thread you can't kill it.

this.extlib.doSomething();

if this line starts a thread then we need to get hold of that thread to kill it as we don't have reference to stop it.

Sachin
  • 51
  • 6
0

In your code, the call:

this.extlib.doSomething()

must be synchronous, because if it is not, the code lost sense. With that assumption, you can try:

ExecutorService executor = Executors.newSingleThreadExecutor();
Future<Integer> future = executor.submit(new MyTask());
try {
    future.get(120, TimeUnit.SECONDS);
} catch (InterruptedException | ExecutionException e) {
    e.printStackTrace();
} catch (TimeoutException e) {
    future.cancel(true);
} finally {
    executor.shutdownNow();
}

If this doesn't stop the doSomethig work is because this doSomething function is opening other threads to do the work. In that case, maybe you can check the threads that are running with:

Thread.getAllStackTraces()

And try to kill the right one...

David B.F.
  • 231
  • 3
  • 8