Terminate a thread which is running native code

Question

In my application I have a wrapper over some native code, which is called via a JNI bridge. This native code needs to be executed in a separate thread (parallel processing). However, the problem is that the code sometimes "hangs", so the thread needs to be terminated "by force". Unfortunately I haven't found any "delicate" method to do so: the general advice is to tell the code in a thread to exit gracefully, but I can't do that with this native code (which, on top of everything, is 3rd-party code).

I use the Java Concurrent API for task submission:

Future<Integer> processFuture = taskExecutor.submit(callable);

try {
    result = processFuture.get(this.executionTimeout, TimeUnit.SECONDS).intValue();
}
catch (TimeoutException e) {
    // How to kill the thread here?
    throw new ExecutionTimeoutException("Execution timed out (max " + this.executionTimeout / 60 + "min)");
}
catch (...) {
    ... exception handling for other cases
}

Future#cancel() will only interrupt the thread, but it will not terminate it. So I used the following trick:

class DestroyableCallable implements Callable<Integer> {

    private Thread  workerThread;

    @Override
    public Integer call() {
        workerThread = Thread.currentThread();

        return Integer.valueOf(JniBridge.process(...));
    }

    public void stopWorkerThread() {
        if (workerThread != null) {
            workerThread.stop();
        }
    }
}

DestroyableCallable callable = new DestroyableCallable();

Future<Integer> processFuture = taskExecutor.submit(callable);

try {
    result = processFuture.get(this.executionTimeout, TimeUnit.SECONDS).intValue();
}
catch (TimeoutException e) {
    processFuture.cancel(true);
    // Dirty:
    callable.stopWorkerThread();

    ThreadPoolTaskExecutor threadPoolTaskExecutor = (ThreadPoolTaskExecutor) taskExecutor;

    logger.debug("poolSize: " + threadPoolTaskExecutor.getPoolSize() + ", maxPoolSize:"
                    + threadPoolTaskExecutor.getMaxPoolSize() + ", activeCount:"
                    + threadPoolTaskExecutor.getActiveCount());

    throw new ...;
}
catch (...) {
    ... exception handling for other cases
}

The questions/problems with this code:

1. Is this, in general, the right way to do it? Are there any more elegant alternatives?
2. activeCount on the task executor is not decreased, so the task executor still "thinks" that the thread is running.
3. I had to add the workerThread != null check to the stopWorkerThread() method, as this variable turned out to be null in some cases. I can't understand what these cases are...

Notes:

Native code does not consume file descriptors (sockets). Everything is passed to it as a block of data and returned the same way.
Native code is CPU-intensive. Even though it is guaranteed to terminate, it may take a long time.

Bounty edit: The suggestion to revisit the native code is clear, please do not offer it in your reply. I need a pure-Java solution / workaround.

score 13 · Accepted Answer · answered Jan 13 '12 at 12:08

Java has poor options for forced thread termination: as far as I know, there is only the ancient and deprecated Thread.stop(). There is no option for safe thread termination, which is why .stop() was deprecated (and why JVM implementors are even allowed not to implement it).

The reason is that all threads inside an app share memory and resources. If you force termination of a thread at some arbitrary point, you can't prove that the terminated thread did not leave some shared memory or resources in an inconsistent state. You can't even (in general) guess which resources are possibly dirty, because you don't know exactly at which point the thread was stopped.

So, if you want some threads of your app to be interruptible, the only solution is to provide, at the design phase, some notion of "savepoints": locations in the target thread's code which are guaranteed not to mutate shared state, so it is safe for the thread to exit there. That is exactly what the Thread.stop() javadocs tell you: the only way to interrupt a thread safely is to design the thread's code so that it can itself respond to some kind of interrupt request, such as a flag that the thread checks from time to time.
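A minimal sketch of such a savepoint flag, assuming the work can be split into bounded units (which is exactly what the third-party native call in the question does not allow). The names CooperativeWorker, requestStop and doOneUnitOfWork are hypothetical and exist only for illustration:

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical worker whose loop has one savepoint per iteration. */
class CooperativeWorker implements Runnable {

    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    /** Asks the worker to exit at its next savepoint; safe to call from any thread. */
    void requestStop() {
        stopRequested.set(true);
    }

    @Override
    public void run() {
        // Savepoint: between iterations no shared state is half-updated,
        // so this is the only place where the worker agrees to exit.
        while (!stopRequested.get()) {
            doOneUnitOfWork();
        }
    }

    /** Placeholder for one bounded chunk of the real computation. */
    private void doOneUnitOfWork() {
        Thread.yield();
    }
}

class CooperativeWorkerDemo {
    public static void main(String[] args) throws InterruptedException {
        CooperativeWorker worker = new CooperativeWorker();
        Thread thread = new Thread(worker, "cooperative-worker");
        thread.start();

        Thread.sleep(100);     // let the worker run for a while
        worker.requestStop();  // a request, not a forced stop
        thread.join(1000);     // give the worker time to reach a savepoint

        System.out.println("still alive: " + thread.isAlive());
    }
}
```

The worker decides when it is safe to exit; the caller only asks. A single long-running JNI call never returns to the loop condition, so this pattern cannot help while the thread is inside the native code.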

What I'm trying to tell you is: you can't do what you're asking about using Java threading/concurrency. What I would suggest (it was already suggested here earlier) is to do the job in a separate process. Forcibly killing a process is much safer than killing a thread, since 1) processes are much more isolated from each other and 2) the OS takes care of much of the cleanup after process termination. Killing a process is not completely safe, since there are some kinds of resources (files, for example) which are not cleaned up by the OS by default, but in your case it seems to be safe.

So you design a small separate application (maybe even in Java if your third-party lib does not provide other bindings, or even as a shell script) whose only job is to do your computation. You start such a process from the main app, give it the job, and start a watchdog. If the watchdog detects a timeout, it kills the process forcibly.

This is only a draft of a solution. You can implement some kind of process pool if you want to improve performance (starting a process may take time), and so on...
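The watchdog part of this draft could look roughly like the sketch below. It assumes Java 8 or later (Process.waitFor with a timeout and Process.destroyForcibly); the class name ProcessWatchdog and the method runWithTimeout are made up for the example, and passing the job data to the worker and reading its result back are left out:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

class ProcessWatchdog {

    /**
     * Starts the command and waits up to the given timeout for it to finish.
     * Returns the exit code, or an empty OptionalInt if the process had to be killed.
     */
    static OptionalInt runWithTimeout(List<String> command, long timeout, TimeUnit unit)
            throws IOException, InterruptedException {
        Process worker = new ProcessBuilder(command)
                .inheritIO()  // sketch only: a real worker would exchange data over its streams
                .start();

        if (worker.waitFor(timeout, unit)) {
            return OptionalInt.of(worker.exitValue());  // finished in time
        }

        worker.destroyForcibly();  // timeout: the OS reclaims the process's memory
        worker.waitFor();          // wait until the process is really gone
        return OptionalInt.empty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: ProcessWatchdog <command> [args...]");
            return;
        }
        OptionalInt exit = runWithTimeout(Arrays.asList(args), 30, TimeUnit.SECONDS);
        System.out.println(exit.isPresent()
                ? "exit code " + exit.getAsInt()
                : "killed after timeout");
    }
}
```

Unlike a stopped pool thread, a killed process leaves nothing behind in the parent JVM, so the executor that called runWithTimeout sees its task finish normally.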

Thanks, fair notice. One small remark on your post: as the launched thread is executing native code, there are no locks (unless you code this explicitly via JNI) which are shared with the Java part. So the only risky resource is open file descriptors (which is not the case for me, as all necessary data is passed as a blob and taken back as a string). The only problem is how to tell `ThreadPoolTaskExecutor` that the thread really died. I've looked through the code: it should really capture this case. – dma_k, Jan 13 '12 at 18:47
No locks -- OK, maybe. But what about memory leaks in native code? Can you be sure the native lib was designed and coded safely enough to gracefully handle abnormal termination? Does it release allocated memory on all exit paths, including abnormal ones? What will you do with memory leaked in native code? Same question about OS-level resources: what if the native lib allocates a kernel-level lock? Starts several additional threads? – BegemoT, Jan 13 '12 at 20:16
I agree concerning memory leaks on the heap (as [threads share the same heap](http://stackoverflow.com/questions/1665419)). I am not an expert, but I don't think that a user thread can be interrupted in kernel mode; otherwise killing the thread would damage e.g. the filesystem if the thread was killed while writing something to disk. So unreleased kernel-level locks would be very sad. For the rest I fully support your position: a process is safer. Moreover, the JNI bridge was made by me from a standalone process, so it is no effort to revert. – dma_k, Jan 13 '12 at 22:42
I think the most appropriate solution would be to implement something like Apache does: a master process should launch several subprocesses (e.g. 2), each of which should process X requests and die (it will be re-spawned by the master process). – dma_k, Jan 13 '12 at 23:29

Tudor · Answer 2 · 2012-01-13T19:58:09.940

2

Definitely an ugly hack you have here...

First of all, thread pool threads are not meant to be tampered with individually and should generally be left to run until completion. They especially should not be stopped with Thread.stop(), which is not recommended even for normal threads.

The use of Thread.stop(), as I've said, is never encouraged and generally leaves the thread in an inconsistent state, which is probably the cause for the thread pool not seeing the thread as "dead". It might not even kill it at all.

Any idea why the native code hangs? I think the root of your problem is here, not the thread stopping part. Threads should normally run until completion whenever possible. Perhaps you can find a better implementation that works correctly (or implement something different if you wrote it).

Edit: As for point 3, it's probable that you need to declare the reference to the current thread as volatile since you are assigning it in one thread and reading it in another:

private volatile Thread workerThread;

Edit 2: I'm starting to get the idea that your JNI code only does numeric computations and does not open any handles that may remain in an inconsistent state should the thread get abruptly killed. Can you confirm this?

In that case, allow me to go against my own advice and tell you that in this case you may safely kill the thread with Thread.stop(). However, I do recommend using a separate thread instead of a thread pool thread, in order to avoid leaving the thread pool in an inconsistent state (as you've mentioned, it does not see the thread as dead). It's also more practical: you don't need all those tricks to get the thread to stop itself, because you can just call stop() directly on it from the main thread, unlike with thread pool threads.
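A rough sketch of the dedicated-thread variant, with hypothetical names (DedicatedThreadRunner, runWithTimeout). The forced stop itself is left as a comment: on Java 20 and later Thread.stop() throws UnsupportedOperationException, so the sketch only shows how to own the thread and detect the timeout, and it marks the worker as a daemon so an abandoned computation cannot keep the JVM alive:

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class DedicatedThreadRunner {

    /**
     * Runs the task on its own thread and waits up to timeoutMillis (must be > 0).
     * Returns the task's result, or throws TimeoutException if it is still running.
     * If the task itself throws, the worker dies and null is returned.
     */
    static <T> T runWithTimeout(Supplier<T> task, long timeoutMillis)
            throws InterruptedException, TimeoutException {
        AtomicReference<T> result = new AtomicReference<>();

        // The caller holds the Thread reference directly, so no field has to be
        // published from inside the task (and nothing can be observed as null).
        Thread worker = new Thread(() -> result.set(task.get()), "native-worker");
        worker.setDaemon(true);
        worker.start();

        worker.join(timeoutMillis);
        if (worker.isAlive()) {
            // This is where the answer would call worker.stop().
            // On Java 20+ that call throws UnsupportedOperationException.
            throw new TimeoutException("task still running after " + timeoutMillis + " ms");
        }
        return result.get();
    }

    public static void main(String[] args) throws Exception {
        Integer answer = runWithTimeout(() -> 6 * 7, 1000);
        System.out.println(answer);
    }
}
```

Without a working stop(), the timed-out thread keeps consuming CPU until the native call returns, which is the situation the question is trying to avoid.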

edited Jan 13 '12 at 19:58

answered Jan 10 '12 at 16:05

Tudor


**Tudor**, thanks for pointing out the 3rd point, +1 for that. Yes, I have an idea why the native code "hangs": the processing of data takes too long, and in this case I would like the data to be skipped (it is OK in my case to lose some data for the sake of processing speed). What you suggest is to review the native code. Anything else *besides* that? – dma_k Jan 11 '12 at 10:45
@dma_k: Do you have any guarantees that the native code will terminate eventually? – Tudor Jan 11 '12 at 15:22
Yes, I have guarantees. But it may take days :) – dma_k Jan 12 '12 at 17:28
@dma_k: So it's not really an option to just ignore the thread after the timeout and let it finish by itself – Tudor Jan 13 '12 at 12:53
I would be happy to, but the thread eats CPU, as it is computation-intensive. – dma_k Jan 13 '12 at 18:37
@dma_k: Does the code open any files, sockets or anything that may remain unclosed if you kill it abruptly? – Tudor Jan 13 '12 at 18:39
As far as I am aware: no. Read my last comment to [*BegemoT* answer](http://stackoverflow.com/a/8850345/267197). – dma_k Jan 13 '12 at 18:49
@dma_k: I've made another edit to my answer. Please let me know what you think. – Tudor Jan 13 '12 at 19:58

Vladimir Šor · Answer 3 · 2012-01-12T18:16:15.007

2

You could wrap this single call to JNI method into a separate Java application and then fork another java process with java.lang.Process. Then you can call Process.destroy() to destroy that process on the OS level.

Depending on your environment and other considerations, you might have to do some tricks to locate the java executable, especially if you are building redistributable software that could run on different platforms. Another issue for you would be the IPC, but that could be done using the Process's input/output streams.
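One possible way to handle both points, sketched under the assumption that the child should reuse the parent's JVM installation and classpath (read from the standard java.home and java.class.path system properties). ChildJvmLauncher, childJvmCommand and exchange are hypothetical names, and the one-line reply format is an assumption of this example:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

class ChildJvmLauncher {

    /** Builds a command line that starts mainClass in a new JVM like the current one. */
    static List<String> childJvmCommand(String mainClass) {
        // java.home points at the running JVM's installation directory.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        String classPath = System.getProperty("java.class.path");
        return Arrays.asList(javaBin, "-cp", classPath, mainClass);
    }

    /**
     * Writes the request to the child's stdin, closes it to signal end of input,
     * and returns the first line the child prints on stdout (null if it prints nothing).
     * readLine() blocks until the child replies or exits, so a timeout is still needed.
     */
    static String exchange(Process child, String request) throws IOException {
        try (Writer toChild = new OutputStreamWriter(
                child.getOutputStream(), StandardCharsets.UTF_8)) {
            toChild.write(request);
        }
        try (BufferedReader fromChild = new BufferedReader(
                new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8))) {
            return fromChild.readLine();
        }
    }
}
```

A caller would start the child with something like `new ProcessBuilder(ChildJvmLauncher.childJvmCommand("com.example.NativeWorker")).redirectError(ProcessBuilder.Redirect.INHERIT).start()`, where com.example.NativeWorker stands for whatever small main class wraps the JNI call, and would call Process.destroy() if no reply arrives in time.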

edited Jan 12 '12 at 18:16

answered Jan 11 '12 at 20:08

Vladimir Šor


Forcibly killing a process is even less elegant than stopping a thread. – Tudor Jan 11 '12 at 20:28
Forcibly killing a process is safe (the OS takes care of it), which is not true for a thread. So it _is_ a more elegant solution, and it is, I suppose, the only solution, provided the third-party JNI lib is known not to produce external resources, like files, which will not be cleaned up by the OS on process termination. – BegemoT Jan 12 '12 at 11:17

Jay Tomten · Answer 4 · 2012-01-06T19:01:09.343

1

Since you are dealing with 3rd party code I would suggest creating a native shell application that handles calling, tracking and terminating these threads. Better yet get this 3rd party to do it for you if your licence agreement offers any kind of support.

http://java.sun.com/docs/books/jni/html/other.html

edited Jan 06 '12 at 19:01

answered Jan 06 '12 at 18:01

Jay Tomten


If I understood correctly, you mean that Java threading gives up on handling such a case? – dma_k Jan 08 '12 at 18:22

score 0 · Answer 5 · answered Jan 11 '12 at 13:02

I won't repeat all the valuable advice given by Tudor. I'll just add an alternative architectural point of view: use a queuing-like mechanism to handle the communication between your main Java application and the native thread it launches. This thread may be a client of a broker, be notified if some special event occurs (termination), and act accordingly (stopping the long-running job). Of course this adds some complexity, but it is quite an elegant solution. Of course, if the native thread is not robust, it won't change anything about the overall robustness. One way to handle communication between the native thread and a broker would be to use a STOMP-like interface (many brokers, such as Apache ActiveMQ and MQ from Oracle, expose such an interface).

HTH Jerome

Should this broker be written in Java? If yes, could you provide a meta-logic for it? From what you've said, the broker should run in yet another separate thread to handle events... – dma_k, Jan 12 '12 at 17:30
