1

My program analyzes a large number of documents and occasionally gets a page that causes an infinite or very long loop. This is not analyzable in advance. I want to kill the particular page and continue with the next one (throwing away any results for the offending page). I have read SO answers such as this How to stop execution after a certain time in Java? and have written the following code:

// main program 
    private void runThread() throws InterruptedException {
        long timeout = 15000L;
        RunPageAnalyzer runPageAnalyzer = new RunPageAnalyzer(this);
        Thread t = new Thread(runPageAnalyzer); 
        long startTime = System.currentTimeMillis();
        t.start();
        while (t.isAlive()) {
            t.join(1000);
            long delta = System.currentTimeMillis() - startTime;
            LOG.debug("delta: "+delta);
            if (delta > timeout && t.isAlive()) {
                t.interrupt();
                t.join;
                break;
            }           
        }
    }

the method in the same class called by the thread

    void runActions() {
        // variable length calculation which should be abandoned if too long
    }

and the Runnable:

    class RunPageAnalyzer implements Runnable {
    private PageAnalyzerAction pageAnalyzerAction;
        public RunPageAnalyzer(PageAnalyzerAction pageAnalyzerAction) {
            this.pageAnalyzerAction = pageAnalyzerAction;
        }

        public void run() {
        try {
            pageAnalyzerAction.runActions();
        } catch (Exception e) {
            LOG.debug("Exception running thread ", e);
        }
    }

The output for a normal termination of runActions() seems OK:

    =========== page 1 =============
13863 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - pageActions: 24 on page 0
14863 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 1000
15864 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 2001
16864 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 3001
16975 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 3112
16975 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - finished page

but when the time limit is exceeded the process hangs in t.join().

    =========== page 2 =============
16975 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - pageActions: 24 on page 0
17976 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 1001
18976 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 2001
// ...
30976 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 14001
31976 [main] DEBUG org.xmlcml.graphics.control.page.PageAnalyzerAction  - delta: 15001

If I omit the t.join() then the process behaves as I would expect but I am worried that this might simply be building up huge numbers of threads that will be a problem later.

UPDATE: The answers so far have suggested that this is non-trivial (and I didn't find the standard Java examples/tutorials very helpful). The key point is that runActions() has to know explicitly that it might be interrupted. join() is not the primary problem because the threads just keep going.

FURTHER QUESTION: Do I have to insert Thread.currentThread().isInterrupted() in all the places in runActions() which I expect to be in unpredictably long loops?

Community
  • 1
  • 1
peter.murray.rust
  • 37,407
  • 44
  • 153
  • 217

3 Answers3

3

I assume here that pageAnalyzerAction.runActions(); can be interrupted (i.e. it handles interruptions by exiting fairly quickly).

If you are not comfortable with the low level thread API, you could use an executor and futures from the java.concurrent package to deal with thread management and time out policy for you:

  • the executor will handle the threads management with a thread pool, reusing them if necessary
  • the future returned on task submission can be queried with a timeout - if the task does not complete within the timeout, the future will throw a TimeOutException and you can then cancel your task

A contrived example would be:

//declare an executor  somewhere in your code, at a high level to recycle threads
ExecutorService executor = Executors.newFixedThreadPool(10); //number of threads: to be adjusted

private void runThread() throws InterruptedException {
    long timeout = 15000L;
    RunPageAnalyzer runPageAnalyzer = new RunPageAnalyzer(this);
    Future future = executor.submit(runPageAnalyzer);
    try {
        future.get(timeout, TimeUnit.MILLISECONDS);
    } catch (ExecutionException e) {
        //the runnable threw an exception: handle it
    } catch (TimeoutException e) {
        //the task could not complete before the timeout
        future.cancel(true); //interrrupt it
    }
}
assylias
  • 321,522
  • 82
  • 660
  • 783
  • +1 for reacting rapidly. I will try your answer. Does this mean I have to test for interruptions in runActions() (see @esaj) – peter.murray.rust Aug 05 '12 at 12:46
  • @peter.murray.rust yes - if you don't handle interruption in runActions(), you can't interrupt it, whatever the method you choose. – assylias Aug 05 '12 at 12:47
  • So assume that I have something like `number.factorize()` provided by a third-party library function it would have to have isThreadInterrupted() inside it? And if it didn't there is nothing I can do. (I have chosen an example which scales badly and which is unpredictable) – peter.murray.rust Aug 05 '12 at 12:59
  • @peter.murray.rust something like that I'm afraid. – assylias Aug 05 '12 at 13:09
  • thanks. Presumably if I fail to interrupt them they keep going and may or may not cause problems. And presumably stop() or suspend() - which seem to solve the problem in a pragmatic manner - are now considered harmful? – peter.murray.rust Aug 05 '12 at 13:23
  • @peter.murray.rust Yes - using stop is not a good idea and I'm not even sure that stop would work if interrupt doesn't. I don't know enough about what you are doing, but you could set a time out and when it is reached, move on to the next task. The unfinished task will continue executing in the background, using some cpu unnecessarily but at least you can move on... – assylias Aug 05 '12 at 13:27
  • @peter.murray.rust I suppose you have read the [stop javadoc](http://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#stop%28java.lang.Throwable%29) which gives more details as to why it is not a good idea to use it. – assylias Aug 05 '12 at 13:28
  • 1
    An interesting quote from [javadoc](http://docs.oracle.com/javase/7/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html) «It should be noted that in all situations where a waiting thread doesn't respond to Thread.interrupt, it wouldn't respond to Thread.stop either.» – Boris Treukhov Aug 05 '12 at 13:43
  • @assylias Yes, I've read them - which is why I wanted to check. – peter.murray.rust Aug 05 '12 at 14:03
  • 1
    @peter.murray.rust Anyway, in the worst case, if you can't do anything with the 3d party library and if every operation takes several seconds(so the process creation cost will be low), you can probably run them in separate processes and kill them if they do not respond - this should work because OS clean up resources properly when process terminates. Also I believe that in Linux separate processes are cheaper than in Windows, for example. – Boris Treukhov Aug 05 '12 at 14:09
  • Thanks, Boris. By 'process' do you mean java.exec()? – peter.murray.rust Aug 05 '12 at 14:22
  • 1
    Yes, Runtime.exec or whatever :-(, but I believe that the developers of the library(is it written in java or a native library?) should have provided some kind of cancellation policy, especially if it's a scientific calculation that may take a long time. – Boris Treukhov Aug 05 '12 at 14:36
2

It sounds like your runActions-method does not react to the interrupted-state of the thread being set. The latter join-call after calling interrupt has no timeout, and will wait indefinitely for the thread t to die. You should check for the interrupted state within your runActions-method and react by terminating the operation if the interrupted status is set (Thread.interrupted() returns true).

esaj
  • 15,875
  • 5
  • 38
  • 52
  • You are correct that runActions-method does not react to the interrupted-state of the thread being set. So I have to chooose a loop in runActions() and text everytime I go round. (The trouble is that I don't yet know where to look). +1 for what looks like a useful starting point – peter.murray.rust Aug 05 '12 at 12:45
2

There is something else which is not mentioned by the answers here. If you want to cancel I/O done from a thread, you just can't "cancel" it and expect the actual I/O to be cancelled. You basically have to handle the interruption exception in your "task" and handle it accordingly, maybe even close the socket connection. I have a small snippet dedicated to "stopping" tasks run using threads which you might find helpful (apologies if it has typos, it was written a long time back).

public class ThreadStopTest {

    public static void main(String[] args) {
        testSqlThreadStop();
    }


    private static void testSocketReadStop() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        SocketTask task = new SocketTask("http://www.yahoo.com", 80);
        Future<Integer> future = executor.submit(task);
        try {
            Integer result = future.get(1, TimeUnit.SECONDS);
            System.out.println("Computation complete; result: " + result);
        } catch(TimeoutException te) {
            future.cancel(true);
            task.cleanupAfterCancel();
            System.out.println("Computation cancelled");
        } catch(Exception e) {
            e.printStackTrace();
        }
        executor.shutdown();
    }

}


class SocketTask implements CleanableTask<Integer> {

    private final String host;

    private final int port;

    private Socket socket;

    public SocketTask(final String host, final int port) {
        this.host = host;
        this.port = port;
    }

    @Override
    public Integer call() throws Exception {
        InputStream in = null;
        // TODO: Actually update the count and cleanly handle exceptions
        int bytesRead = 0;
        try {
            this.socket = new Socket(this.host, this.port);
            in = this.socket.getInputStream();
            byte[] bytes = new byte[1000000];
            System.out.println("Started reading bytes");

            // The below behavior of waiting for a forceful close can be avoided
            // if we modify the FutureTask class (the default Future impl)
            // by passing in a CleanupHandler whose cleanup() method would be
            // invoked after invoking the `cancel` method or by making all 
            // your tasks implement a CancelledTask interface which has a 
            // `cleanupAfterCancel` method to do the same. :)
            try {
                in.read(bytes);
            } catch(SocketException se) {
                if(Thread.currentThread().isInterrupted()) {
                    System.out.println("All OK; this socket was forcefully closed");
                } else {
                    se.printStackTrace();   // something was seriously wrong
                }
            }
        } catch(Exception e) {
            e.printStackTrace();
        } finally {
            if(in != null)  in.close();
        }
        return Integer.valueOf(bytesRead);
    }

    @Override
    public void cleanupAfterCancel() {
        try {
            this.socket.close();
        } catch (IOException e) {
            e.printStackTrace();
        }        
    }

}
Sanjay T. Sharma
  • 22,857
  • 4
  • 59
  • 71