1

I've made three classes: Reader, Writer and the Main class. The Reader class uses a thread to listen for output from the Python interpreter standard output.

The Reader class:

package main;
import java.io.BufferedReader;
import java.io.IOException;
public class Reader implements Runnable {
    private volatile BufferedReader br;
    private volatile String output;
    private Thread thread;
    private Boolean stop;
    public Reader(BufferedReader br) {
        this.br = br;
        this.output = "";
        this.thread = new Thread(this);
        this.stop = false;
    }
    public void run() {
        while(!stop) {
            this.read();
        }
    }
    private synchronized void read() {
        try {
            if(br.ready()) {
                this.output = br.readLine();
                            System.out.println(this.output);
                notify();
                wait();
            }
        }
        catch(Exception error) {
            error.printStackTrace();
        }
    }
    public synchronized String getOutput() {
        try {
            wait();
        }
        catch(Exception error) {
            error.printStackTrace();
        }
        notify();
        return this.output;
    }
    public void startListening() {
        this.thread.start();
    }
    public void close() throws IOException {
        this.stop = true;
        this.br.close();
    }
}

And here's the Writer class:

package main;
import java.io.BufferedWriter;
import java.io.IOException;
public class Writer {
    private BufferedWriter bw;
    public Writer(BufferedWriter bw) {
        this.bw = bw;
    }
    public void write(String line) {
        try {
            this.bw.write(line);
            this.bw.newLine();
            this.bw.flush();
        }
        catch(Exception error) {
            error.printStackTrace();
        }
    }
    public void close() throws IOException {
        this.bw.close();
    }
}

Finally, the Main class looks as shown below.

package main;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
public class Main {
    public static void main(String[] args) throws IOException, 
                                        InterruptedException {
        ProcessBuilder processBuilder = new ProcessBuilder("/usr/bin/script",
                                                   "-qfc", "/usr/bin/python",
                                                                "/dev/null");
        Process process = processBuilder.start();
        InputStream inputStream = process.getInputStream();
        InputStreamReader isr = new InputStreamReader(inputStream);
        BufferedReader br = new BufferedReader(isr);
        OutputStream outputStream = process.getOutputStream();
        OutputStreamWriter osw = new OutputStreamWriter(outputStream);
        BufferedWriter bw = new BufferedWriter(osw);
        Writer writer = new Writer(bw);
        Reader reader = new Reader(br);
        reader.startListening();
        writer.write("2+2");
        System.out.print(reader.getOutput());
    }
}

The only result I get are two lines of output with the first one being repeated at the beginning of the second.

Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36).
Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36). [GCC 4.4.7] on linux2

It looks like the readLine method doesn't continue reading in spite of more output remaining. Why? I'd like my program to give the result as shown in the next few lines. Thank you in advance for your help.

Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 2+2
4
>>>
Luka
  • 1,718
  • 3
  • 19
  • 29
  • Perhaps the next line arrived before `getOutput` was called. – user253751 Aug 24 '14 at 13:10
  • When you reader goes into waiting state, how does it get out of it. Where does it get notified by the consumer? I think your implementation of the consumer/producer pattern is wrong. Particularly the use of wait/notify. – Edwin Dalorzo Aug 24 '14 at 13:24
  • @EdwinDalorzo: The `notify` method inside the `read` method of the `Reader` class notifies the main thread the output is ready. The `if` clause with a `ready` method before reading a line makes sure the writer has done its part. – Luka Aug 24 '14 at 14:38
  • @Luka, "The notify method inside the read method of the Reader class notifies the main thread the output is ready." No! It doesn't. The only thing that `notify()` does is to wake another thread that is in a `wait()` call on the same object. If no other thread is waiting at that exact moment, then `notify()` does not do anything at all. We call it a "lost notification" bug when thread A gets stuck in a wait() call _after_ thread B has already done the `notify()`. – Solomon Slow Aug 24 '14 at 15:13
  • @jameslarge: I see, the `notify` method wakes the other thread up, and then that thread waits to obtain the lock from the notifier. The notifier releases its lock by calling the `wait` method. Eventually, the thread which has been woken up moves on. I wasn't thorough enough in my comment. – Luka Aug 24 '14 at 15:38

2 Answers2

1

There is only one correct way to use wait() and notify().

First, there has to be an explicit condition. It could be a boolean variable, or a boolean function, but it has to be something that can be tested.

Second, there has to be a lock associated with the condition, and any code that can change the condition must only do so while the lock is locked.

Finally, the thread that wait()s, must do so in a loop.

It looks like this:

// lock is private so we don't have to worry about outside code messing with our synchronization.
private final Object lock = new Object();

// condition is private so we don't have to worry that outside code could break the contract
// (i.e., change the condition without locking the lock.)
private boolean condition = false;

// called by thread A
void waitForIt() {
    synchronized(lock) {
        while (! condition) {
            // At this point condition is false.  We don't have to worry about a "lost
            // notification" because this thread holds the lock, and will continue to 
            // hold it until it is in the wait() call, ready to be notified.
            lock.wait();
            // condition might *not* be true at this point because thread C might have
            // set it false in the interval between when thread B called notify(), and
            // when this thread finally re-acquired the lock and returned from wait().
        }
        // condition *is* guaranteed to be true here because we've tested it, and
        // we have the exclusive lock.
        doSomethingThatRequiresConditionToBeTrue();
    }
    // condition is no longer guaranteed true after leaving the synchronized block.
}

// called by thread B
void makeItHappen() {
    synchronized(lock) {
        if (! condition) {
            doSomethingThatMakesConditionTrue();
            condition = true;
            lock.notify();
        }
    }
}

// called by thread C
void reset() {
    synchronized(lock) {
        doSomethingThatMakesConditionFalse();
        condition = false;
    }
}

With this design, the notification can not be "lost" if thread B calls makeItHappen() before thread A calls waitForIt(). The waitForIt() function doesn't just blindly wait: It only waits when the condition is explicitly false.

Solomon Slow
  • 25,130
  • 5
  • 37
  • 57
  • How does the thread which calls the `makeItHappen` method release its lock? – Luka Aug 24 '14 at 15:58
  • @Luka, The thread releases the lock by leaving the `synchronized` block. – Solomon Slow Aug 24 '14 at 16:07
  • How would you write down the specific solution to my problem? – Luka Aug 24 '14 at 16:38
  • I would get rid of the reader thread. If I understand correctly, you want each reader.getOutput() call to block the caller (i.e., the main thread) until the Python program has produced another line of output. If I'm right, if that's what you want, then why not just let the main thread call br.readLine()? The only reason for having separate "reader" and "writer" threads would be if the Python program itself had separate threads that were producing output and consuming input independently of one another. – Solomon Slow Aug 24 '14 at 16:54
  • The problem with letting the `readLine` method operate inside the main thread is that it blocks the program because `readLine` waits for the `EOS` character which doesn't appear before closing the writer. The length of the output before closing the stream is unknown so I don't know how many times to call `readLine`. A `while` loop would not be an option since it would block the program. – Luka Aug 24 '14 at 17:41
  • @Luka, OK, so how is adding a second thread supposed to help? Do you expect readLine() to behave differently in the "reader" thread? If so, why? Maybe I misunderstand what your reader thread is supposed to do. To me, it looks as if it is supposed to operate in sync with the main thread---reading a line, waiting for the main thread to take it, reading another line,... Any time I see two threads operate in sync with one another, I look to see if I can simplify the program by using one thread do both jobs. (Maybe I can't if the threads are used to simulate co-routines). – Solomon Slow Aug 24 '14 at 18:05
  • By using a separate reading thread, the main thread can be used to add more write statements, provide new output for the reader and therefore unblock the program until the buffered writer is closed. That's what I hoped to implement. Putting everything inside the main thread will result in getting stuck inside a while loop reading until EOS has been encountered. Here's the first part of my problem with an answer that will help you understand what I'm currently after: http://stackoverflow.com/questions/25041529/how-to-run-python-interpreter-and-get-its-output-using-java. – Luka Aug 24 '14 at 18:28
1

As indicated by an answer below you've used synchronized, notify, wait, and volatile in ways that don't do what you think they're doing.

Rather than provide you with a completely fresh set of code I'm using to use logging statements to show you what your code is currently doing, then provide a minimum set of changes to make your current code work.

Moreover, looking at your CPython interactive shell output, as it's both CPython 2.6.6 and on a Red Hat distro I'm guessing you're using CentOS. Hence I've tested the below on:

  • a VirtualBox virtual machine running CentOS 6.5, provisioned via Vagrant,
  • CPython 2.6.6
  • OpenJDK Java 1.7.0 r65

So, let's get started. I sprinkled a lot of System.out.println calls everywhere; one for each function entry, function exit, and decision (e.g. if statements). I'm not going to post the full output but here is a summarised version:

Reader startListening() entry
Writer write() entry. line: 2+2
Writer write() exit.
Reader getOutput() entry
Reader getOutput() wait...
Reader run() entry
Reader read() entry
Reader read() exit
<snip - lots and lots of empty read() entry and exits...>
Reader read() br is ready, readLine...
Python 2.6.6 (r266:84292, Nov 22 2013, 12:16:22)
Reader read() notify...
Reader read() wait...
Reader getOutput() notify...
Reader getOutput() is returning: Python 2.6.6 (r266:84292, Nov 22 2013, 12:16:22)
Python 2.6.6 (r266:84292, Nov 22 2013, 12:16:22) Reader read() exit
Reader read() entry
Reader read() br is ready, readLine...
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Reader read() notify...
Reader read() wait...

This would be easier to read with a proper logging library that prepends thread names or IDs, but since we can guess what threads are involved let's just unpack this:

  • The main thread runs the process
  • The main thread writes "2+2" immediately to it, not waiting for it to be ready to read input.
  • Before the Reader thread has started the main thread calls into Reader getOutput(). This instance method is synchronized and hence the main thread, when it calls wait(), blocks, releases the Reader instance monitor lock, and waits for someone to notify it.
  • The Reader thread starts, as we can see with Reader run() entry.
  • Slowly the process catches up. CPython takes a long time to start! This explains all the empty Reader read() calls.
  • Eventually CPython starts up.
  • The Reader thread calls br.readLine(), prints out one line of output, then calls notify(). Interesting! What lock do we have that we're notifying on? The instance lock grabbed by the synchronized keyword, i.e. the Reader instance! We know that the main thread is the only thread currently wait()'ing on the Reader object as a monitor, so this call causes the main thread to wake up.
  • At this point note that the Reader thread has printed out one line of output. This is the first line of output from CPython, the "Python 2.6.6" line.
  • At the Reader read() notify point, ask yourself: what is the main thread doing? We notify()'d on the Reader instance, so surely the main thread is running? No! The JavaDoc for notify() is quite clear: "The awakened thread will not be able to proceed until the current thread relinquishes the lock on this object.". So at this point the Reader thread told the main thread "Wake up, but wait until I relinquish the Reader instance monitor".
  • The Reader thread calls wait(), so relinquishes the Reader instance monitor object and goes to sleep. This causes the main thread to resume execution.
  • The main thread calls notify(). By the same logic as before this wakes up the Reader thread but the Reader thread may only resume once the main thread relinquishes the Reader instance monitor.
  • The main thread's call to getOutput() is finished. This returns the first line "Python 2.6.6" again. This explains why this line is printed out twice.
  • The main thread stops running. End of story? No! Remember the Reader thread resumes execution. Since the Reader thread is not a daemon thread the JVM cannot exit.
  • The Reader thread prints out the next line of execution.
  • The Reader thread calls notify(). This call is inconsequential, since the main thread is no longer wait()'ing on this lock.
  • The Reader thread calls wait(). Since the main thread is no longer around to notify it back awake the program blocks forever.

That took a while to debug! What did we learn? Multithreaded programming is really hard. And we only had two threads! In this situation the complexity came about through the odd usage of synchronized methods and wait and notify on synchronized methods.

We know something else. There is no logical reason for this program to print out all the lines of output from a program. This is because threads have not been used effectively in this program.

Before I continue I should emphasise that writing an expect-like program in Java is non-trivial. I would strongly urge you to re-use existing libraries that do this, either expect directly or some other Java clone that someone has already written.

But let's say we wanted to fix the program as-is in the most direct way possible. I think this is a great learning opportunity for you as a exercise in using threads in Java. As a starting point ask yourself: what do you want? We want to be able to send "2+2" to a running Python proccess and not force the main thread to block waiting for output from the process. This suggests using a distinct thread whose sole responsibility is to read output from a stream. The main thread is then free to get a line of output from the reader thread whenever it wants to, or figure out if there's currently no output available.

How do we know when the process is finished? Certainly Reader will have no idea. It's just reading from a stream, from its perspective the stream is endless and if empty the reader will block reading it! To keep things simple let's say it is the Main class's responsibility to know when the process; when it does the Main class will read all the remaining output, print it, then exit.

From the way we've written this description it's clear that the Reader should just read as frequently and as much as possible. The Reader should have no idea that the Main thread exists. Instead, the Reader thread should put its results into some safe place. Then the Main thread can read from this safe place as frequently/infrequently or as much/as little as the Main thread wants. It seems like a simple and direct solution would be to store this output into a concurrent queue of some sort. The Reader thread would add to the head and the Main thread would pop from the tail! Using a concurrent queue delegates the nasty business of synchronising threads to a thread-safe collection, and neatly isolates the Reader thread from knowing who is reading from it.

Main.java:

package main;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;

public class Main {
    public static boolean isProcessFinished(Process process) {
        try {
            process.exitValue();
        } catch (IllegalThreadStateException e) {
            return false;
        }
        return true;
    }

    public static void printAllOutput(Reader reader) {
        String line;
        while ((line = reader.getOutput()) != null) {
            System.out.println(line);
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder processBuilder = new ProcessBuilder(
                "/usr/bin/script", "-qfc", "/usr/bin/python", "/dev/null");
        Process process = processBuilder.start();
        InputStream inputStream = process.getInputStream();
        InputStreamReader isr = new InputStreamReader(inputStream);
        BufferedReader br = new BufferedReader(isr);
        OutputStream outputStream = process.getOutputStream();
        OutputStreamWriter osw = new OutputStreamWriter(outputStream);
        BufferedWriter bw = new BufferedWriter(osw);
        try (
            Writer writer = new Writer(bw);
            Reader reader = new Reader(br);
        ) {
            reader.startListening();
            writer.write("2+2");
            writer.write("exit()");
            while(!isProcessFinished(process)) {
                // We don't block here. Maybe we're doing something
                // complicated at the same time. Maybe we're a web
                // server. Maybe...
                printAllOutput(reader);

                // Without this the java process hits 100% CPU.
                // This is because we're furiously checking if the
                // process has exited and looking for output. Java's
                // process handling API isn't so hot; consider using
                // another library?
                Thread.sleep(100);
            }

            // Yes, the process has finished, but there may still be output
            // to be read from the BufferedReader. Let's grab any dog-ends.
            printAllOutput(reader);
        }
    }
}

Reader.java:

package main;
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

public class Reader implements Runnable, AutoCloseable {
    private BufferedReader br;
    private Thread thread;
    private Queue<String> output = new ConcurrentLinkedQueue<>();
    private AtomicBoolean stop = new AtomicBoolean();

    public Reader(BufferedReader br) {
        this.br = br;
        this.thread = new Thread(this);
    }
    public void run() {
        System.out.println("Reader run() entry");
        while(!stop.get()) {
            try {
                // Why not just avoid the ready() call and block
                // on readLine()? We want the Main thread to be able
                // to stop the Reader thread at any time. Hence the
                // Reader thread needs to periodically check to see
                // if it has been requested to stop, and moreover
                // the Reader thread cannot block.
                while (br.ready()) {
                    output.add(br.readLine());
                }
            } catch (IOException e) {
                e.printStackTrace();
                stop.set(true);
            }
        }
        try {
            br.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Reader run() exit");
    }
    public String getOutput() {
        return output.poll();
    }
    public void startListening() {
        System.out.println("Reader startListening() entry"); 
        this.thread.start();
    }
    public void close() throws IOException {
        // Note that the close() method is called by the Main thread
        // and merely sets a flag. This flag will eventually be read
        // by the Reader thread which will then close the
        // BufferedReader.
        this.stop.set(true);
    }
}

Writer.java (unchanged except implements AutoCloseable)

package main;
import java.io.BufferedWriter;
import java.io.IOException;
public class Writer implements AutoCloseable {
    private BufferedWriter bw;
    public Writer(BufferedWriter bw) {
        this.bw = bw;
    }
    public void write(String line) {
        System.out.println("Writer write() entry. line: " + line);
        try {
            this.bw.write(line);
            this.bw.newLine();
            this.bw.flush();
        }
        catch(Exception error) {
            error.printStackTrace();
        }
        System.out.println("Writer write() exit.");
    }
    public void close() throws IOException {
        this.bw.close();
    }
}

And the output:

Reader startListening() entry
Writer write() entry. line: 2+2
Writer write() exit.
Writer write() entry. line: exit()
Writer write() exit.
Reader run() entry
Python 2.6.6 (r266:84292, Nov 22 2013, 12:16:22)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 4
>>>
Reader run() exit

As an exercise for the reader - how would you genuinely interact with the Python process? You would need a way of sending input to it, then waiting for the ">>>" from it to indicate that it is ready to receive more input. What if you execute a command, e.g. "print '>>>', that contains the prompt? Well then surely you would judge whether the interpreter is ready for input by pressing ENTER and then seeing if it outputs ">>>". But what if it's currently running a command and you press ENTER, surely that will distort the output?

Slowly but surely you're re-implementing expect. Fun exercise, I've done a limited version of expect in C but it is non-trivial.

Asim Ihsan
  • 1,501
  • 8
  • 18
  • 1
    It took me a while to grasp this completely. Thank you for such a detailed answer. – Luka Aug 29 '14 at 08:11
  • @Luka you're welcome. Multithreaded programming is really hard, try to stay as high level as possible. In this respect Java comes with lots of great libraries. If you have more questions please post comments / ask questions. I also learned a lot too from this question! – Asim Ihsan Aug 30 '14 at 09:20