It does sound like a producer/consumer construct. You seem to kind of have it the wrong way around, the IO should be driving the algorithm. Since you stay very abstract in what your code actually does, I'll need to stick to that.
So let's say your "distributed algorithm" works on data of type T
; that means that it can be described as a Consumer<T>
(the method name in this interface is accept(T value)
). Since it can run concurrently, you want to create several instances of that; this is usually done using an ExecutorService
. The Executors
class provides a nice set of factory methods for creating one, let's use Executors.newFixedThreadPool(parallelism)
.
Your "IO" thread runs to create input for the algorithm, meaning it is a Supplier<T>
. We can run it in an Executors.newSingleThreadExecutor()
.
We connect these two using a BlockingQueue<T>
; this is a FIFO collection. The IO thread puts elements in, and the algorithm instances take out the next one that becomes available.
This makes the whole setup look something like this:
void run() {
int parallelism = 4; // or whatever
ExecutorService algorithmExecutor = Executors.newFixedThreadPool(parallelism);
ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
// this queue will accept up to 4 elements
// this might need to be changed depending on performance of each
BlockingQueue<T> queue = new ArrayBlockingQueue<T>(parallelism);
ioExecutor.submit(new IoExecutor(queue));
// take element from queue
T nextElement = getNextElement(queue);
while (nextElement != null) {
algorithmExecutor.submit(() -> new AlgorithmInstance().accept(nextElement));
nextElement = getNextElement(queue);
if (nextElement == null) break;
}
// wait until algorithms have finished running and cleanup
algorithmExecutor.awaitTermination(Integer.MAX_VALUE, TimeUnit.YEARS);
algorithmExecutor.shutdown();
ioExecutor.shutdown(); // the io thread should have terminated by now already
}
T getNextElement(BlockingQueue<T> queue) {
int timeOut = 1; // adjust depending on your IO
T result = null;
while (true) {
try {
result = queue.poll(timeOut, TimeUnits.SECONDS);
} catch (TimeoutException e) {} // retry indefinetely, we will get a value eventually
}
return result;
}
Now this doesn't actually answer your question because you wanted to know how the IO thread can be notified when it can continue reading data.
This is achieved by the limit to the BlockingQueue<>
which will not accept elements after this has been reached, meaning the IO thread can just keep reading and try to put in elements.
abstract class IoExecutor<T> {
private final BlockingQueue<T> queue;
public IoExecutor(BlockingQueue<T> q) { queue = q; }
public void run() {
while (hasMoreData()) {
T data = readData();
// this will block if the queue is full, so IO will pause
queue.put(data);
}
// put null into queue
queue.put(null);
}
protected boolean hasMoreData();
protected abstract T readData();
}
As a result during runtime you should at all time have 4 threads of the algorithm running, as well as (up to) 4 items in the queue waiting for one of the algorithm threads to finish and pick them up.