
I am making a program that distributes a task. I have an ArrayList of communicator objects, like so:

    ArrayList<Workers>

I am working my way through a file, dividing it into fixed-size chunks and dispatching them to the various Workers. I am using an iterator to pass the chunks to the workers evenly. Usually there are more chunks than workers, so I need to loop around and around my workers. How do I do this? My current solution uses an iterator like so:

    private Worker getNextWorker() {
        // Wrap around: start a fresh iterator when we reach the end of the list.
        if (workerIterator == null || !workerIterator.hasNext())
            workerIterator = workers.iterator();

        return workerIterator.next();
    }

I synchronised this method, as well as the methods that modify the ArrayList, but that alone doesn't make it safe: another thread can still modify the collection between iterator calls. Therefore I synchronised the entire file-splitting process to make it one large atomic operation.

1) Have I missed anything?

2) Is there another, perhaps better, way I can get this loop-around functionality?

Luke De Feo
    You might be better off having the workers _take_ the tasks rather than giving the tasks to the workers. That said, there are several preexisting solutions in the JDK you can use for this sort of thing: `Executors` and `ExecutorService`, `CompletionService`... – Louis Wasserman Mar 18 '13 at 22:35
  • Why not an ExecutorService? – Lee Meador Mar 18 '13 at 22:39
  • The problem with that is that the files I am dealing with are big, and the reason I am splitting into chunks is to alleviate memory concerns. This would, however, be perfect if I could somehow limit the size of the pool of work so that it blocks more work coming in until a worker takes something out. Is this possible? – Luke De Feo Mar 18 '13 at 23:13

2 Answers


I would suggest that you not reinvent the wheel and instead use a BlockingQueue combined with a ThreadPoolExecutor for this purpose.
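
For illustration, a minimal sketch of that combination; the chunk strings, pool size, and task count are made-up assumptions, not anything from the answer. A fixed thread pool is a ThreadPoolExecutor backed by an unbounded LinkedBlockingQueue, so this version does not yet cap memory use; bounding the queue is discussed in the comments below and in the second answer.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ChunkPoolSketch {
        public static void main(String[] args) throws InterruptedException {
            // Fixed pool of 4 worker threads draining a work queue.
            ExecutorService pool = Executors.newFixedThreadPool(4);

            // Pretend each string is one fixed-size chunk read from the file.
            for (int i = 0; i < 20; i++) {
                final String chunk = "chunk-" + i;
                pool.execute(() -> System.out.println(
                        Thread.currentThread().getName() + " handling " + chunk));
            }

            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }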

Sergey_Klochkov
  • Thanks, I will look into this; I figured this must exist – Luke De Feo Mar 18 '13 at 22:52
  • How would I use a blocking queue to achieve this? One way I can think of is to have the workers in a queue: I take one out, read the chunk of the file, send it, and then place that worker at the back of the queue. It's not really how the design pattern goes, but it might work?? – Luke De Feo Mar 18 '13 at 23:09
  • You should put chunks of the file into the queue and process them via an Executor. [article about Blocking Queue and Executor](http://howtodoinjava.com/2012/10/20/how-to-use-blockingqueue-and-threadpoolexecutor-in-java/) – Sergey_Klochkov Mar 18 '13 at 23:12
  • Thanks, I am starting to realise now :) One problem, I'll copy and paste from above: the files I am dealing with are big and the reason I am splitting into chunks is to alleviate memory concerns. This would, however, be perfect if I could somehow limit the size of the pool of work so that it blocks more work coming in until a worker takes something out. Is this possible? – Luke De Feo Mar 18 '13 at 23:17
  • Yes, it is possible. Your queue will be created with some capacity, so there will be only part of your file in queue, while other parts will be waiting to be processed. You can view [this stackoverflow question for details](http://stackoverflow.com/questions/2001086/how-to-make-threadpoolexecutors-submit-method-block-if-it-is-saturated) – Sergey_Klochkov Mar 18 '13 at 23:22

You can start worker threads (without an executor) and have them take elements from a bounded blocking queue. While you read the file, you put chunks into the queue. When the queue is full, the call to put will block until a worker takes a task from the queue. If the queue is empty, the workers will wait until a task is put into the queue. When you are done with processing, you can interrupt the worker threads.
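
A minimal sketch of this approach, with a made-up worker count, queue capacity, and chunk strings standing in for real file chunks; the shutdown logic is deliberately crude:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class ManualWorkersSketch {
        public static void main(String[] args) throws InterruptedException {
            // Bounded queue: at most 8 chunks are held in memory at any time.
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(8);

            // Start the worker threads; each loops taking chunks until interrupted.
            Thread[] workers = new Thread[4];
            for (int i = 0; i < workers.length; i++) {
                workers[i] = new Thread(() -> {
                    try {
                        while (true) {
                            String chunk = queue.take();   // waits while the queue is empty
                            System.out.println(Thread.currentThread().getName()
                                    + " processing " + chunk);
                        }
                    } catch (InterruptedException e) {
                        // Interrupted by the main thread once all chunks are handed out.
                    }
                });
                workers[i].start();
            }

            // Producer: read the file chunk by chunk; put() blocks while the queue is full.
            for (int i = 0; i < 40; i++) {
                queue.put("chunk-" + i);
            }

            // Crude shutdown for the sketch: wait for the queue to drain, then interrupt.
            while (!queue.isEmpty()) {
                Thread.sleep(100);
            }
            for (Thread w : workers) {
                w.interrupt();
                w.join();
            }
        }
    }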

Alternatively, you may use a ThreadPoolExecutor with a bounded blocking queue and CallerRunsPolicy. This way, if the queue is not full, tasks will be submitted for execution. If the queue is full, the caller thread will execute the task itself (which gives the workers time to catch up). With this approach you will have at most number_of_threads + queue_capacity chunks in flight, but some worker threads may be idle while the main thread is processing.
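
A sketch of this CallerRunsPolicy variant, again with illustrative numbers (4 threads, a queue capacity of 8, 40 dummy chunks) rather than anything taken from the question:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class CallerRunsSketch {
        public static void main(String[] args) throws InterruptedException {
            // 4 worker threads and a queue of at most 8 pending chunks. When the queue
            // is full, CallerRunsPolicy runs the task in the submitting thread, which
            // pauses the file reader until the workers catch up.
            ThreadPoolExecutor executor = new ThreadPoolExecutor(
                    4, 4, 0L, TimeUnit.MILLISECONDS,
                    new ArrayBlockingQueue<Runnable>(8),
                    new ThreadPoolExecutor.CallerRunsPolicy());

            for (int i = 0; i < 40; i++) {
                final String chunk = "chunk-" + i;   // stand-in for a real file chunk
                executor.execute(() -> System.out.println(
                        Thread.currentThread().getName() + " processing " + chunk));
            }

            executor.shutdown();
            executor.awaitTermination(1, TimeUnit.MINUTES);
        }
    }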

Javier