
I was wondering if somebody could point me to a simple equivalent of Python's multiprocessing module in Java.

I have a simple parallel processing scenario (where no two processes interact): take a data set, split it into 12 chunks, apply a Java method to each chunk, then collect the results and join them in a list of some sort that preserves the original ordering.

Java, being a "pro" language, appears to have multiple libraries and methods for this. Can anyone help this Java newbie get started?

I would like to do this with a minimum of coding; as I said, my requirement is pretty straightforward.

Update: how to do multiprocessing in java, and what speed gains to expect?

This seems to indicate that threads are the way to go. I expect I have no choice but to wade into a bunch of locks (pun unintended) and wait for my ship to sail. Simple examples are welcome nevertheless.

pythOnometrist
  • You mean like http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/package-summary.html ? – CBredlow Aug 19 '13 at 23:48
  • Possibly. With the multiprocessing module I don't have to think about threads and stuff. I simply map my function across the chunks of data in a list. The concurrent util here seems more complex, at least to me. Perhaps there is an example for a simple application that will get me started. – pythOnometrist Aug 19 '13 at 23:53
  • http://stackoverflow.com/questions/4208208/java-util-concurrent-examples-tutorial-and-code this was helpful too - but if there is anything simpler out there would appreciate it. – pythOnometrist Aug 19 '13 at 23:58
  • http://ricardozuasti.com/2012/java-concurrency-examples-parallel-data-processing/ gets me partway there using the concurrent class. – pythOnometrist Aug 20 '13 at 00:07

1 Answer


There's no exactly-compatible class, but ExecutorService gives you everything you need to implement it.

In particular, there's no function to map a function over a Collection and wait on the results, but you can easily build a Collection<Callable<T>> by wrapping each element of your input Collection in a Callable that processes it, then just call invokeAll, which blocks until every task is done and returns a List<Future<T>> in the same order as the tasks.
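To make that concrete, here is a minimal sketch of a `map`-style helper built on `invokeAll`. The class name `ParallelMap`, the helper `map`, the pool size of 4, and the summing function are all hypothetical stand-ins rather than anything from the question; it assumes Java 8 or later for lambdas. Note that this runs the work on threads inside one JVM, not in separate processes as Python's `multiprocessing` does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelMap {

    // Hypothetical helper: applies fn to every item on a fixed-size thread
    // pool and returns the results in the same order as the inputs.
    static <T, R> List<R> map(Function<T, R> fn, List<T> items, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Wrap each input in a Callable: one task per item.
            List<Callable<R>> tasks = new ArrayList<>();
            for (T item : items) {
                tasks.add(() -> fn.apply(item));
            }
            // invokeAll blocks until every task has finished; the futures
            // come back in the same order as the task list.
            List<R> results = new ArrayList<>();
            for (Future<R> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            // A task threw; surface its original exception as the cause.
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the 12 chunks of the data set.
        List<List<Integer>> chunks = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            chunks.add(Arrays.asList(i, i + 1, i + 2));
        }
        // Stand-in for the real per-chunk method: sum each chunk.
        List<Integer> sums = map(
                chunk -> chunk.stream().mapToInt(Integer::intValue).sum(),
                chunks, 4);
        System.out.println(sums);
    }
}
```

Creating and shutting down the pool inside the helper keeps the sketch short; for repeated calls you would probably want to create the `ExecutorService` once and reuse it.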

(If you want to emulate some of the other functions from multiprocessing.Pool, you will need to loop around submit instead and build your own collection of things to wait on. But map is simple.)
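The `submit` loop mentioned above might look roughly like the following sketch, which is loosely comparable to calling `apply_async` once per item and waiting later. The names `SubmitLoop`, `submitAll`, and `awaitAll` are hypothetical, and the example inputs are placeholders; it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class SubmitLoop {

    // Hypothetical helper: submits one task per item and returns immediately
    // with the collection of futures to wait on later.
    static <T, R> List<Future<R>> submitAll(
            ExecutorService pool, Function<T, R> fn, List<T> items) {
        List<Future<R>> pending = new ArrayList<>();
        for (T item : items) {
            Callable<R> task = () -> fn.apply(item);
            pending.add(pool.submit(task));
        }
        return pending;
    }

    // Hypothetical helper: waits on each future in submission order.
    static <R> List<R> awaitAll(List<Future<R>> pending) {
        List<R> results = new ArrayList<>();
        try {
            for (Future<R> future : pending) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> pending =
                    submitAll(pool, String::length, Arrays.asList("a", "bb", "ccc"));
            // Other work could happen here while the tasks run.
            System.out.println(awaitAll(pending));
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the caller holds the futures, it can also poll `isDone()`, cancel individual tasks, or call `get` with a timeout, which `invokeAll` does not allow per task.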

abarnert
  • That is exactly what I needed - I'll test it tomorrow and may get back with some questions. – pythOnometrist Aug 20 '13 at 00:39
  • @pythOnometrist: If you want to do anything non-trivial, I'd suggest learning how executors and futures work, instead of trying to build the Python `multiprocessing` API on top of them. You might want to play with Python's own `concurrent.futures` library (or [the backport](https://pypi.python.org/pypi/futures) if you're on 2.x) to get the hang of the ideas first. A lot of code is actually simpler that way anyway. But also, if you have to use Java, you'll be a lot happier writing idiomatic Java than trying to use Java as a very bad Python… – abarnert Aug 20 '13 at 00:51
  • I do not understand. How is this multiprocessing? Isn't `ExecutorService` made for `threads` only? – Kallzvx Dec 11 '20 at 21:00