In a typical Java application, one configures a global ExecutorService for managing a global thread pool. Let's say I configure a fixed thread pool of 100 threads:

ExecutorService threadPool = Executors.newFixedThreadPool(100);

Now let's say that I have a list of 1000 files to upload to a server, and for each upload I create a Callable that handles the upload of that one file.

List<Callable<Void>> uploadTasks = new ArrayList<>();
// Fill the list with 1000 upload tasks

How can I limit the maximum number of concurrent uploads to, let's say, 5?

If I do

threadPool.invokeAll(uploadTasks);

I don't have control over how many threads my 1000 tasks will take. Potentially 100 uploads will run in parallel, but I want at most 5. I would like to create some sort of sub-executor, which uses a subset of the threads of the parent executor. I don't want to create a new, separate ExecutorService just for uploads, because I want to manage my thread pool globally.

Does anyone know how to do that, or whether an existing implementation exists? Ideally something like

ExecutorService uploadThreadPool = Executors.createSubExecutor(threadPool,5);

Many thanks,

Antoine

  • Can you take a look at http://stackoverflow.com/q/19819837/2231632 and see if that applies to your case? And you can't use a Semaphore or a latch of some sort for your upload logic alone? – Praba May 22 '14 at 13:54
  • The question you mention is more about rate limiting. I could use a semaphore in my client code, but I was wondering whether there is an existing facility for doing that. – lambdacalculus May 22 '14 at 14:07
  • I would argue that a typical Java application has numerous thread pools, whose configuration matches their use case. So I would expect your application to have a separate thread pool of 5 threads for that purpose. – Duncan Jones May 22 '14 at 14:10
  • I understand your point, but wouldn't it be better if this separate 5-thread pool were a sub-pool of a globally capped thread pool that you can control? Or does that make sense only to me? – lambdacalculus May 22 '14 at 16:03

1 Answer

"In a typical Java application, one configures a global ExecutorService for managing a global thread pool."

I'm not doing this, but maybe I'm atypical. :-) Back to your question:

Having a guard (possibly a Semaphore) inside your Callables will only clutter your global Executor with tasks waiting for each other, since a blocked task still occupies a pool thread. So you'll have to either

  • use some external logic which ensures only 5 jobs are running at any time, which could itself be a Callable submitted to your one Executor. This could be done by wrapping your upload jobs with some logic which pulls the next job from a queue (containing the files or whatever) once one job is completed. Then you submit five of those "drain upload queue" jobs to your Executor and call it done. But atypical as I am, I'd just
  • use a separate Executor for the uploads. This also gives you the ability to name your threads appropriately (in the Executor's ThreadFactory), which might help with debugging and makes for nice thread dumps.
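
The first option (a fixed number of queue-draining jobs submitted to the shared Executor) could be sketched roughly as below. This is only a minimal illustration under stated assumptions, not an existing JDK facility: the class name `BoundedUploads`, the helper `runBounded`, and the sleeping stand-in tasks are all hypothetical, and error handling is reduced to a log line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: runs tasks on a shared pool while occupying
// at most maxConcurrent of its threads.
final class BoundedUploads {

    static void runBounded(ExecutorService sharedPool,
                           List<? extends Callable<?>> tasks,
                           int maxConcurrent) throws InterruptedException {
        // Thread-safe queue holding the work that has not started yet.
        Queue<Callable<?>> pending = new ConcurrentLinkedQueue<>(tasks);

        // Each "drainer" keeps pulling tasks until the queue is empty.
        List<Callable<Void>> drainers = new ArrayList<>();
        for (int i = 0; i < maxConcurrent; i++) {
            drainers.add(() -> {
                Callable<?> next;
                while ((next = pending.poll()) != null) {
                    try {
                        next.call();
                    } catch (Exception e) {
                        // Placeholder error handling; a failed task does
                        // not stop the drainer from taking the next one.
                        System.err.println("Task failed: " + e);
                    }
                }
                return null;
            });
        }

        // Only maxConcurrent jobs are handed to the pool, so at most that
        // many of its threads are busy. invokeAll blocks until all finish.
        sharedPool.invokeAll(drainers);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService threadPool = Executors.newFixedThreadPool(100);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        // Stand-in "uploads" that just sleep and record peak concurrency.
        List<Callable<Void>> uploadTasks = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            uploadTasks.add(() -> {
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                try {
                    Thread.sleep(10);
                } finally {
                    running.decrementAndGet();
                }
                return null;
            });
        }

        runBounded(threadPool, uploadTasks, 5);
        System.out.println("Peak concurrent uploads: " + peak.get());
        threadPool.shutdown();
    }
}
```

The remaining threads of the shared pool stay free for other work, and no task ever sits in a pool thread waiting for a permit. The trade-off is that results and exceptions of the individual tasks are not returned as Futures in this sketch; collecting them would need extra bookkeeping.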
Waldheinz
  • I guess I would go with your first suggestion. I was wondering whether there is an existing facility for doing this, something like `threadPool.createSubThreadpool(5)` – lambdacalculus May 22 '14 at 13:47
  • May I ask why you want only one `Executor` in your program? – Waldheinz May 22 '14 at 13:52
  • Because if each component in your application uses its own thread pool, you don't have any visibility into the total number of threads your program will spawn. You could end up exhausting resources without knowing it. – lambdacalculus May 22 '14 at 13:54
  • @user3665244 I think you could make one or two exceptions to that rule without growing confused about total numbers of threads. Particularly if you centralise the configuration data that defines the number of threads in those pools. – Duncan Jones May 22 '14 at 14:14
  • @Duncan Yes, that would mitigate the risks, but it's still different from a global thread pool. Suppose I have one particularly big thread pool that I use only once per week: I would "steal" those threads from the rest of the app most of the time. – lambdacalculus May 22 '14 at 14:20
  • @lambdacalculus If you used a [cached thread pool](http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool--), unused threads are killed after 60 seconds. So you could use a 5-thread cached pool for this task. – Duncan Jones May 22 '14 at 14:22
  • @Duncan True, but you still don't have a hard limit on the total number of threads used, and you don't share all threads amongst all subsystems. – lambdacalculus May 22 '14 at 14:43
  • @Duncan Okay, you were right, trying to manage a global thread pool is a bit silly, thanks :-) – lambdacalculus Nov 26 '14 at 12:56
  • I think it is useful to have a sub thread pool to ensure fairness. A subtask will steal only up to n threads from a global pool, without the additional latency overhead of creating new threads on demand with a cached thread pool. I think what the OP wants is to create a work-stealing pool with predefined parallelism and a thread pool factory with cached threads. – Rohit Banga Nov 19 '17 at 03:16
  • I also had a similar requirement to use a subset of a thread pool but couldn't find any solution. Here is the approach I followed: I created an application-level Semaphore allowing, say, 50 concurrent accesses, and wrapped the functionality that requires the thread pool with this semaphore. Instead of using a global ThreadPoolExecutor, I am using Executors.newFixedThreadPool(10). After my concurrent request is completed, I release the semaphore permit. – Farhan May 28 '21 at 10:01