There is one fixed thread pool (say, of size 100) that I want to use for all tasks across my app. Its purpose is to limit server load.

A task is a web crawler that submits its first job to the thread pool.
That job can generate more jobs, and so on.
One job = one HTTP I/O request.

Problem
Suppose there is only one task executing, and it has generated 10000 jobs.
Those jobs are now waiting in the thread pool's queue, and all 100 threads are busy executing them.

Suppose that I now submit a second task.
The first job of the second task is the 10001st in the queue.
It will be executed only after the 10000 jobs that the first task queued up.
So, this is a problem - I don't want the second task to wait so long to start its first job.

Idea
My first idea is to create a custom BlockingQueue and pass it to the thread pool constructor.
That queue will hold several blocking queues, one for each task.
Its take method will then choose a random queue and take an item from it.
My problem with this is that I don't see how to remove an empty queue from this list when its task is finished. This would mean some or all workers could get blocked on the take method, waiting for jobs from tasks that are finished.

Is this the best way to solve this problem?
I was unable to find any patterns for it in books or on the Internet :(

Thank you!

Edward
Oleg Golovanov
  • You might want to check this out: http://stackoverflow.com/questions/807223/how-do-i-implement-task-prioritization-using-an-executorservice-in-java-5. If you can make your tasks implement Comparable the solutions here would probably work well. – Mike Deck Jul 25 '12 at 03:45
  • How about using LIFO prioritized queue, i.e. the last task submitted gets the highest prio? – Svilen Jul 26 '12 at 08:57

2 Answers

I would use multiple queues and draw from a randomly chosen one of the queues that contain items. Alternatively, you could assign priorities to the queues and always draw from the highest-priority non-empty one.
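A minimal sketch of the multiple-queue idea, assuming each task is identified by some key. The class name `FairJobQueue` and its methods are hypothetical, not part of any library. All per-task queues sit behind a single lock and a single condition, so workers never wait on an individual task's queue, and a queue is dropped as soon as it is drained:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Random;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: one FIFO queue per task, all guarded by a single lock.
final class FairJobQueue<K> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Map<K, Queue<Runnable>> byTask = new HashMap<>();
    private final List<K> active = new ArrayList<>(); // tasks that currently have queued jobs
    private final Random random = new Random();

    // Enqueue a job for the given task, creating the task's queue on demand.
    void put(K taskId, Runnable job) {
        lock.lock();
        try {
            Queue<Runnable> q = byTask.get(taskId);
            if (q == null) {
                q = new ArrayDeque<>();
                byTask.put(taskId, q);
                active.add(taskId);
            }
            q.add(job);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    // Block until any task has a job, then take one from a randomly chosen task.
    Runnable take() throws InterruptedException {
        lock.lock();
        try {
            while (active.isEmpty()) {
                notEmpty.await();
            }
            int i = random.nextInt(active.size());
            K taskId = active.get(i);
            Queue<Runnable> q = byTask.get(taskId);
            Runnable job = q.remove();
            if (q.isEmpty()) {
                // Drop the drained queue: swap it with the last entry, then remove the tail.
                byTask.remove(taskId);
                active.set(i, active.get(active.size() - 1));
                active.remove(active.size() - 1);
            }
            return job;
        } finally {
            lock.unlock();
        }
    }

    // Number of tasks that still have queued jobs.
    int activeTasks() {
        lock.lock();
        try {
            return active.size();
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch the worker threads would loop on `take().run()` themselves. Passing such a structure to a `ThreadPoolExecutor` would additionally require implementing the full `BlockingQueue<Runnable>` interface, which is left out here.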

jontro
  • I agree, use a single pool for the top level request and another for the child requests. You might just want to limit the sizes of the two pools (i.e. the first pool has 10, the second pool has 100) – MadProgrammer Jul 24 '12 at 23:42
  • @MadProgrammer I don't get why I need 2 pools: one for top level requests and one for child requests. Could you explain it in more detail? – Oleg Golovanov Jul 30 '12 at 19:48
  • @OlegGolovanov the main reason for two pools is that when the second pool is saturated and new child requests are queued and waiting, the first pool can continue to process incoming requests. It also lets you assign priorities to the two pools separately – MadProgrammer Jul 30 '12 at 20:41

I would suggest using a single PriorityBlockingQueue and using the 'depth' of the recursive tasks to compute the priority. With a single queue, workers get blocked when the queue is empty and there is no need for randomization logic around the multiple queues.
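A rough sketch of what this could look like, assuming that a lower depth means a higher priority and that ties are broken by submission order. The names `DepthJob` and `DepthPool` are made up for illustration:

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical job wrapper: lower depth runs first, ties keep submission order.
final class DepthJob implements Runnable, Comparable<DepthJob> {
    private static final AtomicLong SEQ = new AtomicLong();

    final int depth;                         // 0 for a task's first job, parent depth + 1 for children
    final long seq = SEQ.getAndIncrement();  // global submission counter used as a tie-breaker
    private final Runnable body;

    DepthJob(int depth, Runnable body) {
        this.depth = depth;
        this.body = body;
    }

    @Override
    public void run() {
        body.run();
    }

    @Override
    public int compareTo(DepthJob other) {
        int byDepth = Integer.compare(depth, other.depth);
        return byDepth != 0 ? byDepth : Long.compare(seq, other.seq);
    }
}

final class DepthPool {
    // Fixed-size pool whose work queue orders jobs by DepthJob.compareTo.
    static ThreadPoolExecutor create(int threads) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new PriorityBlockingQueue<Runnable>());
    }
}
```

Jobs would have to be handed over with `execute(new DepthJob(...))` rather than `submit(...)`: `submit` wraps the job in a `FutureTask`, which is not `Comparable`, so the priority queue would fail with a `ClassCastException` as soon as it needs to compare two queued jobs.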

shams