
I have a service which calls a database and performs a callback on each result.

ExecutorService service = Executors.newFixedThreadPool(10);
service.execute(runnable(segmentID, callback)); // database is segmented

Runnable is:

call database - collect all the rows for the segment keep in memory
perform callback(segment);

Now the issue is that the database returns a huge number of rows, and my understanding is that the executor service will schedule other threads whenever the running ones are idle on I/O. So I run out of memory.

Is there any way to ensure that only 10 threads run at a time, with no additional scheduling by the executor service?

For some reason I have to keep all the rows of a segment in memory. How can I avoid an OOM while doing this? Is the executor service's newFixedThreadPool the solution for this?

Please let me know if I missed anything.

Thanks

harish.venkat
  • 1
    Some unknown code doing unknown things cause memory problems. Hard to help. Post your code. Show us how you "call database". Show us what the callbacks do. – JB Nizet Oct 17 '15 at 11:43
  • Please assume I want to perform huge IO inside the runnable and keep the values retrieved in memory. If I have say 10 threads doing IO and others are idle I won't go to OOM but if executor service schedules other threads while these 10 are performing IO then definitely I would go into OOM. – harish.venkat Oct 18 '15 at 09:23
  • 1
    If you submit 10 tasks or more to an executor which has 10 threads available, the 10 threads *will* execute concurrently. That's the whole point. If you want only 2 threads executing in parallel, then create an executor with 2 threads. But anyway, if everything will stay in memory even after a task is done, the number of threads won't change anything. – JB Nizet Oct 18 '15 at 11:05

1 Answer

You must use a fixed thread pool. There's a rule that you should only spawn N threads, where N is of the same order of magnitude as the number of cores in the CPU. There's a debate on the size of N, and you can read more about it here. For a normal CPU we could be talking about 4, 8, or 16 threads.

But even if you were running your program in a cluster, which I think you are not, you can't just fetch 20k rows from a DB and expect to spawn 20k threads. If you do, the performance of your app is going to degrade big time, because most of the CPU cycles would be consumed in context switching.
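
To illustrate the point about a fixed pool capping concurrency, here is a minimal sketch. It assumes the database call can be stood in for by `Thread.sleep`; the class and method names (`FixedPoolDemo`, `maxConcurrent`) are made up for this example. It submits more tasks than there are threads and records how many were running at the same moment. Tasks beyond the pool size wait in the executor's queue, even while the running ones are blocked on I/O.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedPoolDemo {
    // Runs `tasks` simulated I/O tasks on a pool of `poolSize` threads and
    // returns the highest number of tasks observed running at the same time.
    static int maxConcurrent(int poolSize, int tasks) {
        ExecutorService service = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            service.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the highest count seen
                try {
                    Thread.sleep(20); // assumption: stands in for a blocking database call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        service.shutdown(); // no new tasks; queued ones still run
        try {
            service.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent tasks: " + maxConcurrent(10, 50));
    }
}
```

The reported peak should never exceed the pool size, so with 10 threads at most 10 segments are being loaded at once, whatever the number of submitted tasks.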

Now even with a fixed thread pool, you might run into OOM exceptions anyway if the data fetched is stored in memory at the same time. I think the only solution to this is to fetch smaller chunks of data, or write the data to a file as it gets downloaded.
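
As a rough sketch of the "smaller chunks" idea, assuming the callback can work on one page of rows at a time: `fetchPage` below is a hypothetical stand-in for a paged query (it just fabricates row labels), and `processSegment`, the page size, and the row counts are likewise invented for illustration. Only one page is referenced at a time, so memory per task is bounded by the page size rather than by the segment size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ChunkedFetchDemo {
    // Hypothetical stand-in for a paged database query; it fabricates row
    // labels for rows [offset, offset + limit) of a segment with totalRows rows.
    static List<String> fetchPage(int segmentId, int offset, int limit, int totalRows) {
        List<String> page = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + limit, totalRows); i++) {
            page.add("segment-" + segmentId + "-row-" + i);
        }
        return page;
    }

    // Hands the callback one page at a time instead of the whole segment,
    // and returns the total number of rows processed.
    static int processSegment(int segmentId, int totalRows, int pageSize,
                              Consumer<List<String>> callback) {
        int processed = 0;
        while (true) {
            List<String> page = fetchPage(segmentId, processed, pageSize, totalRows);
            if (page.isEmpty()) {
                break; // no more rows in this segment
            }
            callback.accept(page);
            processed += page.size(); // the previous page becomes garbage here
        }
        return processed;
    }

    public static void main(String[] args) {
        int[] largestPage = {0};
        int rows = processSegment(7, 2_500, 1_000,
                page -> largestPage[0] = Math.max(largestPage[0], page.size()));
        System.out.println(rows + " rows processed, largest page held " + largestPage[0]);
    }
}
```

If the whole segment really must be in memory at once, this doesn't apply directly, and the remaining options are fewer threads, a bigger heap, or spilling to a file.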

Mister Smith
  • The accepted answer in the linked question starts with "If your threads don't do I/O, ...". The question here is about "Huge IO", at least in the title. – Peter Zeller Mar 25 '21 at 20:24
  • "There's a rule that you should only spawn N threads where N should be in the same order of magnitude than the number of cores in the CPU." - this is incorrect in the context of IO bound work and only applies to CPU bound work. It's actually the wrong response to the given question. – Sherms Oct 13 '22 at 14:38
  • @Sherms Regardless of the nature of the work, spawning a lot of classic threads is expensive and can cause OOM exceptions and other reliability issues. That is the entire point of Kotlin coroutines/ Java green threads. – Mister Smith Dec 17 '22 at 19:33