3

I'm implementing a message passing algorithm. The messages propagate through the nodes of the graph, blocking until they have received enough information (from other neighbours) to send a message.

The algorithm is easy to write if I put each message in its own thread and use a boost::condition to pause the thread until all the required information is available. I create many thousands of threads, but mostly only a few are active at any time. This seems to work pretty well.
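A minimal sketch of the thread-per-message pattern described above, with `std::thread` and `std::condition_variable` swapped in for the Boost equivalents so the example has no external dependencies. The `Node` type, the `run_chain` function, and the chain-shaped graph are hypothetical illustrations, not the original code.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical node: counts how many inputs have arrived so far.
struct Node {
    std::mutex m;
    std::condition_variable cv;
    std::size_t received = 0;
    std::size_t required = 0;

    // Called by a neighbour when it delivers its information.
    void deliver() {
        {
            std::lock_guard<std::mutex> lock(m);
            ++received;
        }
        cv.notify_all();
    }

    // Blocks the calling thread until enough inputs have arrived.
    void wait_until_ready() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return received >= required; });
    }
};

// One thread per message on a chain 0 -> 1 -> ... -> length.
// Returns the total number of messages delivered.
std::size_t run_chain(std::size_t length) {
    std::vector<Node> nodes(length + 1);
    for (std::size_t i = 1; i <= length; ++i) nodes[i].required = 1;

    std::vector<std::thread> messages;
    // Start in reverse order so that most threads really do block first.
    for (std::size_t i = length; i-- > 0;) {
        messages.emplace_back([&nodes, i] {
            nodes[i].wait_until_ready();  // sleep until the source has its inputs
            nodes[i + 1].deliver();       // then pass the message on
        });
    }
    for (auto& t : messages) t.join();

    std::size_t delivered = 0;
    for (auto& n : nodes) delivered += n.received;
    assert(delivered == length);  // every message was delivered exactly once
    return delivered;
}
```

Each sleeping thread still owns a stack and a kernel task, which is where the system limits discussed in this question come in.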

My problem is, when unit testing I find that if I create more than about 32705 threads, I get

unknown location(0): fatal error in "Tree_test": std::exception: boost::thread_resource_error

and I don't know what causes this, or how to fix it.

There seems to be plenty of memory available (each thread only holds two pointers: the objects that the message passes between).

From this question: Maximum number of threads per process in Linux? I think the following information is relevant (although I don't really know what any of it means...)

~> cat /proc/sys/kernel/threads-max
1000000

(I increased this from 60120 - do I need to restart?)

~> ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I tried fiddling with the pending signals limit (my thread limit is very close to 2× that number) using ulimit -S -i 8191 (I couldn't increase it), and with the stack size, but these changes seemed to have no effect at all.

I'm on 64-bit Ubuntu 10.10, if that helps...

Tom
  • 8
    32k threads?!!! seriously?!? – Nate Koppenhaver Apr 22 '11 at 02:36
  • @Nate Only a few run at once? Most of them are asleep... I thought the computer should be able to do a better job at waking them up than me... – Tom Apr 22 '11 at 02:42
  • 1
    @Tom: threads are cheap but not free. You're creating thousands of threads that do nothing but they do take up some resources (e.g. the scheduler needs to know about them). – John Zwinck Apr 22 '11 at 02:47
  • 1
    32,705 threads = broken design, rethink your logic. – Sam Miller Apr 22 '11 at 03:08
  • @Sam 32705 is just fine by me. The algorithm is quick and the code is both short and transparent. It's 32706 that is the problem. I have hit a system limit, but I don't know which one. – Tom Apr 22 '11 at 03:10
  • @Tom you are missing the point entirely. Look at all the comments to your question and the [only answer](http://stackoverflow.com/questions/5751737/boostthread-resource-error-when-more-than-32705-threads/5751755#5751755). Read up on thread pools. – Sam Miller Apr 22 '11 at 03:16
  • @Sam I am curious as to what is causing the error. – Tom Apr 22 '11 at 03:20

3 Answers

6

I think with 32K threads on the system, you should look at potential solutions other than how to have more threads. For example, a thread pool (Boost has some things for this).
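As a rough illustration of processing more items than there are threads, the sketch below replaces "one blocked thread per message" with a ready queue: a message from a node to a neighbour is enqueued as soon as that node has heard from all of its other neighbours. It assumes the graph is an undirected tree given as adjacency lists; `Message` and `schedule_messages` are hypothetical names, and a thread pool could drain the same queue.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

using Message = std::pair<std::size_t, std::size_t>;  // (from, to)

// Returns every directed message on the tree in an order that respects the
// "wait for all other neighbours" rule, without any blocked threads.
std::vector<Message> schedule_messages(
        const std::vector<std::vector<std::size_t>>& adj) {
    const std::size_t n = adj.size();
    std::vector<std::vector<bool>> got(n), sent(n);
    std::vector<std::size_t> missing(n);  // inputs each node still waits for
    for (std::size_t i = 0; i < n; ++i) {
        got[i].assign(adj[i].size(), false);
        sent[i].assign(adj[i].size(), false);
        missing[i] = adj[i].size();
    }

    std::queue<Message> ready;
    // Enqueue every message from `node` whose prerequisites are now met.
    auto enqueue_ready = [&](std::size_t node) {
        for (std::size_t k = 0; k < adj[node].size(); ++k) {
            // The message to neighbour k needs input from all OTHER neighbours.
            const bool others_done =
                missing[node] == 0 || (missing[node] == 1 && !got[node][k]);
            if (others_done && !sent[node][k]) {
                sent[node][k] = true;
                ready.push(Message(node, adj[node][k]));
            }
        }
    };
    for (std::size_t i = 0; i < n; ++i) enqueue_ready(i);  // leaves go first

    std::vector<Message> order;
    while (!ready.empty()) {
        const Message msg = ready.front();
        ready.pop();
        order.push_back(msg);

        // Record the delivery at the receiving node.
        const std::size_t to = msg.second;
        bool found = false;
        for (std::size_t k = 0; k < adj[to].size(); ++k) {
            if (adj[to][k] == msg.first && !got[to][k]) {
                got[to][k] = true;
                --missing[to];
                found = true;
                break;
            }
        }
        assert(found);  // adjacency lists must be symmetric
        enqueue_ready(to);
    }
    return order;
}
```

On a tree with E edges this produces 2·E messages, and the bookkeeping per node is a few counters rather than a thread stack.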

Anyway, on your system, aren't PIDs limited to 32768 or some such value? You're going to run out sooner or later, may as well design the system to allow processing more items than the max number of threads, I'd think.

That said, look at /proc/sys/kernel/pid_max to see your max PID--and try increasing it. This may get you beyond 32K (but may also cause unexpected behavior with programs not designed for unusually large PIDs, so be cautious).
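The limits mentioned so far can also be inspected from inside the test program. This is a small sketch, assuming a Linux system where the `/proc/sys/kernel` files exist; `read_kernel_limit` and `report_thread_limits` are hypothetical helper names.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Reads a single integer from a /proc-style file.
// Returns -1 if the file cannot be opened or does not start with a number.
long long read_kernel_limit(const std::string& path) {
    assert(!path.empty());
    std::ifstream in(path.c_str());
    long long value = 0;
    if (in >> value) return value;
    return -1;
}

// Prints the two kernel limits discussed above (Linux-specific paths).
void report_thread_limits(std::ostream& out) {
    const char* paths[] = {
        "/proc/sys/kernel/threads-max",
        "/proc/sys/kernel/pid_max",
    };
    for (const char* p : paths) {
        out << p << " = " << read_kernel_limit(p) << '\n';
    }
}
```

Calling `report_thread_limits(std::cout)` just before the thread-creation loop shows whether a value written to `/proc` is actually in effect for the running process.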

And then you may be limited by the stack space (as opposed to virtual memory space). You could try creating threads with smaller stacks if you like.
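A sketch of creating a thread with a smaller stack, using the POSIX `pthread_attr_setstacksize` call directly rather than Boost. Newer Boost versions appear to expose the same idea through `boost::thread::attributes::set_stack_size`, but treat that as an assumption to check against your Boost version. `run_with_stack` and `noop` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Trivial thread body used only for the demonstration.
static void* noop(void*) { return nullptr; }

// Starts and joins one thread with the requested stack size.
// Returns 0 on success, otherwise the pthread error code.
int run_with_stack(std::size_t stack_bytes) {
    assert(stack_bytes > 0);
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;

    // Fails with EINVAL if the size is below the platform minimum.
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, noop, nullptr);
        if (rc == 0) rc = pthread_join(tid, nullptr);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

For example, `run_with_stack(64 * 1024)` asks for a 64 KiB stack instead of the multi-megabyte default; whether that is enough depends on how deep the thread's call chain goes.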

John Zwinck
  • Bingo! It seems that's the number I didn't think about... It's 32768. Is there a way to change it? (The algorithm becomes much more complicated if I have to organise my threads in a pool) – Tom Apr 22 '11 at 02:38
  • Sure, just echo a different number into the file. Sometimes it's convenient to do it like `echo 65536 | sudo tee /proc/sys/kernel/pid_max`. – John Zwinck Apr 22 '11 at 02:40
  • 3
    Just don't say I didn't warn you when having thousands of threads comes back to haunt you. – John Zwinck Apr 22 '11 at 02:41
  • 1
    @Tom: Your design is simply a bad approach, that's way too many threads. – GManNickG Apr 22 '11 at 02:41
  • @John hehe, I have been warned. It didn't seem to do the job though, do I need to restart to get these numbers re-read by my system? – Tom Apr 22 '11 at 02:44
  • Maybe. But if you restart, the values will probably end up as they were, unless you add them to your sysctl.conf or similar so that they are made permanent. – John Zwinck Apr 22 '11 at 02:46
  • Agreed - one thread per node sounds excessive, and may well slow down processing rather than speed it up. A thread pool is a great idea, although I'd even suggest rethinking the whole threading approach. Another consideration for the OP is that even if they only hold two pointers, each thread is consuming 8MB for the stack. Even if you won't theoretically run out of memory on a 64-bit system, there are still ways of running out of resources. – gavinb Apr 22 '11 at 02:46
  • @GMan: I use the threads because I want the boost::condition (a message waits until the data is available at a node and then retrieves it). Is there a different pattern that will give me the same kind of thing? @gavinb True, but I am not worried about the speed too much; waiting on a condition makes the algorithm trivial to write. – Tom Apr 22 '11 at 02:47
  • 1
    The stack space issue can be addressed by reducing the size of each thread's stack. With enough band-aids we can touch the sky! – John Zwinck Apr 22 '11 at 02:48
  • 2
    @Tom: I understand why you're doing it, I'm saying your reason isn't a good reason. – GManNickG Apr 22 '11 at 02:55
  • @GMan, @John: Okay, I probably need a rewrite. I'm curious, though, which limit I am hitting. I thought it was the PID limit, but I changed it and still hit the same limit. Could it be something else? – Tom Apr 22 '11 at 02:59
  • 2
    Sure, it might be something else. I wouldn't be surprised if there were several things that needed changing to make a system run freaky-high numbers of processes/threads. – John Zwinck Apr 22 '11 at 03:19
0

Okay, to answer the question: you need to increase

/proc/sys/vm/max_map_count

As discussed here:

https://listman.redhat.com/archives/phil-list/2003-August/msg00025.html

and here:

http://www.kegel.com/c10k.html#limits.threads
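A back-of-the-envelope sketch of why this limit can bite near 32K threads. It assumes the common default `max_map_count` of 65530 and that each thread costs two memory mappings (its stack plus a guard region). The baseline of 120 mappings for the rest of the process is a hypothetical figure chosen to show how a number like 32705 could arise; it was not measured. `estimate_max_threads` is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>

// Rough upper bound on the thread count when every thread consumes a fixed
// number of memory mappings out of a per-process budget.
std::size_t estimate_max_threads(std::size_t max_map_count,
                                 std::size_t baseline_maps,
                                 std::size_t maps_per_thread = 2) {
    assert(maps_per_thread > 0);
    if (baseline_maps >= max_map_count) return 0;
    return (max_map_count - baseline_maps) / maps_per_thread;
}
```

Under these assumptions, `estimate_max_threads(65530, 120)` gives (65530 − 120) / 2 = 32705, and doubling `max_map_count` roughly doubles the ceiling.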

HOWEVER: FOR BETTER WAYS TO DO THIS LOOK AT THE FOLLOW UP QUESTION:

Non-threaded alternative to waiting on a condition. (Edit: Proactor pattern with boost.asio?)

Tom
  • Mark Russinovich did a series on various Windows limits and one of them included threads and processes (http://blogs.technet.com/b/markrussinovich/archive/2009/07/08/3261309.aspx). Please, don't do this. First of all, how in the hell are you going to debug it? But more importantly, you are highly likely to be murdered by the first person who has to maintain your code. – Luke Apr 22 '11 at 11:06
  • @Luke - Yes, I am switching the pattern now -> http://stackoverflow.com/questions/5754228/non-threaded-alternative-to-waiting-on-a-condition - edited the answer – Tom Apr 22 '11 at 11:18
0

It really depends on how big your stacks are, but you're going to run out of address-space (32-bit) or virtual memory (64-bit) if you create a lot of threads.

In Linux pthreads the default stack size was 10 MB last time I checked; this means that 32k threads use 320 GB of address space (note that it will probably be lazily initialised, so it won't use that much virtual memory); this is probably too much.

Even if you make the stack quite small and don't exhaust the memory this way, 32k threads is going to use a lot of virtual memory for stacks. Consider using a different approach.

ulimit only affects the stack size of the initial thread (which is dynamic normally under Linux); other threads' stack size is fixed and set at thread creation time by the pthread library.

MarkR