
So I know I can increase the number of threads of a process in Linux using setrlimit and friends. According to this, the theoretical limit on the number of threads is determined by memory (somewhere around 100,000). For my use I'm looking into using the FIFO scheduler in a cooperative style, so spurious context switches aren't a concern. I know I can limit the number of active threads to the number of cores. My question is: what is the practical limit on the number of threads, after which assumptions in the scheduler start being violated? If I maintain a true cooperative style, are additional threads "free"? Any case studies or actual examples would be especially interesting.

The Apache server seems to be the most analogous program to this situation. Does anybody have any numbers related to how many threads they've seen Apache spawn before becoming useless?

Related, but it deals with Windows and pre-emptive code.

tgoodhart
  • You may find the following article interesting: http://drdobbs.com/open-source/184406204. In it, there is a vague claim about 1M concurrent threads on a high-end machine. – Kevin Nov 07 '11 at 20:43
  • @Kevin That gives me hope. What's especially interesting is the fact that the test was done in support of the O(1) scheduler, which has since been replaced. – tgoodhart Nov 07 '11 at 20:48
  • @Kevin Paper in question: http://www.redhat.com/whitepapers/.../POSIX_Linux_Threading.pdf – tgoodhart Nov 07 '11 at 20:56

1 Answer


I believe the number of threads is limited:

  1. by the available memory (each thread needs at least several pages, and often many of them, notably for its stack and thread-local storage). See the pthread_attr_setstacksize function to tune that. Thread stacks of a megabyte each are not uncommon.

  2. by the number of tasks the kernel can schedule, at least on Linux (NPTL, i.e. current Glibc) and other systems where user threads are the same as kernel threads.

I would guess that on most Linux systems, the second limitation is stronger than the first. Kernel threads (on Linux) are created through the clone(2) system call. In old Unix or Linux kernels, the number of tasks was hardwired. It is probably tunable today, but I would guess it is in the many thousands, not the millions!

And you should consider coding in the Go language; its goroutines are the feather-light threads you are dreaming of.

If you want many cooperative threads, you could look into the implementation tricks used by Chicken Scheme.

Basile Starynkevitch
  • Ah Go, wouldn't that be nice. Lua has first class coroutines, which would also make this easier. The real thing I want is a clean stack frame for debugging. Continuation passing or message passing are great, but they tend to erase the sequence of events or at least make debugging that sequence difficult. If I can use lots of threads that will be far easier than using getcontext/makecontext and writing a poor man's cooperative threading library. – tgoodhart Nov 07 '11 at 20:40
  • I don't understand why you want a clean stack frame (why does it help debugging? Clearing every local is enough? And Java does that automatically). – Basile Starynkevitch Nov 07 '11 at 20:44
  • Poor choice of words. A "clean" stack frame in the sense that the stack will tell me precisely how I got to a given point in the code, in contrast to a work queue, which will only tell how I got to a particular piece of code since it was popped from the queue and may not give any indication of where it was pushed, especially if it can be pushed from multiple sites. – tgoodhart Nov 07 '11 at 20:52