6

According to the documentation of ThreadPoolExecutor:

If max_workers is None or not given, it will default to the number of processors on the machine.

If I don't set it to a value, like this:

ThreadPoolExecutor(max_workers=None)

is that bad for performance if the value I actually need is very low (2)? Will Python already allocate a worker for every CPU when the value is None, versus only 2 when I pass a number?

Ami Tavory
Dejell

1 Answer

21

To begin with, you seem to be quoting the wrong part of the documentation in your link, namely the one for processes, not threads. The one for concurrent.futures.ThreadPoolExecutor states:

Changed in version 3.5: If max_workers is None or not given, it will default to the number of processors on the machine, multiplied by 5, assuming that ThreadPoolExecutor is often used to overlap I/O instead of CPU work and the number of workers should be higher than the number of workers for ProcessPoolExecutor.
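To make the quoted default concrete, here is a minimal sketch of the rule as the 3.5 documentation words it. The function name `default_max_workers_py35` is made up for illustration, and the fallback to 1 when the processor count is unknown is an assumption of this sketch, not something the quote specifies. Later Python versions changed the formula (3.8 documents `min(32, os.cpu_count() + 4)`), so treat this as a description of the quoted text only.

```python
import os


def default_max_workers_py35():
    """Default described in the 3.5 docs: processor count times 5."""
    # os.cpu_count() can return None when the count cannot be determined;
    # falling back to 1 is an assumption of this sketch.
    cpus = os.cpu_count() or 1
    return cpus * 5


print(default_max_workers_py35())
```

On a 4-core machine this rule gives 20 workers as the upper limit, which is far more than the 2 or 4 the question needs.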


Since you're using threads, not processes, the assumption is that your application is I/O bound, not CPU bound, and that you're using threads for concurrency, not parallelism. The more threads you use, the more concurrency you'll achieve (up to a point), but the fewer CPU cycles will go to useful work (as there will be more context switches). You have to instrument your application under typical workloads to see what works best for you. There is no universally optimal value for this.
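One rough way to do that measurement is to time the same batch of work under several pool sizes. The sketch below uses `time.sleep` as a stand-in for a blocking I/O call; `fake_io_task` and `time_pool` are hypothetical names, and the task count and delay are arbitrary. Replace the fake task with your real workload before drawing any conclusions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay):
    # Stand-in for a blocking I/O call (network request, disk read, ...).
    time.sleep(delay)
    return delay


def time_pool(max_workers, n_tasks=8, delay=0.05):
    """Return wall-clock seconds to run n_tasks sleeps on a pool of the given size."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # list() forces every result to be collected before the timer stops.
        list(executor.map(fake_io_task, [delay] * n_tasks))
    return time.perf_counter() - start


if __name__ == "__main__":
    for workers in (1, 2, 4, 8):
        print(workers, round(time_pool(workers), 3))
```

With sleep-only tasks the elapsed time keeps dropping until the pool is as large as the number of tasks. Real workloads usually flatten out or get worse earlier, which is why the numbers have to come from your own application.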

Ami Tavory
  • I am using Python 2.7 - what would it be in this case? What I meant to ask is whether it matters if I set a limit or not, given that I only want as many workers as I need (e.g. I call 4 methods) in any case – Dejell Nov 09 '16 at 12:32
  • @Dejell First, note that you linked to Python 3.5 docs. In any case, my point is that you need to try different values to see what works for you. Neither you, nor the authors of the Python standard library, can guess in advance what will work best for your case. If performance is important to you, I wouldn't rely on the default implied number. – Ami Tavory Nov 09 '16 at 12:37
  • 1
    Thanks. Just what is this "default implied number"? And maybe I am missing something - would Python "allocate" memory for that default number, or only once I call execute.submit()? – Dejell Nov 09 '16 at 12:39
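
On the last question in the comments, a small experiment can show whether worker threads exist before anything is submitted. This is a sketch that relies on how CPython's `ThreadPoolExecutor` behaves in practice (threads are started from `submit()`, up to `max_workers`); that is an implementation detail rather than a documented guarantee, and `threads_started` is a hypothetical helper name.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def threads_started(max_workers, n_tasks):
    """Return (threads added by creating the pool, threads added after n_tasks submits)."""
    release = threading.Event()
    baseline = threading.active_count()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        after_create = threading.active_count() - baseline
        # Each task blocks until released, so no worker goes idle and gets reused.
        futures = [executor.submit(release.wait) for _ in range(n_tasks)]
        after_submit = threading.active_count() - baseline
        release.set()
        for future in futures:
            future.result()
    return after_create, after_submit


if __name__ == "__main__":
    print(threads_started(max_workers=50, n_tasks=4))
```

If the pool only starts threads on demand, the first number stays at zero and the second matches the number of submitted tasks rather than `max_workers`, so a large or default limit would cost nothing until tasks are actually submitted.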