33

Note: I "forayed" into the land of multiprocessing 2 days ago. So my understanding is very basic.

I am writing an application that uploads files to Amazon S3 buckets. When the file is large (over 100 MB), I've implemented parallel uploads using `Pool` from the `multiprocessing` module. I am using a machine with a Core i7, and `cpu_count()` returns 8. I was under the impression that if I do `pool = Pool(processes=6)` I use 6 cores: the file begins to upload in parts, and the uploads of the first 6 parts start simultaneously.

To see what happens when `processes` is greater than `cpu_count()`, I entered 20 (implying that I want to use 20 cores). To my surprise, instead of getting a block of errors, the program began to upload 20 parts simultaneously (I used a smaller chunk size to make sure there are plenty of parts). I don't understand this behavior. I have only 8 cores, so how can the program accept an input of 20? When I say `processes=6`, does it actually use 6 threads? That would be the only explanation for 20 being a valid input, since there can be thousands of threads. Can someone please explain this to me?

Edit:

I 'borrowed' the code from here. I have changed it only slightly: instead of hard-coding parallel_processes to 4, I ask the user how many processes to use, roughly as in the sketch below.
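A minimal sketch of that change (the rest of the script is the one linked above, so the surrounding upload code is omitted here):

```python
from multiprocessing import Pool

# Ask the user instead of hard-coding parallel_processes = 4
parallel_processes = int(input("Number of processes to use: "))  # I tried 6, then 20

pool = Pool(processes=parallel_processes)  # accepts 20 even though cpu_count() is 8
```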

letsc
    You mix threads, processes and cores. They are all very different "things". If you set `processes=6`, it will just use 6 **processes**, which may each run on its own core, or they may all run on one core - it depends on the OS and your system load. As for the "question" - please provide some code. – Jan Spurny Mar 17 '15 at 00:25

2 Answers

36

The number of processes running concurrently on your computer is not limited by the number of cores. In fact, you probably have hundreds of programs running on your computer right now, each with its own process. To make this work, the OS assigns one of your 8 cores to a process or thread only temporarily; at some point that process gets stopped and another one takes its place. See What is the difference between concurrent programming and parallel programming? if you want to find out more.
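You can see this with a tiny sketch that has nothing to do with S3: on an 8-core machine, a pool of 20 workers is perfectly legal, because each worker is an ordinary OS process that the scheduler time-slices onto the available cores.

```python
from multiprocessing import Pool, cpu_count
import os
import time

def work(i):
    time.sleep(1)              # stand-in for a blocking read/upload
    return os.getpid()

if __name__ == "__main__":
    print("cores:", cpu_count())          # e.g. 8
    with Pool(processes=20) as pool:      # more workers than cores is fine
        pids = pool.map(work, range(20))
    print("distinct worker processes:", len(set(pids)))
```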

Edit: Assigning more processes in your uploading example may or may not make sense. Reading from disk and sending over the network are normally blocking operations in Python. A process that waits for its chunk of data to be read or sent can be halted so that another process may start its I/O. On the other hand, with too many processes either file I/O or network I/O will become a bottleneck, and your program will slow down because of the additional overhead needed for process switching.
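For the S3 case itself, here is a hedged sketch of what a parallel multipart upload might look like with boto3 (the script linked in the question used the older boto library; the bucket name, key, filename, part size and pool size below are placeholders, not values from the question):

```python
import os
from multiprocessing import Pool

import boto3  # assumption: boto3 instead of the older boto used by the linked script

BUCKET = "my-bucket"              # placeholder bucket name
KEY = "big-file.bin"              # placeholder object key
FILENAME = "big-file.bin"         # local file to upload
PART_SIZE = 50 * 1024 * 1024      # 50 MB parts (S3 requires >= 5 MB except the last part)

def upload_part(job):
    """Upload one chunk of the file; runs inside a worker process."""
    part_number, offset, size, upload_id = job
    s3 = boto3.client("s3")       # build the client in the worker: clients don't pickle
    with open(FILENAME, "rb") as f:
        f.seek(offset)
        data = f.read(size)
    resp = s3.upload_part(Bucket=BUCKET, Key=KEY, PartNumber=part_number,
                          UploadId=upload_id, Body=data)
    return {"PartNumber": part_number, "ETag": resp["ETag"]}

if __name__ == "__main__":
    s3 = boto3.client("s3")
    upload_id = s3.create_multipart_upload(Bucket=BUCKET, Key=KEY)["UploadId"]

    # Split the file into (part_number, offset, size) jobs.
    file_size = os.path.getsize(FILENAME)
    jobs, offset, part_number = [], 0, 1
    while offset < file_size:
        size = min(PART_SIZE, file_size - offset)
        jobs.append((part_number, offset, size, upload_id))
        offset, part_number = offset + size, part_number + 1

    # Each worker spends most of its time waiting on the network, so a pool
    # larger than the core count can still help - up to a point.
    with Pool(processes=20) as pool:
        parts = pool.map(upload_part, jobs)

    s3.complete_multipart_upload(Bucket=BUCKET, Key=KEY, UploadId=upload_id,
                                 MultipartUpload={"Parts": parts})
```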

Pyetras
  • when uploading a file of over `1 Gb`, the time taken drops from `~10 minutes` (when using serial uploads) to `6 minutes` (when I use parallel uploads) – letsc Mar 17 '15 at 00:48
0

[Image: experiment demonstrating the optimized CPU count]

Adding supporting data points to @Pyetras's answer: if someone is looking for the right CPU count for their parallel program, it depends on various factors, including a) the actual tasks performed in the code, b) the OS, and c) disk/IO requirements.

However, a general rule of thumb is to experiment with process counts between the actual CPU count and twice that number; see the timing sketch below. In my case the actual CPU count is 32, and using between 32 and 64 processes gives me the best performance.

When going beyond 64, the execution time increased because the OS spends more time managing the processes than actually doing the work.

When using fewer than 32, I'm not using the full power of my CPU, so performance also decreased.
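A minimal sketch of how such a sweep can be timed (the workload below is a stand-in, not the actual experiment from the image above):

```python
import time
from multiprocessing import Pool, cpu_count

def task(n):
    # Stand-in workload; replace with the real job (CPU-bound or I/O-bound).
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    jobs = [200_000] * 256
    for workers in (cpu_count() // 2, cpu_count(), cpu_count() * 2, cpu_count() * 4):
        start = time.perf_counter()
        with Pool(processes=workers) as pool:
            pool.map(task, jobs)
        print(f"{workers:3d} processes: {time.perf_counter() - start:.2f} s")
```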

Veeramani Natarajan