Note: I "forayed" into the land of multiprocessing two days ago, so my understanding is very basic.
I am writing an application for uploads to Amazon S3 buckets. For larger files (100 MB), I've implemented parallel uploads using Pool from the multiprocessing module. I am using a machine with a Core i7, and cpu_count reports 8. I was under the impression that if I do pool = Pool(processes=6), I use 6 cores: the file is uploaded in parts, and the uploads of the first 6 parts begin simultaneously. To see what happens when processes is greater than cpu_count, I entered 20 (implying that I want to use 20 cores). To my surprise, instead of getting a block of errors, the program began to upload 20 parts simultaneously (I used a smaller chunk size to make sure there were plenty of parts).
I don't understand this behavior. I have only 8 cores, so how can the program accept an input of 20? When I say processes=6, does it actually use 6 threads? That would be the only explanation for 20 being a valid input, since there can be thousands of threads. Can someone please explain this to me?
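For reference, here is a minimal sketch that reproduces the experiment without S3. It is not my actual upload code: fake_upload and run are hypothetical names, and fake_upload only sleeps to stand in for an upload that mostly waits on the network. It reports how many distinct worker processes handled the parts, so the number can be compared against cpu_count.

```python
import os
import time
from multiprocessing import Pool, cpu_count


def fake_upload(part_number):
    # Hypothetical stand-in for uploading one part: it only sleeps,
    # much like a real upload that spends its time waiting on the network.
    time.sleep(0.2)
    # Return the worker's process ID so the caller can count workers.
    return part_number, os.getpid()


def run(num_processes, num_parts):
    # Start a pool with the requested number of worker processes,
    # even if that number is larger than cpu_count().
    start = time.time()
    with Pool(processes=num_processes) as pool:
        results = pool.map(fake_upload, range(num_parts))
    elapsed = time.time() - start
    worker_pids = {pid for _, pid in results}
    return elapsed, worker_pids


if __name__ == "__main__":
    elapsed, pids = run(num_processes=20, num_parts=20)
    print("cpu_count:", cpu_count())
    print("distinct worker processes:", len(pids))
    print("elapsed seconds:", round(elapsed, 2))
```

Each worker is a separate operating-system process (note the different PIDs), not a thread, and as far as I can tell nothing in Pool ties the number of workers to the number of cores.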
Edit:
I 'borrowed' the code from here. I have changed it only slightly: I ask the user how many cores to use instead of setting parallel_processes to 4.