simple < multithreading < multiprocessing in Python, speed up of the process

Question

With respect to this amazing answers and this blogpost I still have a small question. Namely, the blog post states from the benchmarking that threading will be slower than without threading due to the GIL:

simple    threading   multiprocessing  
threads = 2       4.124      5.539       2.034 
threads = 3       6.391      13.772      3.376  
threads = 4       9.194      17.641      4.720

So threading is even slower than simple execution. This is understood from the behaviour of GIL discussed above and should not surprise us now.

I benchmarked my own function (scrapping the data and writing it to file) in the same manner as in the post. And I have following results:

simple 15 mins, threading: 10 mins, multiprocessing 5 mins.

So, why can threading be faster than simple method without any threading?

EDIT: Small Description of functions

for thread in range(4):
    process = multiprocessing .Process(name=str(thread), target=perform_extraction, args=(ranging[thread],))
   #process = Thread(name=str(thread), target=perform_extraction, args=(ranging[thread],))
    process.start()
    processes.append(process)

for process in processes:
    process.join()

def perform_extraction(ranges):
     thread_name = multiprocessing.current_process().name
     #thread_name = currentThread().getName()
     for page in ranges:
        data = extract_data(page)
        write_data(data, thread_name+'.txt')

Did you run the tests multiple times? How does your function look like? — CristiFati, Nov 22 '17 at 14:08
Yes, several times. I have a function that extracts the data and writes it into the file. For each process/thread I have a specific name and a specific range of pages and each tread/process extracts only the specified pages and writes in the file with only corresponding name. So, 4 different threads/processes get 4 different page ranges and will write into 4 different files. — Alina, Nov 22 '17 at 14:10
Try elliminating the _I/O_ (writing to the file part) and see what happens. — CristiFati, Nov 22 '17 at 14:14
@CristiFati The time is almost the same, please check the edit in the question as I changed how the function looks like. — Alina, Nov 22 '17 at 16:52

simple < multithreading < multiprocessing in Python, speed up of the process

0 Answers0