I'm new to programming. I have searched and read other SO posts about multithreading. My title may be misleading, but I'll try to explain briefly.

I want to understand how to pass two arguments, taken from two different lists, to a function.

For example, suppose I have two lists:

l1 = ['a','b','c']
l2 = ['d','e','f']

I want to loop through these lists and pass the values to a function, but using multithreading.

A normal for loop would look something like this:

def do_something(a, b):
    print(a)
    print(b)

# Pair up the items of l1 and l2 and pass one pair per call
for a, b in zip(l1, l2):
    do_something(a, b)

But I want to use multithreading for this, so that the values from both lists are printed together.

So far I've tried this, but it's not working:

import threading

threads = []
for a, b in zip(l1, l2):
    # Pass the current pair, not the whole lists
    t = threading.Thread(target=do_something, args=(a, b))
    t.start()
    threads.append(t)

# Wait for every thread to finish
for t in threads:
    t.join()

Can someone please explain in a simple way how multithreading works, and how I can achieve the result above?

Thinking more broadly: if I'm comparing two huge Excel files and have to pass the column names from both files to a function, how can I run the function in parallel so that it doesn't wait for the first iteration to finish?

nTlearn
  • You are resetting `threads` to an empty list each time the loop iterates. Put `threads = []` *before* the loop, not in it. – chepner May 16 '22 at 22:41
  • @chepner Ah, rookie mistake. I fixed it and tried again, but it's still doing one iteration at a time, and it didn't even move on to the next one after completing the first. – nTlearn May 16 '22 at 22:47
  • In Python, multithreading only helps with I/O-bound workloads. Unless you are waiting for I/O, the "global interpreter lock" means only one Python thread can execute at a time. The others will wait. – Tim Roberts May 16 '22 at 22:51
  • Generally, use threading for i/o bound tasks and multiprocessing for cpu bound tasks. [Multiprocessing vs Threading Python (duplicate)](https://stackoverflow.com/questions/3044580/multiprocessing-vs-threading-python), [What are the differences between the threading and multiprocessing modules?](https://stackoverflow.com/questions/18114285/what-are-the-differences-between-the-threading-and-multiprocessing-modules) – wwii May 16 '22 at 22:52
  • @TimRoberts Maybe I didn't understand what you said, but I'm not waiting for any I/O. I have the lists ready, and I have to pass each value from them to a function. – nTlearn May 16 '22 at 23:01
  • Right. You don't have any I/O, so multithreading will not help you one bit. It will still take the same amount of time. – Tim Roberts May 16 '22 at 23:30

1 Answer

Thinking more broadly: if I'm comparing two huge Excel files and have to pass the column names from both files to a function, how can I run the function in parallel so that it doesn't wait for the first iteration to finish?

Python does not benefit from threading for CPU-bound operations. For that you would need multiprocessing.

Beware that on Windows it gets very complex very fast, and launching processes is very slow, to the point that unless each task takes multiple seconds to process, it does not help. Besides ProcessPoolExecutor, you also have `Pool.starmap` from the multiprocessing library.
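As a minimal sketch of the `Pool.starmap` route, assuming the two lists from the question and a hypothetical `compare_columns` function standing in for the real comparison (that name and `run_in_processes` are illustrative, not from the original code):

```python
from multiprocessing import Pool

l1 = ['a', 'b', 'c']
l2 = ['d', 'e', 'f']


def compare_columns(col_a, col_b):
    # Hypothetical stand-in for the real CPU-bound comparison.
    return f"{col_a} vs {col_b}"


def run_in_processes(pairs, processes=2):
    # starmap unpacks each (col_a, col_b) tuple into two positional arguments.
    with Pool(processes=processes) as pool:
        return pool.starmap(compare_columns, pairs)


if __name__ == "__main__":
    # The guard matters on Windows, where worker processes re-import this module.
    for line in run_in_processes(list(zip(l1, l2))):
        print(line)
```

`starmap` returns the results in the same order as the input pairs, so they can be matched back to the columns afterwards.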

Zaero Divide
  • What if I'm comparing multiple tables from 2 databases? – nTlearn May 16 '22 at 23:04
  • If you include the querying in the task, then it would definitely help. But _not_ if you split only the post-processing. If you are planning to launch multiple items, check [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor), as it reuses the threads as workers. – Zaero Divide May 16 '22 at 23:09
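
For the I/O-bound case discussed in these comments, where the query is part of the task, a minimal sketch with `ThreadPoolExecutor` might look like this. `fetch_and_compare` is a hypothetical placeholder, and `time.sleep` only simulates waiting on a database:

```python
import time
from concurrent.futures import ThreadPoolExecutor

l1 = ['a', 'b', 'c']
l2 = ['d', 'e', 'f']


def fetch_and_compare(table_a, table_b):
    # Hypothetical placeholder: the sleep stands in for waiting on two queries.
    time.sleep(0.1)
    return (table_a, table_b)


# Executor.map accepts several iterables and pairs their items up like zip().
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(fetch_and_compare, l1, l2))

print(results)
```

The three calls wait at the same time rather than one after another, and `map` hands the results back in input order.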