
I have some simple code that uses multithreading. It works fine, but I've noticed that the threads don't seem to be destroyed after returning their values: every time the script runs, the thread number in the console goes up, and the RAM in use also goes up after the script is done processing (which implies that something was left running after the script finished).

After researching this, this, this and this, I suspect that my threads aren't joining (?), since my script never prints "Threads destroyed". Can anyone suggest what could be going wrong?

if __name__ == "__main__":
def run_selenium1(a, b, c, d, e):
    
    @st.cache_data(show_spinner=False)
    def get_links(i, resumeContent):
        #stufff happens
            for something1, something2, something3, something4, something5, something6, something7 in zip(Final_Something1, Final_Something2, Final_Something3, Final_Something4, Final_Something5, Final_Something6, Final_Something7):
                Final_Array.append((something1, something2, something3, something4, something5, something6, something7))
            driver.close()
            driver.quit()
        except:
            driver.close()
            driver.quit()


    with webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options) as driver:
        try:
           #links are obtained
        except:
            driver.close()
            driver.quit()

    threads = []
    for i in links:
        t = threading.Thread(target=get_links, args=(i, Content))
        t.daemon = True
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
        print("Threads destroyed") #<---- this isn't printed

Edit: After Eureka's answer I get this:

    Starting thread 0
    Starting thread 1
    Starting thread 2
    Starting thread 3
    Starting thread 4
    Starting thread 5
    Starting thread 6
    Starting thread 7
    Starting thread 8
    Starting thread 9
    Starting thread 10
    Starting thread 11
    Starting thread 12
    Starting thread 13
    Starting thread 14
    Starting thread 15
    Starting thread 16
    Starting thread 17
    Starting thread 18
    Starting thread 19
    Starting thread 20
    Starting thread 21
    Starting thread 22
    Starting thread 23
    Starting thread 24
    Total number of threads was 25
    Trying to Join thread # 0
    Joined thread # 0
    Trying to Join thread # 1
    Joined thread # 1
    Trying to Join thread # 2
    Joined thread # 2
    Trying to Join thread # 3
    Joined thread # 3
    Trying to Join thread # 4
    Joined thread # 4
    Trying to Join thread # 5
    Joined thread # 5
    Trying to Join thread # 6
    Joined thread # 6
    Trying to Join thread # 7
    Joined thread # 7
    Trying to Join thread # 8
    Joined thread # 8
    Trying to Join thread # 9
    Joined thread # 9
    Trying to Join thread # 10
    Joined thread # 10
    Trying to Join thread # 11
    Joined thread # 11
    Trying to Join thread # 12
    Joined thread # 12
    Trying to Join thread # 13
    Joined thread # 13
    Trying to Join thread # 14
    Joined thread # 14
    Trying to Join thread # 15
    Joined thread # 15
    Trying to Join thread # 16
    Joined thread # 16
    Trying to Join thread # 17
    Joined thread # 17
    Trying to Join thread # 18
    Joined thread # 18
    Trying to Join thread # 19
    Joined thread # 19
    Trying to Join thread # 20
    Joined thread # 20
    Trying to Join thread # 21
    Joined thread # 21
    Trying to Join thread # 22
    Joined thread # 22
    Trying to Join thread # 23
    Joined thread # 23
    Trying to Join thread # 24
    Joined thread # 24
    All threads have now been joined

                                                                                                                                                    
alex
  • Why do you set the daemon flag on threads that your main thread `join()`s? It creates the illusion that you aren't quite sure how the threads are supposed to terminate. – Solomon Slow Feb 26 '23 at 02:38
  • @SolomonSlow Not sure what you mean by that. Are you saying I shouldn't daemonize them? I did that so they are destroyed once the main thread finishes. I might've misunderstood how daemonizing works :/ – alex Feb 26 '23 at 02:48
  • The only reason for setting `t.daemon=True` is to let the thread be automatically killed at the end of the program. (Where, "end of the program" means the death of the last non-daemon thread.) But any thread that you `join()` from the main thread cannot live long enough to be automatically killed because it has to die before the `join()` call can return, and the automatic killing of daemon threads cannot happen before the main thread ends. – Solomon Slow Feb 26 '23 at 03:06
  • @SolomonSlow Gotchu. So are you suggesting that removing `daemon` will let the threads die and hence keep them from persisting in memory? – alex Feb 26 '23 at 03:08
  • I'm suggesting that you remove `t.daemon=True` because it serves no purpose in your program. Removing it will _not_ change how your program behaves. As a rule of thumb, removing lines of code that serve no purpose in a program _always_ is a good idea. It makes your program that much smaller (that much easier to read and understand,) and it biases other programmers toward belief that you understand what you are doing. – Solomon Slow Feb 26 '23 at 03:13
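
The point made in the comments above can be checked with a small, self-contained sketch. It assumes a placeholder worker (`work` and `run_and_join` are hypothetical names, not from the question) standing in for `get_links`. Once `join()` has returned, the thread has already finished, so the daemon flag makes no observable difference:

```python
import threading
import time


def work(results, idx):
    # Hypothetical stand-in for get_links: do a little "work" and record a value.
    time.sleep(0.01)
    results.append(idx)


def run_and_join(daemon):
    """Start five threads with the given daemon flag, join them all,
    and report the collected results plus each thread's liveness."""
    results = []
    threads = [
        threading.Thread(target=work, args=(results, i), daemon=daemon)
        for i in range(5)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # returns only after the thread has finished
    alive = [t.is_alive() for t in threads]
    return sorted(results), alive


if __name__ == "__main__":
    print(run_and_join(daemon=True))
    print(run_and_join(daemon=False))
```

Both calls should report the same results and no live threads, which is why dropping `t.daemon = True` is not expected to change the behavior of code that joins every thread.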

1 Answer


Perhaps it is not printing "Threads destroyed" because the (first) thread is not finishing?

To test this, try adding notifications that the threads are finishing:

if __name__ == "__main__":
def run_selenium1(a, b, c, d, e):
    
    @st.cache_data(show_spinner=False)
    def get_links(iterator, i, resumeContent):
        #stufff happens
            for something1, something2, something3, something4, something5, something6, something7 in zip(Final_Something1, Final_Something2, Final_Something3, Final_Something4, Final_Something5, Final_Something6, Final_Something7):
                Final_Array.append((something1, something2, something3, something4, something5, something6, something7))
            print("About to close ",iterator)
            driver.close()
            driver.quit()
            print("Closed and quit ",iterator)
        except:
            print("Error on ",iterator)
            driver.close()
            driver.quit()
            print("Error closed and quit ",iterator)


    with webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options) as driver:
        try:
           #links are obtained
        except:
            driver.close()
            driver.quit()

    threads = []
    for iterator,i in enumerate(links):
        t = threading.Thread(target=get_links, args=(iterator,i, Content))
        t.daemon = True
        threads.append(t)
        print("Starting thread",iterator)
        t.start()

    print("Total number of threads was ",len(threads))

    for i,t in enumerate(threads):
        print("Trying to Join thread #",i)
        t.join()
        print("Joined thread #",i) 

    print("All threads have now been joined")

    threads = []
    t = None
ProfDFrancis
  • I updated the question with details on what happened after I tried your suggestion. – alex Feb 25 '23 at 22:10
  • I'm not sure if the first thread is being finished or not, but the RAM still went up so I'm assuming the threads are still accumulating :/ – alex Feb 25 '23 at 22:12
  • Sorry I kept thinking that "i" was an iterator or index variable, so I caused you to print out some difficult-to-understand messages. Can you please try with the revised messages? They will make it clearer how many threads are created, when they end, and which one is being joined. – ProfDFrancis Feb 25 '23 at 22:20
  • You're good. Attempting right now. – alex Feb 25 '23 at 22:21
  • I tried the new code and I'm getting a similar log. It doesn't `print("All threads have now been joined")` or `print("Total number of threads was ",len(threads)` at all :/ – alex Feb 25 '23 at 22:34
  • I'm seeing the memory go up, so we can be sure that the threads are lingering around. And the console is a witness to the fact that the total number of threads goes up with every subsequent script run. Any idea how we can just clear the old threads so RAM isn't affected? That's the only thing I'm worried about, honestly. – alex Feb 25 '23 at 22:35
  • How is it possible for `print("Total number of threads was ",len(threads)` to not run, and yet it get to "Trying to join..."? Can you paste the log you are getting now? It should now show iterator numbers for each thread, instead of URLs, which should be easier to interpret. – ProfDFrancis Feb 25 '23 at 23:03
  • I was wondering the same thing. I'm also in tmux which is making it hard to go through everything properly. I'm trying again right now – alex Feb 25 '23 at 23:06
  • One final long shot. I am just wondering whether retaining a reference to the thread somehow stops it releasing its memory? To eliminate that possibility, try clearing the `threads` array at the end. I have added `threads = []` to the end of the code above. – ProfDFrancis Feb 25 '23 at 23:13
  • Gotchu. I just tried it properly and I just put in the question what I get from it. What do you think is happening? I tried this with the new `threads = []` snippet btw. – alex Feb 25 '23 at 23:17
  • That is strange. The threads are all joined, but you never saw a message that any thread was _about to close_ or had _closed_? Are you sure those messages are in your code? – ProfDFrancis Feb 26 '23 at 00:07
  • I do yeah. I just tried several times again but they're not printing. I'm still seeing increase in memory after trying out the threads =[] method btw. Any idea what could be going wrong? – alex Feb 26 '23 at 00:25
  • Yes, add extra messages in the "#stufff happens" section, to see how far the code gets, before hanging. – ProfDFrancis Feb 26 '23 at 00:30
  • Just tried it by correcting everything. It actually goes through all of them. Starts with `Starting thread 0` till 24. Then tries to close all of them and joins all of them and then in the end says `Joined thread # 24` and `All threads have now been joined`. So the code is working as expected but the number of threads is still going up and the old ones aren't clearing. The RAM is also going up. I feel like it's something else that's wrong. What are your thoughts? – alex Feb 26 '23 at 03:03
  • So I just found out something. When I use `threading.enumerate()` to list the number of active threads, I don't actually see a lot of threads. I do see some other threads. Do you know how I can kill those 6 threads that are listed? What's interesting is that none of the threads listed in `threading.enumerate()` are from multithreading, they were started before multithreading is even kicked off. – alex Feb 26 '23 at 20:53
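
For anyone repeating the `threading.enumerate()` check from the last comment, here is a minimal sketch of how the live threads can be listed and filtered. The helper names and the `link-worker-` name prefix are assumptions made for illustration; the question's code does not name its threads:

```python
import threading


def describe_threads():
    """Return (name, daemon, is_alive) for every thread Python currently tracks."""
    return [(t.name, t.daemon, t.is_alive()) for t in threading.enumerate()]


def workers_still_alive(prefix):
    """Return the names of live threads whose name starts with `prefix`."""
    return [t.name for t in threading.enumerate() if t.name.startswith(prefix)]


def start_and_join_named_workers(count, prefix="link-worker-"):
    # Naming the threads (hypothetical prefix) makes them easy to tell apart
    # from threads that some other library started earlier.
    threads = [
        threading.Thread(target=lambda: None, name=f"{prefix}{i}")
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return workers_still_alive(prefix)


if __name__ == "__main__":
    print("Workers left after join:", start_and_join_named_workers(3))
    for name, daemon, alive in describe_threads():
        print(name, "daemon" if daemon else "non-daemon", "alive" if alive else "dead")
```

If the list of leftover workers is empty, whatever still shows up in `threading.enumerate()` was started elsewhere. Note that the `threading` module has no method for killing a thread from outside; a thread only ends when its target function returns or raises.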