
I have a question about how the Selenium WebDriver behaves under multithreading in a Python script on an AWS EC2 Ubuntu server. Specifically, I'm concerned about the memory used by each thread that runs a headless Selenium browser.

  1. Every time my Python script runs, around 75 threads run simultaneously, each driving a headless Selenium browser. I don't close the browsers with driver.close(). Does that mean each thread is still 'active' and using RAM?

  2. Let's say I run the script once. The logs show 'Thread 1, Thread 2 ..... Thread 75'. The script finishes and I run it again. Instead of seeing 'Thread 1, Thread 2 ..... Thread 75' again, I see 'Thread 76, Thread 77 ..... Thread 150'. Does that mean the past threads were never closed and are still using memory? Basically, do they accumulate over time and eat into RAM?

I'm just afraid of capping out on memory since I need to scale the app efficiently.

alex

1 Answer


If the first run logs Thread 1, Thread 2 ..... Thread 75 and the second run logs Thread 76, Thread 77 ..... Thread 150, it appears that the browsers started by the earlier threads were never closed and continue to consume memory. Browser processes that hold memory for a long time may eventually trigger the OOM Killer against Chrome.

An ideal approach is to always invoke driver.quit() within the tearDown() method to close and destroy the WebDriver and web client instances gracefully at the end of each run.
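Outside a unittest test case, the same guarantee can be had with try/finally inside each thread's worker. The following is a minimal sketch, not the asker's actual code: `get_links` is borrowed from the thread name quoted in the comments, while `make_headless_chrome`, `run_all`, and the `driver_factory` parameter are hypothetical names introduced for illustration. It assumes Chrome and a matching driver are available on the server.

```python
import threading


def make_headless_chrome():
    """Create a headless Chrome driver (hypothetical helper)."""
    # Imported lazily so the threading logic below can be exercised
    # with a stand-in driver on a machine without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def get_links(url, driver_factory=make_headless_chrome):
    """Load one page and always shut the browser down afterwards."""
    driver = driver_factory()
    try:
        driver.get(url)
        return driver.title
    finally:
        # Runs on success and on any exception, so the browser and
        # driver processes are released instead of lingering in RAM.
        driver.quit()


def run_all(urls, driver_factory=make_headless_chrome):
    """Start one thread per URL and wait for all of them to finish."""
    threads = [
        threading.Thread(target=get_links, args=(url, driver_factory))
        for url in urls
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Calling `run_all(list_of_urls)` would then start one browser per URL and quit each one even when `driver.get()` raises. Putting `driver.quit()` only in an `except` block, by contrast, skips the cleanup on the successful path.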


References

You can find a couple of relevant detailed discussions in:

undetected Selenium
  • Lovely. Thanks for the info. I'm still curious to know what you mean by using the `tearDown(){}` method. I've never heard of/used it. Would just quitting the driver work just as well? – alex Feb 25 '23 at 18:15
  • _`tearDown()`_ is a default method in the _Python unittest_ framework :) Yep, _quitting the driver_ would suffice. – undetected Selenium Feb 25 '23 at 18:19
  • Gotchu. Would you know if there is any official way to double-check whether the past threads are closed or not? Like maybe checking all the processes or something? So we can find out if they're impacting the RAM? I ask because I just tried the driver.quit() method, but despite that the thread numbers keep increasing. – alex Feb 25 '23 at 18:25
  • Check out the updated answer with the reference discussions. – undetected Selenium Feb 25 '23 at 18:29
  • I researched both the links. I think this is what's going on. Every now and then I see my threads in the console saying `Thread 'Thread-135 (get_links)': missing ScriptRunContext`, which I'm assuming is a type of exception. So I put the driver.quit() statement in the except block, with get(URL) in the try: block. I also made sure to daemonize my threads. However, the thread numbers are still increasing. – alex Feb 25 '23 at 19:21
  • I looked at this :https://stackoverflow.com/a/38536238/19434920 – alex Feb 25 '23 at 19:22
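On the follow-up question of how to check whether earlier threads are still alive: Python's default thread names (`Thread-1`, `Thread-2`, ...) come from a counter that is never reset within a process, so rising numbers alone do not show that old threads are still running. A minimal sketch using only the standard library follows; `live_worker_threads` and `run_batch` are hypothetical helper names, and the no-op worker stands in for the real browser task.

```python
import threading


def live_worker_threads():
    """Return the names of threads still running, excluding the main thread."""
    main = threading.main_thread()
    return [t.name for t in threading.enumerate() if t is not main]


def run_batch(n):
    """Start n short-lived threads, wait for them, and return their names."""
    threads = [threading.Thread(target=lambda: None) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [t.name for t in threads]


first = run_batch(3)
second = run_batch(3)

# The second batch gets new, higher-numbered names even though every
# thread from the first batch has already finished.
print("first batch: ", first)
print("second batch:", second)
print("still alive: ", live_worker_threads())
```

This only reports Python threads. Browser and driver processes left behind by a missing `driver.quit()` are separate operating-system processes and would have to be checked with a process listing on the server.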