I am using the Selenium Firefox webdriver to do some scraping, like below:
```python
import time
from selenium import webdriver

def scrape(url):
    options = webdriver.FirefoxOptions()
    options.add_argument('--headless')
    browser = webdriver.Firefox(options=options)
    browser.get(url)
    # do complex reading, element clicking, and moving to and from pages
    browser.quit()
    time.sleep(5)
```
The above works well when it's just one process (no parallel workers): Firefox's memory consumption is stable and is cleared periodically after loading a lot of data during scraping.
However, once I run the function in parallel with joblib's `Parallel`, there seems to be a memory leak:

```python
Parallel(n_jobs=-1)(delayed(scrape)(link) for link in links)
```
Adding a `time.sleep()` call after `browser.quit()` seems to help slightly, but not by much. I also noticed that the fewer parallel jobs there are, the less severe the leak is.
I have added more `time.sleep()` calls throughout the code, in case the code is too computationally intensive and `browser.quit()` is not getting a chance to release memory. But I still have the problem compared to running a single job.
Ultimately I want all Firefox memory to be released before a new job starts another webdriver session. That does not seem to be happening. Why is that, and how can it be fixed?