Python Selenium ChromeDriver processes exceed the system PID max, how to kill them?

Question

I'm using the Selenium Chrome webdriver to crawl web pages one by one. For each page I initialize a new driver instance and close it once crawling has finished. After several runs there are almost 10000 chrome processes on the system, and they can't be killed with the kill command. How can I handle this problem? Thanks~

The code is as follows:

# Module-level imports needed by this method.
from bs4 import BeautifulSoup
from selenium import webdriver

@classmethod
def get_content_by_selenium(cls, url):
    content = ''
    # Reset first, so the finally block never sees a missing or stale driver
    # when webdriver.Chrome() itself fails.
    cls.driver = None
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--start-maximized')
    chrome_options.add_argument('--disable-extensions')
    chrome_options.add_argument('--disable-infobars')
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--no-proxy-server')
    chrome_options.add_argument('--disable-dev-shm-usage')
    try:
        # executable_path is optional; if not specified, the PATH is searched.
        cls.driver = webdriver.Chrome(options=chrome_options, executable_path='/home/chromedriver')
        cls.driver.set_page_load_timeout(30)
        cls.driver.get(url)
        html = cls.driver.page_source
        soup = BeautifulSoup(html, 'html.parser')
        # Drop script and style elements so only visible text remains.
        for script in soup(["script", "style"]):
            script.extract()
        meta = cls.get_meta(soup)
        text = ' '.join(soup.text.split())
        content = ' '.join([meta, text])
    except Exception as e:
        print(e)
        print('webdriver failed, continue running')
    finally:
        if cls.driver is not None:
            cls.driver.quit()
    # Returning outside finally avoids silently swallowing exceptions
    # such as KeyboardInterrupt.
    return content
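
One way to make the "one driver per page, always quit" pattern harder to get wrong is to tie the driver's lifetime to a context manager. The following is only a minimal sketch, not the code from the question: `managed_driver`, `crawl` and `FakeDriver` are hypothetical names, and `FakeDriver` is a stand-in so the snippet runs without a browser. With Selenium, the factory would presumably be something like `lambda: webdriver.Chrome(options=chrome_options)`.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() on exit.

    `factory` is any zero-argument callable returning an object with a
    quit() method (hypothetical helper, not part of Selenium).
    """
    driver = factory()  # if this raises, there is nothing to clean up
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Stand-in for a real webdriver so the sketch runs without Chrome."""

    def __init__(self):
        self.closed = False

    def get(self, url):
        # Simulate a page load that fails, to exercise the cleanup path.
        raise RuntimeError("simulated page load failure for " + url)

    def quit(self):
        self.closed = True


def crawl(url, factory):
    """Try to load url once and return the driver that was used."""
    used = None
    try:
        with managed_driver(factory) as driver:
            used = driver
            driver.get(url)
    except Exception as e:
        print(e)
    return used
```

Calling `crawl("http://example.com", FakeDriver)` returns a driver whose `closed` flag is set, even though `get()` raised, because `quit()` runs in the context manager's `finally` block.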

What is cls.get_meta? Does the process open another browser? Since you do use .quit(), I will delete my original answer. — Yuan, Aug 01 '19 at 03:42
First, get_meta is another class method. My process calls get_content_by_selenium many times, so there are many chrome instances. As you can see, I use quit() inside finally, so it will execute in any case. — Alexander, Aug 01 '19 at 08:47
I cannot reproduce your error; on my computer the code quits the browser properly. — Yuan, Aug 01 '19 at 08:48
I do not think there would be any adverse effects. I am running a similar project on my Linux server. Each time I run .quit(), all memory is released. I recommend adding a print after `cls.driver.quit()` to confirm that the browser is really closed. There must be an error somewhere else in your code. — Yuan, Aug 01 '19 at 11:03
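
The check suggested in the last comment can go one step further than a print: on Linux, the numeric directories under /proc can be scanned to count how many processes with a given name are still alive after quit(). This is a minimal sketch assuming a Linux-style /proc layout. `count_processes` is a hypothetical helper, and its `proc_root` parameter exists only so the function can be tried against a test directory.

```python
import os


def count_processes(name, proc_root="/proc"):
    """Count processes whose command name contains `name`."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric directories are PIDs
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:  # process exited while scanning, or not readable
            continue
        if name in comm:
            count += 1
    return count
```

Printing `count_processes("chrome")` right after `cls.driver.quit()` on each call would show whether the number of leftover processes grows from one crawl to the next. Note that the substring match also counts chromedriver processes.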
