I am monitoring a number of changing (JS-driven) values on a website and saving them to a database using Selenium. The values are read in infinite loops from elements found with Selenium's find_element.

This works as intended with one process. However, when I try to multiprocess this (to monitor multiple values at the same time), there seems to be no way to do it without opening one separate browser for each process (unfeasible, since we are talking about close to 60 different elements).

The browser I open before multiprocessing does not seem to be available from within the various processes. Even if I find the elements before the multiprocessing step, I cannot pass them to the processes, since WebElements can't be pickled.

Am I doing something wrong, is selenium not the tool for the job, or is there another way?

The code below doesn't actually work; it's just meant to show the structure of what I currently have as a "working version". What I want is to stop opening the browser inside the function and have all my processes rely on a single browser.

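import os
import time
from multiprocessing import Pool

from selenium import webdriver
from selenium.webdriver.common.by import By

def sampling(value_id):
    script_dir = os.path.dirname(__file__)

    # Every worker process opens its own browser here,
    # which is what I want to avoid.
    driver = webdriver.Firefox(script_dir)
    driver.get("https://website.org")

    # value_id stands for the XPath of the element to monitor
    monitored_value = driver.find_element(By.XPATH, value_id)

    while True:
        print(monitored_value.text)
        time.sleep(0.1)

# placeholders for the XPaths of the monitored elements
value_array = [1, 2, 3, 4, 5, 6]

if __name__ == '__main__':
    with Pool(6) as p:
        p.map(sampling, value_array)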
Paul Floyd
Varso
  • Have you tried scrapy? – jaibalaji Jun 03 '20 at 02:05
  • Have you tried multithreading instead? At least you would be able to pass the elements. See [this](https://stackoverflow.com/questions/30808606/can-selenium-use-multi-threading-in-one-browser) or do some more searching, e.g. [this example](https://stackoverflow.com/questions/53976689/selenium-threads-how-to-run-multi-threaded-browser-with-proxy-python) – Pynchia Jun 03 '20 at 02:52
  • Multithreading ended up working fine, thanks! – Varso Jun 05 '20 at 00:34
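
The multithreading approach from the comments above can be sketched roughly as follows. This is only a minimal illustration, not the code Varso ended up with: `FakeElement`, `sample`, and `monitor` are hypothetical names, and `FakeElement` stands in for a real WebElement so the sketch runs without a browser. It assumes all threads share one driver and therefore serializes element reads with a lock, since a single WebDriver session is generally not considered safe for concurrent use.

```python
import threading
import time


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only exposes .text."""

    def __init__(self, text):
        self.text = text


def sample(name, element, driver_lock, stop_event, readings, interval=0.01):
    """Poll element.text until stop_event is set, storing each reading."""
    while not stop_event.is_set():
        # Only one thread talks to the shared browser at a time.
        with driver_lock:
            value = element.text
        readings[name].append(value)
        time.sleep(interval)


def monitor(elements, duration):
    """Run one sampling thread per element for `duration` seconds."""
    driver_lock = threading.Lock()
    stop_event = threading.Event()
    readings = {name: [] for name in elements}
    threads = [
        threading.Thread(
            target=sample,
            args=(name, element, driver_lock, stop_event, readings),
        )
        for name, element in elements.items()
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop_event.set()
    for thread in threads:
        thread.join()
    return readings


if __name__ == "__main__":
    # With Selenium, the values would instead be WebElements found once
    # on a single driver, e.g. driver.find_element(By.XPATH, some_xpath).
    elements = {"a": FakeElement("1.0"), "b": FakeElement("2.0")}
    print(monitor(elements, duration=0.05))
```

Because threads share memory, the elements never need to be pickled, which is the obstacle the multiprocessing version runs into. In a real run, the `readings` lists would be replaced by whatever writes to the database.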

1 Answer

You can check out Selenium's abstract event listeners if you want to capture changes in elements. By implementing a listener you can get rid of the infinite loops. Here is an example that I think could work for you.

from selenium.webdriver.support.events import (
    AbstractEventListener,
    EventFiringWebDriver,
)

class EventListeners(AbstractEventListener):
    def before_change_value_of(self, element, driver):
        # check if this is the element you are looking for
        # do your stuff
        print("element changed!")

# `driver` is your already-created WebDriver instance
driver_with_listeners = EventFiringWebDriver(driver, EventListeners())
# wait as much as you like while your listeners are working
driver_with_listeners.implicitly_wait(20000)

You can also check out this post for a more complete implementation.

Doruk Eren Aktaş