Context: I want to run several browser instances in multiple threads and execute processes in them.

Reason for question: I would like to know the most efficient, least resource-consuming way to check whether an element exists in Python Selenium. I have tried two methods, which I show below along with my limited understanding of each.

First of all, this is the function that returns the driver instance:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

def open_driver():
    chrome_options = webdriver.ChromeOptions()
    # 2 = block notification prompts
    prefs = {"profile.default_content_setting_values.notifications": 2}
    chrome_options.add_experimental_option("prefs", prefs)
    chrome_options.add_argument("start-maximized")
    chrome_options.add_argument('ignore-certificate-errors')
    chrome_options.add_experimental_option('excludeSwitches', ['enable-logging'])

    # Do not block driver.get() while the page is loading
    capa = DesiredCapabilities.CHROME
    capa["pageLoadStrategy"] = "none"

    # `chrome` is the path to the chromedriver executable, defined elsewhere
    driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=chrome, desired_capabilities=capa)

    return driver

Note these two lines in particular:

capa = DesiredCapabilities.CHROME
capa["pageLoadStrategy"] = "none"

From my understanding, this tells Selenium not to wait for the DOM to load completely. This is a tradeoff in performance that I had to make, because this particular page would sometimes get stuck endlessly in document.readyState == interactive.
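For reference, newer Selenium 4 releases set the page load strategy on the options object instead of through desired capabilities. The following is only a sketch, assuming Selenium 4 is installed; build_chrome_options and VALID_STRATEGIES are hypothetical names, not part of Selenium.

```python
VALID_STRATEGIES = {"normal", "eager", "none"}


def build_chrome_options(strategy="none"):
    """Return ChromeOptions with the given page load strategy (hypothetical helper)."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown page load strategy: {strategy!r}")

    # Imported lazily so the validation above works even without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # "normal": wait for the load event.
    # "eager":  wait until document.readyState is "interactive".
    # "none":   do not wait for the page to load at all.
    options.page_load_strategy = strategy
    return options
```

It would be used as webdriver.Chrome(options=build_chrome_options("eager")). Since "eager" returns as soon as the document is interactive, it may be worth trying on a page that never gets past that state.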

So I know of basically two options for checking whether an element exists (I'd appreciate other suggestions too):

  • WebDriverWait(self.driver,self.timeout).until(EC.presence_of_element_located((By.XPATH, element))) which returns a WebElement. Two things about this line:

    • I think it's not respecting the self.timeout time due to capa["pageLoadStrategy"] = "none", but I'm not sure.

    • It's very unstable: sometimes it runs fast, sometimes very slowly.

  • driver.execute_script("document.getElementsByClassName('alert alert-danger ng-binding ng-scope')[0].innerText")

This approach, wrapped in a try: except: block, seems to execute much faster than the one above, but it seems to overload the browser, and the execution then shows errors (error when fetching data from the server) more often.
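One way to keep the wait bounded whatever the page load strategy is to poll find_elements(), which returns an empty list instead of raising when nothing matches. The following is a minimal sketch, not a replacement for WebDriverWait; wait_for_elements and its parameters are hypothetical names, and the locator strategy is passed as the plain string "xpath", which is the value of By.XPATH.

```python
import time


def wait_for_elements(driver, xpath, timeout=10.0, poll=0.25):
    """Poll until at least one element matches `xpath` or `timeout` seconds pass.

    Returns the list of matches, or an empty list if none appeared in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements("xpath", xpath)
        if found:
            return found
        if time.monotonic() >= deadline:
            return []
        # Sleep between attempts so the browser is not flooded with commands.
        time.sleep(poll)
```

An empty result means the element did not appear within timeout. A longer poll interval sends fewer commands to each browser, which may matter when many instances run in parallel.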

With that said, and reiterating that I'm new to this, thank you for taking the time to read my question.

PS: I'm open to suggestions, improvements and especially corrections.

undetected Selenium
  • why must it be in Selenium? If you're optimizing for speed over "browser accuracy" (not even waiting for dom to load), then you could just parse the raw HTTP without browser automation at all? – Kache Feb 03 '22 at 20:30
  • " this particular page would sometimes get stuck endlessly " I don't think the PageLoadStrategy setting will help you at all if this happens. I'm pretty sure Selenium won't be able to interact at all with that page's DOM while it is endlessly loading. Won't matter what the pageloadstrategy is set to. (Though pageloadstrategy of "none" will help you to navigate away to another page... which is really the only thing you can do with such a page.) – pcalkins Feb 03 '22 at 20:37
  • Could you please add a few details regarding the context of your issue? What do you mean by efficient and less consuming: time, memory, CPU? And also about checking the element presence: how many elements would you like to check? What are you going to do with the elements later? I'm asking because both approaches you've mentioned look pretty similar. Expected conditions is the default approach and it's hard to optimise it. You would have to perform thousands of operations to detect any difference. – Max Daroshchanka Feb 03 '22 at 20:40
  • btw, I tend to prefer using "visibilityOfAllElements" expected condition in the wait. That'll return an array of webelements. If that array is 0 size, the webelement is not found. The fastest way will be just findElements()... but that will only work on a site that doesn't use javascript to populate the DOM, or after a .get()... and will also only work if pageloadstrategy is eager or normal. – pcalkins Feb 03 '22 at 20:41

1 Answer

Addressing your concern:


Conclusion

The issue that this "particular page would sometimes get stuck endlessly" would be relatively easier to address. However, bypassing

document.readyState == interactive

and favouring:

capa["pageLoadStrategy"] = "none"

and configuring Selenium WebDriver not to wait for the DOM to load completely is not only a tradeoff in performance but also a barrier: you are forced to use presence_of_element_located() instead of visibility_of_element_located(), which induces chaos and instability.
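The difference between the two conditions can be sketched as follows: presence only requires the element to be in the DOM, while visibility also requires it to be displayed. This is an illustration with hypothetical helper names (is_present, is_visible), not Selenium's own implementation; it assumes only that the driver offers find_elements() and that elements offer is_displayed().

```python
def is_present(driver, xpath):
    """True if at least one element matching `xpath` is in the DOM."""
    return len(driver.find_elements("xpath", xpath)) > 0


def is_visible(driver, xpath):
    """True if the first element matching `xpath` is in the DOM and displayed."""
    elements = driver.find_elements("xpath", xpath)
    return bool(elements) and elements[0].is_displayed()
```

An element that has been added to the DOM but not yet rendered passes the first check and fails the second, which is why a presence-only wait can return before the element is usable.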

undetected Selenium