What is the most efficient way to get the same attribute from multiple elements using Python, Selenium, and PhantomJS? My solution uses find_elements_by_css_selector to locate all the elements I need, which takes less than a second; I then loop through the list to get the attribute I require. The loop takes over a minute for around 2500 elements, which seems like a lot to me considering all the elements have already been located by find_elements_by_css_selector. Is get_attribute really that expensive, or am I doing something wrong?

from selenium import webdriver

driver = webdriver.PhantomJS(executable_path=r'mypath\phantomjs.exe')
driver.set_window_size(1120, 550)
driver.get("https://www.something.com")

table = []
elements = driver.find_elements_by_css_selector("tr[id*='bet-']") # takes under 1 second

for element in elements:
    table.append(element.get_attribute('data-info'))  # takes over 60 seconds (2000 elements)

driver.close()
Gorionovic
  • You might get a small speedup by using a list comprehension instead of a `for` loop: `table = [element.get_attribute('data-info') for element in driver.find_elements_by_css_selector("tr[id*='bet-']")]` – Andersson Mar 27 '17 at 13:43
  • Attributes are not stored as properties of the element object, so this amounts to 2000 separate calls to WebDriver. If that takes 60 seconds, I would say it's pretty fast. – Gaurang Shah Mar 27 '17 at 13:46
  • Do all the elements located with your CSS selector have the attribute you want or only some of them? If only some of them do, you can add to your CSS selector to make sure all of them do before looping, e.g. "tr[id*='bet-'][data-info]". – JeffC Mar 27 '17 at 20:05

1 Answer

The problem is that every .get_attribute() call is a separate JSON-over-HTTP wire request to the driver, and that round trip adds a lot of overhead when repeated for a couple of thousand elements.
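To put a rough number on that overhead, here is a back-of-the-envelope calculation using only the figures from the question (about 60 seconds for about 2500 elements). Since the question says "over a minute", treat the result as a lower bound rather than a measurement.

```python
# Figures taken from the question: roughly 60 seconds for roughly 2500 elements.
total_ms = 60 * 1000
element_count = 2500

# Average cost of one get_attribute() round trip, in milliseconds.
per_call_ms = total_ms / element_count
print(f"about {per_call_ms:.0f} ms per call")
```

A couple of dozen milliseconds per request is unremarkable for a single HTTP round trip; it only becomes a problem because it is paid once per element.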

There is no direct way to do "batch get attribute" for multiple elements.

The closest you can probably get is to collect the attributes in JavaScript via execute_script(), which is a single JSON-over-HTTP command:

attributes = driver.execute_script("""
    var result = []; 
    var all = document.querySelectorAll("tr[id*='bet-']"); 
    for (var i=0, max=all.length; i < max; i++) { 
        result.push(all[i].getAttribute('data-info')); 
    } 
    return result;
""")

One downside of this approach is that the attribute retrieval logic is no longer the one Selenium implements for get_attribute(), which, for example, looks for a property with the given name before falling back to the attribute. This may lead to inconsistent results if you mix Selenium-based and JavaScript-based retrieval in the same codebase.
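One way to narrow that gap, at least for locating the elements, is to keep using Selenium to find them and only do the attribute read in JavaScript. A minimal sketch, assuming the elements are still attached to the page when the script runs; `get_attributes_of` and `_ATTR_OF_ELEMENTS_JS` are hypothetical names. It relies on execute_script() accepting a list of WebElement objects as an argument and handing them to the script as DOM nodes.

```python
# Hypothetical helper: the names below are illustrative, not Selenium API.
_ATTR_OF_ELEMENTS_JS = """
    var elements = arguments[0];
    var name = arguments[1];
    var result = [];
    for (var i = 0; i < elements.length; i++) {
        result.push(elements[i].getAttribute(name));
    }
    return result;
"""


def get_attributes_of(driver, elements, attribute_name):
    """Read attribute_name from already-located elements in one wire request.

    The elements come from a normal Selenium locator call, so the locating
    logic stays in one place; only the attribute read happens in JavaScript.
    """
    return driver.execute_script(_ATTR_OF_ELEMENTS_JS, list(elements), attribute_name)
```

The attribute read itself is still plain `getAttribute`, so the caveat about differing from get_attribute() continues to apply; this only keeps the selector logic on the Selenium side.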

alecxe