I have to scrape all the event details from the AXS.com website as part of my web scraping assignment. I tried using the Chrome WebDriver with Python and Selenium.

I am able to get a value by using driver.find_element_by_class_name(), e.g. driver.find_element_by_class_name("headliner").text.

But this gets only the first item. I got stuck when I tried iterating over the results of driver.find_elements(By.XPATH, "//div[@class='results-table results-table--events']").

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By  # needed for By.XPATH below
import time
driver = webdriver.Chrome('/home/.../chromedriver_linux64/chromedriver')
driver.get("https://www.axs.com/browse/music/alternative-punk")
driver.implicitly_wait(10)
allevent_details = driver.find_elements(By.XPATH, "//div[@class='results-table results-table--events']")
for i in allevent_details:
    print(i.find_element_by_class_name("headliner").text)

Error:

NoSuchElementException: no such element: Unable to locate element: {"method":"class name","selector":"headliner"}
(Session info: chrome=74.0.3729.169)
(Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Linux 4.15.0-50-generic x86_64)

Expected:

  • Inner Wave
  • BLOXX ... etc.
undetected Selenium
RAMASWAMY M

3 Answers

Change the logic as below, so that the XPath selects the headliner elements directly rather than their container:

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By  # needed for By.XPATH below
import time
driver = webdriver.Chrome('/home/.../chromedriver_linux64/chromedriver')
driver.get("https://www.axs.com/browse/music/alternative-punk")
driver.implicitly_wait(10)
# Match every headliner div nested anywhere inside the results table
allevent_details = driver.find_elements(By.XPATH, "//div[@class='results-table results-table--events']//div[@class='headliner']")
for i in allevent_details:
    print(i.text)
supputuri

Try either of the following locators.

Using XPath

allevent_details = driver.find_elements(By.XPATH,"//div[@class='results-table results-table--events']")
for i in allevent_details:
    print(i.find_element_by_xpath(".//div[@class='headliner']").text)

Using CSS selector

for item in driver.find_elements_by_css_selector('.headliner'):
    print(item.text)
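
The per-container lookup can be tried without a browser. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree in place of Selenium, on a made-up HTML fragment (the markup, the "supporting" class and the function name headliners_per_container are assumptions for illustration, not the real AXS page). It shows the leading "." in .//div[@class='headliner'] restricting the search to the current container, and a plural find call returning an empty list for a container that has no match.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page markup (not the real AXS HTML).
HTML = """
<body>
  <div class="results-table results-table--events">
    <div class="headliner">Inner Wave</div>
    <div class="headliner">BLOXX</div>
  </div>
  <div class="results-table results-table--events">
    <div class="supporting">no headliner in this container</div>
  </div>
</body>
""".strip()


def headliners_per_container(markup):
    """Collect headliner texts by searching inside each results container."""
    root = ET.fromstring(markup)
    containers = root.findall(".//div[@class='results-table results-table--events']")
    names = []
    for container in containers:
        # The leading "." keeps the search relative to this container.
        # findall returns [] when nothing matches, so the second
        # container is skipped instead of raising an error.
        for node in container.findall(".//div[@class='headliner']"):
            names.append(node.text)
    return names


print(headliners_per_container(HTML))
```

Selenium behaves analogously: find_elements returns an empty list when nothing matches, whereas find_element raises NoSuchElementException, so a single container without a headliner is enough to abort a loop that uses the singular form.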
KunduK

To extract all the event headlines from the webpage, you need to use WebDriverWait with the visibility_of_all_elements_located() expected condition, and you can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.headliner")))])
    
  • Using XPATH:

    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='headliner']")))])
    
  • Console Output:

    ['Inner Wave', 'BLOXX, Hembree and Warbly Jets', 'Frenship', 'LANY', 'together PANGEA & Vundabar', 'Night Beats', 'New Politics', 'The Technicolors', 'Davila 666', 'Vansire + BOYO', 'The Starting Line', 'Katzù Oso', 'The Raconteurs', 'Cayucas', 'ALT 98.7 Summer Camp']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
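
To make the waiting behaviour concrete: an explicit wait is essentially a polling loop. The sketch below is a plain-Python illustration, not Selenium's implementation; wait_until, FakePage and the timings are hypothetical names and values. It retries a condition until it returns something truthy or a deadline passes, which is roughly what WebDriverWait(driver, 5).until(...) does with an expected condition (Selenium polls every 0.5 seconds by default and raises its own TimeoutException).

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose results appear on the third check."""

    def __init__(self):
        self.checks = 0

    def headliners(self):
        self.checks += 1
        return ["Inner Wave", "BLOXX"] if self.checks >= 3 else []


page = FakePage()
names = wait_until(page.headliners, timeout=2.0, poll_interval=0.01)
print(names)
```

The design point is that the caller gets the elements as soon as they are available, instead of sleeping for a fixed time that is either too short or wastefully long.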
    
undetected Selenium