0

I would like to ask for help. I am trying to scrape the overall rating shown under the main title of every article on https://www.kununu.com/de/volkswagen/kommentare/100, but when I run my code it prints:

4,8
4,8
4,8
4,8
4,8
4,8
4,8
4,8
4,8
4,8
4,8

But the page contains more ratings than just 4,8. So I want to find an element inside each element of a loop over the articles, and I would like to keep this type of loop if possible. Here is my code:

art = driver.find_elements_by_xpath("//article[@class='index__contentBlock__7vKo-']")
for i in art:
    pr = i.find_element_by_xpath("//span[@class='index__score__16yy9']").text
    print(pr)
undetected Selenium
martinkub
  • For documentation purposes, accept the answer that helped you the most or provide feedback on what you expect. – Marios Sep 02 '20 at 11:54

3 Answers

0

You've already gathered all the elements in art.

All you have to do is:

art = driver.find_elements_by_xpath("//article[@class='index__contentBlock__7vKo-']")
for i in art:
    print(i.text)

Let me know if that works.

Kyle
  • But I want to print just i.find_element_by_xpath("//span[@class='index__score__16yy9']").text, not the whole text. – martinkub Aug 27 '20 at 22:29
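
The repeated 4,8 in the question comes from the XPath used inside the loop: an expression that starts with // is evaluated from the document root even when it is called on an element, so every iteration returns the first matching span on the page. Prefixing it with a dot (.//span[...]) makes the search relative to the current article. Selenium needs a live browser, so the sketch below only illustrates the same root-versus-relative distinction offline, using the standard-library xml.etree.ElementTree instead of Selenium; the markup, class names and scores in it are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: three articles, each with its own score span.
PAGE = """
<body>
  <article class="block"><span class="score">2,0</span></article>
  <article class="block"><span class="score">4,5</span></article>
  <article class="block"><span class="score">3,8</span></article>
</body>
"""

root = ET.fromstring(PAGE.strip())
articles = root.findall(".//article[@class='block']")

# Searching from the root inside the loop: every iteration finds the
# same first span, which mirrors the repeated value in the question.
from_root = [root.find(".//span[@class='score']").text for _ in articles]

# Searching relative to each article: one score per article.
per_article = [a.find(".//span[@class='score']").text for a in articles]

print(from_root)    # ['2,0', '2,0', '2,0']
print(per_article)  # ['2,0', '4,5', '3,8']
```

In the loop from the question, the corresponding change is a single leading dot: i.find_element_by_xpath(".//span[@class='index__score__16yy9']"). In Selenium 4 the same lookup is written i.find_element(By.XPATH, ".//span[@class='index__score__16yy9']").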
0

This should print the score of every article, because the XPath selects the index__score span inside each article directly.

art = driver.find_elements_by_xpath("//article[@class='index__contentBlock__7vKo-']//span[@class='index__score__16yy9']")

for i in art:
    print(i.text)
Arundeep Chohan
0

To extract the ratings (e.g. 2,0) using Selenium, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and get_attribute("innerHTML"):

    driver.get('https://www.kununu.com/de/volkswagen/kommentare/100')
    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div[class^='index__ratingBlock'] span[class^='index__score__']")))])
    
  • Using XPATH and text attribute:

    driver.get('https://www.kununu.com/de/volkswagen/kommentare/100')
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[starts-with(@class, 'index__ratingBlock')]//span[starts-with(@class, 'index__score__')]")))])
    
  • Console Output:

    ['2,0', '4,5', '3,8', '4,8', '2,8', '4,7', '3,2', '4,0', '4,9', '4,2']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
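
Both locators above match on the start of the class attribute (class^= in CSS, starts-with() in XPath) rather than on the full name. A plausible reason, stated here as an assumption, is that suffixes such as 16yy9 look auto-generated and may change between builds of the site. The sketch below shows the same prefix-matching idea offline with the standard-library html.parser; the ScoreCollector class and the sample markup are hypothetical and are not part of Selenium.

```python
from html.parser import HTMLParser


class ScoreCollector(HTMLParser):
    """Collect the text of <span> tags whose class starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.scores = []
        self._inside = False  # True while inside a matching <span>

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class") or ""
        if tag == "span" and css_class.startswith(self.prefix):
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False

    def handle_data(self, data):
        if self._inside and data.strip():
            self.scores.append(data.strip())


# Hypothetical markup: same class prefix, different generated suffixes.
SAMPLE = """
<div class="index__ratingBlock__aaa"><span class="index__score__16yy9">2,0</span></div>
<div class="index__ratingBlock__aaa"><span class="index__score__zz999">4,5</span></div>
<span class="index__other__123">ignored</span>
"""

collector = ScoreCollector("index__score__")
collector.feed(SAMPLE)
print(collector.scores)  # ['2,0', '4,5']
```

Matching on the prefix keeps both spans even though their suffixes differ, while the span with an unrelated class is skipped.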
    

Outro

Link to useful documentation:

undetected Selenium