Got empty result when scraping a record

Question

I wrote a program to scrape attributes from a single record on a web page, but my variables end up empty. Below is what I tried. I cannot see where my logic is wrong.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path='chromedriver.exe')
url = "https://openlibrary.org/works/OL7960560W/Eyewitness?edition=ia%3Acowboy0000murd_y0x0"
global title
driver.get(url)
wait = WebDriverWait(driver,5)
items = wait.until(EC.presence_of_all_elements_located((By.XPATH,'//div[@class="workDetails"]')))
for item in items:
    title = item.find_element(By.CLASS_NAME,'work-title').text

print("title = ",title)

score 2 · Answer 1 · answered Oct 19 '22 at 14:49

There is nothing in the page_source you have saved. You have to wait for some time until the elements are loaded.

# Iterate over the list of elements if there is more than one element

score 2 · Answer 2 · answered Oct 19 '22 at 15:05

If there is more than one element with the same class, make sure you are locating the right element.

Find the list of elements
Iterate over the list
Locate the element you need
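
The three steps above can be sketched without a browser. This is only an illustration: pick_by_text is a hypothetical helper, and the SimpleNamespace objects are stand-ins for the WebElements that a call such as driver.find_elements(By.CLASS_NAME, "work-title") would return in Selenium.

```python
from types import SimpleNamespace


def pick_by_text(elements, wanted):
    """Return the first element whose text contains `wanted`, or None."""
    for element in elements:        # iterate over the list
        if wanted in element.text:  # locate the element you need
            return element
    return None


# Stand-ins for WebElements (assumption: real elements expose a .text attribute).
fake_elements = [
    SimpleNamespace(text=""),
    SimpleNamespace(text="Eyewitness: Cowboy (Eyewitness Books)"),
]

match = pick_by_text(fake_elements, "Cowboy")
print(match.text)
```

Returning None when nothing matches lets the caller decide whether a missing element is an error.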

olpoi

score 1 · Answer 3 · answered Oct 12 '22 at 09:14

There are several issues here:

You are locating the wrong element.
There is only 1 element matching '//div[@class="workDetails"]'.
Also, instead of presence_of_all_elements_located you should use visibility_of_all_elements_located there.
The print("title = ",title) call should be inside the for loop block. Otherwise the value of title is overwritten on each loop iteration and only the last value is finally printed.

The following code works:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("start-maximized")

webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')  # raw string, so the backslashes are not read as escapes
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 10)

url = "https://openlibrary.org/works/OL7960560W/Eyewitness?edition=ia%3Acowboy0000murd_y0x0"

driver.get(url)
titles = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, '.book .title>a')))
for title in titles:
    print(title.text)

The output is:

Eyewitness: Cowboy (Eyewitness Books)
Eyewitness: Horse (Eyewitness Books)
Eyewitness: Goya (Eyewitness Books)

I used CSS Selector, but XPath can be used as well here.
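
The point about printing inside the loop can be checked without a browser. In this small sketch the strings are stand-ins for the .text values of the located elements, and the variable names are made up for the illustration.

```python
# Stand-in strings for the .text of each located element.
sample_titles = [
    "Eyewitness: Cowboy (Eyewitness Books)",
    "Eyewitness: Horse (Eyewitness Books)",
    "Eyewitness: Goya (Eyewitness Books)",
]

# Printing after the loop: the variable is overwritten on every iteration,
# so only the last value survives.
last_only = None
for text in sample_titles:
    last_only = text
print("after loop:", last_only)

# Printing (or collecting) inside the loop keeps every value.
collected = []
for text in sample_titles:
    print("inside loop:", text)
    collected.append(text)
```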

Barry the Platipus · Answer 4 · 2022-10-12T09:33:31.870

Here is a way of locating those elements, a bit more reliably:

[..]
from selenium.webdriver.support.ui import Select
    [...]
wait = WebDriverWait(driver, 20)
url = "https://openlibrary.org/works/OL7960560W/Eyewitness?edition=ia%3Acowboy0000murd_y0x0e"
driver.get(url)

select_editions_number = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//select[@name="editions_length"]'))))
select_editions_number.select_by_visible_text("All")

items = wait.until(EC.presence_of_all_elements_located((By.XPATH,'//table[@id="editions"]//div[@class="title"]/a')))
for i in items:
    print(i.text)

Result in terminal:

Eyewitness: Cowboy (Eyewitness Books)
Eyewitness: Horse (Eyewitness Books)
Eyewitness: Goya (Eyewitness Books)
Eyewitness: Seashore (Eyewitness Books)

user11717481 · Answer 5 · 2022-10-22T11:31:07.010

You can specify an absolute path such as /html/body/div[@class='workDetails'], and use the polling2 module to poll for the titles in <table id="editions" class="editions-table editions-table--progressively-enhanced">:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import polling2
...

# Wait for the work details container; it is a div, so it is not wrapped in Select
items = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[@class='workDetails']")))
titles = polling2.poll(lambda: WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.XPATH,'//table[@id="editions"]//div[@class="title"]/a'))), step=0.5, timeout=7)

I also recommend reviewing the available wait conditions, so that you can choose the most suitable one instead of relying on a short fixed wait or sleep. For example, element_to_be_clickable is an expectation for checking that an element is visible and enabled such that you can click it. An element that is present in the DOM but not visible will not return valid text, even though it can be located without being seen; visibility_of_element_located is an expectation for checking that an element is present on the DOM of a page and visible.
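
To make the idea of a polling wait concrete, here is a rough sketch of the general pattern: call a check repeatedly until it returns something truthy or a timeout expires. It is not the actual implementation of WebDriverWait or polling2; poll_until and fake_find_titles are hypothetical names, and the fake check only imitates a page whose rows appear after a delay.

```python
import time


def poll_until(condition, timeout, step=0.5):
    """Call `condition` every `step` seconds until it returns a truthy value.

    Raises TimeoutError if `timeout` seconds pass without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(step)


# Hypothetical check: empty on the first two calls, then a list of titles.
attempts = {"count": 0}


def fake_find_titles():
    attempts["count"] += 1
    if attempts["count"] >= 3:
        return ["Eyewitness: Cowboy (Eyewitness Books)"]
    return []


titles = poll_until(fake_find_titles, timeout=2, step=0.01)
print(titles)
```

Compared with a fixed sleep, this returns as soon as the check succeeds and fails loudly when it never does.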
