0

I wrote a program to scrape attributes from a single record on a web page, but my variables end up empty. Below is what I tried. I can't work out where my logic is wrong.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path='chromedriver.exe')
url = "https://openlibrary.org/works/OL7960560W/Eyewitness?edition=ia%3Acowboy0000murd_y0x0"
global title
driver.get(url)
wait = WebDriverWait(driver,5)
items = wait.until(EC.presence_of_all_elements_located((By.XPATH,'//div[@class="workDetails"]')))
for item in items:
    title = item.find_element(By.CLASS_NAME,'work-title').text

print("title = ",title)
Prophet
lokp

5 Answers

2

There is nothing in the page_source you have saved. You have to wait for some time before reading it.

# Iterate over the list of elements if there is more than one element
2

If there is more than one element with the same class, make sure you are locating the right one.

  1. Find the list of elements.
  2. Iterate over the list.
  3. Locate the element you need.
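
The three steps above can be sketched as follows. This is only an illustration, not the answerer's code: collect_texts and scrape_work_titles are hypothetical helper names, the XPath and class name are taken from the question without checking that they match the live page, and webdriver.Chrome() with no arguments assumes a Selenium 4 installation that can find a Chrome driver on its own.

```python
def collect_texts(containers, child_locator):
    """Return the non-empty text of every matching child in each container.

    `containers` is a list of already-located elements and `child_locator`
    is a (by, value) pair such as (By.CLASS_NAME, "work-title").
    """
    texts = []
    for container in containers:
        # find_elements returns an empty list instead of raising when nothing matches
        for child in container.find_elements(*child_locator):
            text = child.text.strip()
            if text:
                texts.append(text)
    return texts


def scrape_work_titles(url, timeout=10):
    """Open `url` in Chrome and return the texts found by collect_texts."""
    # Imported here so collect_texts can be tried without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Step 1: find the list of candidate containers (locator from the question)
        containers = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.XPATH, '//div[@class="workDetails"]'))
        )
        # Steps 2 and 3: iterate and locate the required child in each container
        return collect_texts(containers, (By.CLASS_NAME, "work-title"))
    finally:
        driver.quit()
```

Calling scrape_work_titles with the URL from the question would return a list of strings. An empty list means the locators matched nothing, which is easier to debug than a variable that was never assigned.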
olpoi
1

There are several issues here:

  1. You are locating the wrong element.
    There is only 1 element matching '//div[@class="workDetails"]'.
  2. Also, instead of presence_of_all_elements_located you should use visibility_of_all_elements_located there.
  3. The print("title = ",title) call should be inside the for loop block. Otherwise title is overwritten on each loop iteration and only the last value is printed.
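
Point 3 can be seen without a browser. The snippet below is a plain-Python illustration in which the strings in items are made-up stand-ins for located elements:

```python
items = ["first", "second", "third"]  # stand-ins for located elements

# Printing after the loop: title holds only the last assignment.
for item in items:
    title = item
print("title =", title)

# Printing inside the loop: every value is shown as it is assigned.
for item in items:
    title = item
    print("title =", title)
```

The first loop prints a single line for the last item, while the second prints one line per item.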

The following code works:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("start-maximized")

webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')  # raw string so the backslashes are not read as escapes
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 10)

url = "https://openlibrary.org/works/OL7960560W/Eyewitness?edition=ia%3Acowboy0000murd_y0x0"

driver.get(url)
titles = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, '.book .title>a')))
for title in titles:
    print(title.text)

The output is:

Eyewitness: Cowboy (Eyewitness Books)
Eyewitness: Horse (Eyewitness Books)
Eyewitness: Goya (Eyewitness Books)

I used CSS Selector, but XPath can be used as well here.
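
As a rough sketch of what an XPath counterpart of '.book .title>a' could look like: a CSS class selector matches one token inside the class attribute, so a plain @class="title" comparison is stricter than the CSS version. The has_class helper below is a hypothetical name introduced here for illustration, and the resulting expression has not been run against the live page.

```python
def has_class(name):
    """Build an XPath predicate that matches `name` as one whitespace-separated class token."""
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"


# '.book .title>a': an <a> that is a direct child of a .title element,
# which is itself somewhere inside a .book element.
xpath = f"//*[{has_class('book')}]//*[{has_class('title')}]/a"
print(xpath)
```

The string can then be passed as (By.XPATH, xpath) in place of the (By.CSS_SELECTOR, ...) locator.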

Prophet
1

Here is a way of locating those elements, a bit more reliably:

[..]
from selenium.webdriver.support.ui import Select
    [...]
wait = WebDriverWait(driver, 20)
url = "https://openlibrary.org/works/OL7960560W/Eyewitness?edition=ia%3Acowboy0000murd_y0x0e"
driver.get(url)

select_editions_number = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//select[@name="editions_length"]'))))
select_editions_number.select_by_visible_text("All")

items = wait.until(EC.presence_of_all_elements_located((By.XPATH,'//table[@id="editions"]//div[@class="title"]/a')))
for i in items:
    print(i.text)

Result in terminal:

Eyewitness: Cowboy (Eyewitness Books)
Eyewitness: Horse (Eyewitness Books)
Eyewitness: Goya (Eyewitness Books)
Eyewitness: Seashore (Eyewitness Books)
Barry the Platipus
1

You can specify an absolute path such as /html/body/div[@class='workDetails'], and use the polling2 module to request the titles from <table id="editions" class="editions-table editions-table--progressively-enhanced">:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import polling2
...

# Select only wraps <select> elements, so the div is waited for directly
items = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[@class='workDetails']")))
titles = polling2.poll(lambda: WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.XPATH,'//table[@id="editions"]//div[@class="title"]/a'))), step=0.5, timeout=7)

I also recommend reviewing the available wait conditions, so that you can pick the one that fits and avoid long fixed waits or sleeps. For example, element_to_be_clickable is an expectation that an element is visible and enabled, so that you can click it. visibility_of_element_located is an expectation that an element is present in the DOM of a page and visible. If you only need to read an element from the DOM without it being visible, presence_of_element_located is enough.
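
To make the difference between waiting for presence and waiting for visibility concrete, here is a simplified, browser-free sketch of the polling idea behind an explicit wait. It is not Selenium's implementation: wait_until and FakeElement are hypothetical names, and the real WebDriverWait does more (for example, it ignores NoSuchElementException while polling).

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakeElement:
    """Hypothetical element that exists at once but becomes visible only after a few checks."""

    def __init__(self, visible_after):
        self.checks = 0
        self.visible_after = visible_after

    def is_displayed(self):
        self.checks += 1
        return self.checks >= self.visible_after


element = FakeElement(visible_after=3)

# "Presence" style: the element exists, so the wait returns on the first poll.
present = wait_until(lambda: element, timeout=1.0, poll_interval=0.01)

# "Visibility" style: keep polling until is_displayed() reports True.
visible = wait_until(
    lambda: element if element.is_displayed() else None,
    timeout=1.0,
    poll_interval=0.01,
)
print("visibility checks needed:", element.checks)
```

The presence-style wait returns immediately, while the visibility-style wait keeps polling until the element reports itself as displayed, which is why the stricter condition can take longer or time out on elements that are in the DOM but hidden.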

user11717481