I'm learning the very basics of web scraping by following Chapter 12 of Automate the Boring Stuff with Python, but I'm having an issue with the find_element() method. When I use the method to look for an element with the class name 'card-img-top cover-thumb', it doesn't return any matches. However, the code does work for URLs other than the example in the book.
I have had to make quite a few changes to the code as written in order to get it to do anything. I've posted the full code on GitHub HERE, but to summarise:
The book says to use the 'find_element_by_*' methods, but these were producing deprecation messages that directed me to use find_element() instead.
To use this other method, I import 'By'.
I also import 'Service' from 'selenium.webdriver.chrome.service' because ChromeDriver doesn't work otherwise.
I also define options with webdriver.ChromeOptions() to hide certain error messages about a faulty device, which apparently you're just supposed to ignore?
I put the code from the book into a function with 'URL' and 'CLASSNAME' arguments so I can test different URLs without having to edit the code repeatedly.
Here is the 'business-part' of the code:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
s = Service(r'C:\Users\antse\AppData\Local\Chrome_WebDriver\chromedriver.exe')
op = webdriver.ChromeOptions()
op.add_experimental_option('excludeSwitches', ['enable-logging'])
def FNC_GET_CLASS_ELEMENT_FROM_PAGE(URL, CLASSNAME):
    browser = webdriver.Chrome(service=s, options=op)
    browser.get(URL)
    try:
        elem = browser.find_element(By.CLASS_NAME, CLASSNAME)
        print('Found <%s> element with that class name!' % (elem.tag_name))
    except:
        print('Was not able to find an element with that name.')
FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top cover-thumb')
Expected output: Found <img> element with that class name!
Since the code does work when I look at a site like Wikipedia, I wonder if there have been changes to the HTML of the page that prevent the scrape from working properly?
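Another possibility I have not ruled out is the space in the class name. As far as I understand, By.CLASS_NAME expects a single class name, while 'card-img-top cover-thumb' is really two classes on the same element. Below is a minimal sketch of a workaround I could try, assuming that is the cause. The helper name class_names_to_css is my own invention, not something from the book or from Selenium. It only builds a CSS selector string, so it runs without a browser.

```python
def class_names_to_css(class_names):
    """Turn space-separated class names ('a b') into a CSS selector ('.a.b')."""
    parts = class_names.split()
    if not parts:
        raise ValueError('expected at least one class name')
    # Chaining '.name' pieces with no space matches one element that has every class.
    return ''.join('.' + name for name in parts)


print(class_names_to_css('card-img-top cover-thumb'))  # .card-img-top.cover-thumb

# Hypothetical use inside the function above, in place of the By.CLASS_NAME lookup:
# elem = browser.find_element(By.CSS_SELECTOR, class_names_to_css(CLASSNAME))
```

If the lookup still fails with the CSS selector, that would point back to the page's HTML having changed since the book was written.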
Link to the book chapter HERE.
I appreciate any advice you can give me!