I want to extract some warranty dates for my devices from Dell's website.
I tried to download the webpages using urllib,
but the site is protected by a captcha that I can't bypass for now.
Now I am using Selenium to open a browser, solve the captcha manually, and then automatically open the pages and extract the dates.
The problem is that the CSS selector returns some weird elements instead of the desired output.
My code:
from selenium import webdriver
import time

driver = webdriver.Chrome()

def scrape(codes):
    dates = []
    for i in range(len(codes)):
        driver.get("https://www.dell.com/support/home/us/en/19/product-support/"
                   "servicetag/%s/warranty?ref=captchasuccess" % codes[i])
        # Solve captcha manually
        if i == 0:
            print("You now have 120\" seconds to solve the captcha")
            time.sleep(120)
            print("120\" Passed")
        # Extract data
        expdate = driver.find_element_by_css_selector("#printdivid > div > div.not-annotated.hover > table:nth-child(3) > tbody > tr > td:nth-child(3)")
        print(expdate)
    driver.close()

codes = ['1FMR762', '15FDBG2', '10V8YZ1']
scrape(codes)
Expected output:
June 22, 2018
October 15, 2017
April 19, 2017
Given output:
<selenium.webdriver.remote.webelement.WebElement (session="d83af0f7a3a9c79307d2058f863a7ecb", element="0.21873872382745052-1")>
<selenium.webdriver.remote.webelement.WebElement (session="d83af0f7a3a9c79307d2058f863a7ecb", element="0.06836824093097027-1")>
<selenium.webdriver.remote.webelement.WebElement (session="d83af0f7a3a9c79307d2058f863a7ecb", element="0.6642161898702734-1")>
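
A note on the output above: those lines look like the default repr of Selenium `WebElement` objects, which is what `print` shows when it is handed the element itself. A real `WebElement` exposes its visible text through a `text` property. The sketch below illustrates the difference without a browser; `FakeElement` and `element_text` are hypothetical names made up for illustration and are not part of Selenium. Only the `text` attribute mirrors the real `WebElement`.

```python
class FakeElement:
    """Hypothetical stand-in for Selenium's WebElement (illustration only)."""

    def __init__(self, text):
        # Assumption: mirrors the `text` property of a real WebElement.
        self.text = text


def element_text(element):
    """Return the visible text of an element-like object, without padding."""
    return element.text.strip()


expdate = FakeElement(" June 22, 2018 ")
print(expdate)                # default object repr, similar in kind to the output above
print(element_text(expdate))  # the date string itself
```

Applied to the script above, that would presumably mean printing `expdate.text` rather than `expdate`, though this is untested against the Dell page.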