I have multiple cases of table rows from which I want to extract data:
Case 1
Onsite Service After Remote Diagnosis April 19, 2014 April 19, 2017
Case 2
CAR October 15, 2016 October 15, 2017
Onsite Service After Remote Diagnosis October 15, 2016 October 15, 2019
Case 3
NBD ProSupport July 16, 2008 July 15, 2011
Onsite Service After Remote Diagnosis July 16, 2008 July 15, 2011
The information I need is in the rows whose second td contains "Onsite Service After Remote Diagnosis". In every case it is the date at the right end of that row.
Expected output:
April 19, 2017
October 15, 2019
July 15, 2011
My code:
from selenium import webdriver
import time
from openpyxl import load_workbook

driver = webdriver.Chrome()


def scrape(codes):
    dates = []
    for i in range(len(codes)):
        driver.get("https://www.dell.com/support/home/us/en/19/product-support/"
                   "servicetag/%s/warranty?ref=captchasuccess" % codes[i])
        # Solve captcha manually (only needed on the first page load)
        if i == 0:
            print("You now have 120\" seconds to solve the captcha")
            time.sleep(120)
            print("120\" Passed")
        # Extract data
        expdate = driver.find_element_by_css_selector("#printdivid > div > div.not-annotated.hover > table:nth-child(3) > tbody > tr > td:nth-child(3)")
        print(expdate.get_attribute('innerText'))
    driver.close()


codes = ['159DT3J', '15FDBG2', '10V8YZ1']
scrape(codes)
My output:
April 19, 2014
October 15, 2016
July 16, 2008
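These values are taken from the first row that appears and from the first date td (the start date), not from the expiration date.
I've tried changing the tbody > tr > td:nth-child(3) part of the selector, but identifying the row by its text would be better and would avoid errors.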
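To make the rule concrete, here is a minimal sketch of the text-based matching I have in mind, run against a hypothetical table rather than the live page. The markup in SAMPLE is an assumption that only mimics Case 2 (an empty first td, then the service name, start date and end date), and expiration_dates is a made-up helper name. It uses only the standard library, so it shows the selection logic and not the Selenium call itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the real page structure is not shown above, so this
# table only mimics Case 2 (empty first cell, service name, start, end).
SAMPLE = """
<table>
  <tbody>
    <tr><td></td><td>CAR</td><td>October 15, 2016</td><td>October 15, 2017</td></tr>
    <tr><td></td><td>Onsite Service After Remote Diagnosis</td><td>October 15, 2016</td><td>October 15, 2019</td></tr>
  </tbody>
</table>
"""

TARGET = "Onsite Service After Remote Diagnosis"


def expiration_dates(table_html, service=TARGET):
    """Return the last cell of every row whose second cell equals `service`."""
    root = ET.fromstring(table_html)
    dates = []
    for row in root.iter("tr"):
        # Collect the visible text of each cell in this row.
        cells = ["".join(td.itertext()).strip() for td in row.findall("td")]
        # Match on the service name instead of on the row position.
        if len(cells) >= 2 and cells[1] == service:
            dates.append(cells[-1])
    return dates


print(expiration_dates(SAMPLE))
```

With Selenium, the same idea could presumably be written as an XPath such as //tr[td[normalize-space()='Onsite Service After Remote Diagnosis']]/td[last()], but I have not verified that expression against the real page.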