I am trying to export the data table from https://www.tdcj.texas.gov/death_row/dr_executed_offenders.html into Python with Selenium (eventually I want to write the data to a CSV file with Python). I am stuck on the first row, the column headers: my loop only gets to the 7th column, not the 10th, which is the last one.

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
action = ActionChains(browser)
browser.get('https://www.tdcj.texas.gov/death_row/dr_executed_offenders.html')
headers = []
# Collect the header cells; the loop below runs once per match.
rows = browser.find_elements(By.XPATH, '//th[@style="text-align: center" and(@scope="col")]')
for i in range(1, len(rows) + 1):
    selector = '#content_right > div.overflow > table > tbody > tr:nth-child(1) > th:nth-child(' + str(i) + ')'
    row = browser.find_element(By.CSS_SELECTOR, selector)
    # Scroll the cell into view, then look it up again before reading its text.
    action.move_to_element(row).perform()
    row = browser.find_element(By.CSS_SELECTOR, selector)
    content = row.text
    headers.append(content)
print(headers)

I get a list:

['Execution', 'Link', 'Link', 'Last Name', 'First Name', 'TDCJ\nNumber', 'Age']

But what about Date, Race and County? I cannot find where the issue is.

undetected Selenium
ilo975

1 Answer

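Your loop runs len(rows) times, and rows comes from an XPath that only matches th elements carrying both style="text-align: center" and scope="col". Since you get 7 items back, only 7 header cells satisfy that predicate, so the last three are never visited. To extract all the column headings from the website, drop the attribute filter and use a list comprehension with either of the following locator strategies: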

  • Using CSS_SELECTOR:

    driver.get('https://www.tdcj.texas.gov/death_row/dr_executed_offenders.html')
    print([my_elem.text for my_elem in driver.find_elements(By.CSS_SELECTOR, "table[title='Table showing list of executed inmates']>tbody>tr th")])
    driver.quit()
    
  • Using XPATH:

    driver.get('https://www.tdcj.texas.gov/death_row/dr_executed_offenders.html')
    print([my_elem.text for my_elem in driver.find_elements(By.XPATH, "//table[@title='Table showing list of executed inmates']/tbody/tr//th")])
    driver.quit()
    
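  • Note: You have to add the following imports and create the driver first (driver = webdriver.Chrome()). WebDriverWait and expected_conditions are only needed if you add an explicit wait, which the snippets above do not use:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC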
    
  • Console Output:

    ['Execution', 'Link', 'Link', 'Last Name', 'First Name', 'TDCJ\nNumber', 'Age', 'Date', 'Race', 'County']
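
Since the end goal in the question is a CSV file, here is a minimal sketch of how the header texts above could be written out with the standard csv module. The helper names clean_cell and table_to_csv are my own, not part of Selenium, and the sketch writes to an in-memory buffer with no data rows, so it only shows the shape of the approach. Note that 'TDCJ\nNumber' contains a line break, which is collapsed to a space here.

```python
import csv
import io


def clean_cell(text):
    """Collapse line breaks and repeated whitespace into single spaces."""
    return " ".join(text.split())


def table_to_csv(headers, rows, out):
    """Write cleaned headers and rows to an open text file object."""
    writer = csv.writer(out)
    writer.writerow([clean_cell(h) for h in headers])
    for row in rows:
        writer.writerow([clean_cell(c) for c in row])


# Header texts exactly as printed in the console output above.
headers = ['Execution', 'Link', 'Link', 'Last Name', 'First Name',
           'TDCJ\nNumber', 'Age', 'Date', 'Race', 'County']

# Hypothetical usage: an in-memory buffer stands in for a real file,
# and the data rows are left empty in this sketch.
buffer = io.StringIO()
table_to_csv(headers, [], buffer)
header_line = buffer.getvalue().strip()
print(header_line)
```

With a real file you would pass something like open('executed.csv', 'w', newline='') instead of the buffer (the filename is just an example), and build each data row from the .text of the td elements in that table row.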
    
undetected Selenium