I am trying to scrape a table from a website using Selenium, locating it with an XPath expression.
Previously I tried to look for the table class, but no tables were found, so I decided to look for the div element instead:
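import pandas as pd
from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait

# driver is a Selenium WebDriver that has already loaded the page
xpath = "//div[@class='table-scroller ScrollableTable__table-scroller QuoteHistoryTable__table__scroller QuoteHistoryTable__QuoteHistoryTable__table__scroller']"
WebDriverWait(driver, 10).until(
    expected_conditions.visibility_of_element_located((By.XPATH, xpath)))
source = driver.page_source
driver.quit()
soup = BeautifulSoup(source, "html5lib")
table = soup.find('div', {'class': 'table-scroller ScrollableTable__table-scroller QuoteHistoryTable__table__scroller QuoteHistoryTable__QuoteHistoryTable__table__scroller'})
# read_html returns a list of DataFrames, one per <table> found in the markup
df = pd.read_html(str(table), flavor='html5lib', header=0, thousands='.', decimal=',')
print(df[0])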
The issue I am having is that the printed DataFrame contains only the headers and a single first row full of NaNs.
Why am I not getting the values of the table? What makes this content so hard to scrape?
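To narrow this down, one thing that could be checked is whether the cell values are present in the captured HTML at all. The following is only a minimal diagnostic sketch using the standard library: `CellCounter`, `count_cells`, and the `sample` markup are hypothetical names and made-up data for illustration, and in practice the string passed in would be `driver.page_source`.

```python
from html.parser import HTMLParser


class CellCounter(HTMLParser):
    """Count <tr> rows and <td> cells, and how many cells contain text."""

    def __init__(self):
        super().__init__()
        self.rows = 0
        self.cells = 0
        self.filled_cells = 0
        self._in_cell = False
        self._cell_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows += 1
        elif tag == "td":
            self.cells += 1
            self._in_cell = True
            self._cell_text = []

    def handle_data(self, data):
        # Collect text only while inside a <td>
        if self._in_cell:
            self._cell_text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            # A cell counts as filled if it holds any non-whitespace text
            if "".join(self._cell_text).strip():
                self.filled_cells += 1
            self._in_cell = False


def count_cells(html):
    """Return (rows, cells, filled_cells) for an HTML string."""
    counter = CellCounter()
    counter.feed(html)
    counter.close()
    return counter.rows, counter.cells, counter.filled_cells


# Made-up sample standing in for driver.page_source:
# a header row plus one row of empty cells.
sample = (
    "<table><tr><th>Date</th><th>Close</th></tr>"
    "<tr><td></td><td> </td></tr></table>"
)
print(count_cells(sample))  # (2, 2, 0)
```

If the filled-cell count is zero for the real page source, the values were probably not in the DOM when `page_source` was read (for example, because they are filled in later by JavaScript), though that is an assumption that would need to be confirmed against the actual page.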
EDIT: @DebanjanB provided a nice answer, but I am unable to replicate the output. What is the reason behind this?