Unable to retrieve the table body contents using Selenium

Question

I am trying to get the contents of the tbody of the table with id myTable after entering a value in the registration number field, but I cannot retrieve them.

I also tried BeautifulSoup with headers such as a user agent and the form data from the network tab, but that failed as well.

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver import ActionChains

url="https://rof.mahaonline.gov.in/Search/Search"

driver = webdriver.Chrome(r'C:\chromedriver.exe')
driver.get(url)

driver.find_element_by_xpath("""//*[@id="registrationnumber"]""").send_keys("MU000000001")
driver.find_element_by_xpath("""//*[@id="btnSearch"]""").click()

soup = BeautifulSoup(driver.page_source, 'html.parser')

table = soup.find('table',{'id':'myTable'})
body = table.find('tbody')
print(body)

driver.close()

Please help me get through this. It would be great if this could be solved with BeautifulSoup and form data. Thanks in advance.

score 0 · Answer 1 · answered Sep 11 '19 at 12:27

To get the table body contents, wait for the rows to become visible using WebDriverWait with visibility_of_element_located(). You can use either of the following locator strategies:

Using CSS_SELECTOR:

driver.get("https://rof.mahaonline.gov.in/Search/Search")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#registrationnumber"))).send_keys("MU000000001")
driver.find_element_by_css_selector("button#btnSearch").click()
WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table#myTable tbody>tr td")))
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table#myTable tbody>tr"))).get_attribute("outerHTML"))

Using XPATH:

driver.get("https://rof.mahaonline.gov.in/Search/Search")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='registrationnumber']"))).send_keys("MU000000001")
driver.find_element_by_xpath("//button[@id='btnSearch']").click()
WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='myTable']//tbody/tr//td")))
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@id='myTable']//tbody/tr"))).get_attribute("outerHTML"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Console Output:

<tr role="row" class="odd"><td>1</td><td>MU000000001</td><td>13 June      2012</td><td>CLASSIC DEVELOPERS.</td><td>PURCHASE  AND SALE OF LANDS, PLOTS, BUILDINGS AND ALL TYPE OF CIVIL WORK WITH OR WITHOUT MATERIAL, CONTRACTORS, BUILDERS, DEVELOPERS AND REDEVELOPERS OF RESIDENTIAL PREMIES, COMMERCIAL PREMISES, SHOPPING MALLS, INDUSTRIAL SHEDS ETC.</td><td>House/Building No.BELIRAM INDUSTRIAL ESTATE.,House/Building Name:25,,<br>StreetName:S.V.ROAD,,<br>Village/Town/City:DAHISAR (EAST),<br>Taluka:Mumbai(Suburban)<br>District:Mumbai Suburban,State:Maharashtra<br>Pincode:400068<br></td></tr>
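The WebDriverWait calls above poll a condition until it holds or the timeout expires, instead of reading the page once right after the click. The following is a minimal plain-Python sketch of that idea, not Selenium's implementation. The names `wait_until` and `FakePage` are hypothetical and exist only for illustration; `FakePage` stands in for a page whose table rows show up after a few checks.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page that fills its table asynchronously."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def rows(self):
        # Return an empty list until the page has been polled often enough.
        self.calls += 1
        if self.calls >= self.ready_after:
            return ["<tr><td>1</td></tr>"]
        return []


page = FakePage(ready_after=3)
rows = wait_until(page.rows, timeout=2.0, poll_interval=0.01)
print(rows)
```

Polling returns as soon as the rows exist, so it is usually both faster and more reliable than a fixed delay.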

score 0 · Accepted Answer · answered Sep 11 '19 at 12:55

Wait a few seconds after clicking the search button so the results can load, and only then read page_source.

import time
from bs4 import BeautifulSoup
from selenium import webdriver

url="https://rof.mahaonline.gov.in/Search/Search"

driver = webdriver.Chrome(r'C:\chromedriver.exe')
driver.get(url)

driver.find_element_by_xpath("""//*[@id="registrationnumber"]""").send_keys("MU000000001")
driver.find_element_by_xpath("""//*[@id="btnSearch"]""").click()
time.sleep(3)
soup = BeautifulSoup(driver.page_source, 'html.parser')

table = soup.find('table',{'id':'myTable'})
body = table.find('tbody')
for row in body.find_all('tr'):
    tds = [td.text for td in row.find_all('td')]
    print(tds)

Output:

['1', 'MU000000001', '13 June      2012', 'CLASSIC DEVELOPERS.', 'PURCHASE  AND SALE OF LANDS, PLOTS, BUILDINGS AND ALL TYPE OF CIVIL WORK WITH OR WITHOUT MATERIAL, CONTRACTORS, BUILDERS, DEVELOPERS AND REDEVELOPERS OF RESIDENTIAL PREMIES, COMMERCIAL PREMISES, SHOPPING MALLS, INDUSTRIAL SHEDS ETC.', 'House/Building No.BELIRAM INDUSTRIAL ESTATE.,House/Building Name:25,,StreetName:S.V.ROAD,,Village/Town/City:DAHISAR (EAST),Taluka:Mumbai(Suburban)District:Mumbai Suburban,State:MaharashtraPincode:400068']
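In the output above, the address parts that were separated by `<br>` tags run together (for example `State:MaharashtraPincode:400068`), because `.text` drops the tags without inserting anything. A possible refinement, sketched here on a static HTML string trimmed from the console output in the first answer, is to pass a separator to `get_text()` and collapse repeated whitespace. The helper name `extract_rows` and the `SAMPLE_HTML` snippet are assumptions for illustration; in the real script the HTML would come from `driver.page_source`.

```python
from bs4 import BeautifulSoup

# Trimmed sample of the rendered table, for illustration only.
SAMPLE_HTML = (
    '<table id="myTable"><tbody><tr role="row" class="odd">'
    "<td>1</td><td>MU000000001</td><td>13 June      2012</td>"
    "<td>District:Mumbai Suburban,State:Maharashtra<br>Pincode:400068<br></td>"
    "</tr></tbody></table>"
)


def extract_rows(html, table_id="myTable"):
    """Return the table body as a list of rows, each a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    rows = []
    for tr in table.find("tbody").find_all("tr"):
        cells = []
        for td in tr.find_all("td"):
            # Join text fragments with a space so <br>-separated parts stay apart,
            # then collapse runs of whitespace into single spaces.
            text = td.get_text(" ", strip=True)
            cells.append(" ".join(text.split()))
        rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
print(rows)
```

The separator is a single space here; any other string, such as `", "`, could be used if the address lines should stay visibly delimited.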

Alternatively, you can use the pandas read_html() function to read the table data.

import time
from selenium import webdriver
import pandas as pd

url="https://rof.mahaonline.gov.in/Search/Search"
driver = webdriver.Chrome(r'C:\chromedriver.exe')
driver.get(url)

driver.find_element_by_xpath("""//*[@id="registrationnumber"]""").send_keys("MU000000001")
driver.find_element_by_xpath("""//*[@id="btnSearch"]""").click()
time.sleep(3)
table=pd.read_html(driver.page_source)
print(table[0])

Thank you so much. Can you please solve this using only BeautifulSoup? That would save me time. — , Sep 11 '19 at 13:29
I tried using BeautifulSoup, but could not find the submit button in the network tab form data. — , Sep 11 '19 at 13:30
Well, I’ll check later whether it is at all possible with BeautifulSoup only. — KunduK, Sep 11 '19 at 14:45
