from selenium import webdriver

driver = webdriver.Chrome(executable_path="D:\chromedriver.exe")
#url = 'https://www.dcrustedp.in/show_chart.php'
driver.get('https://www.dcrustedp.in/show_chart.php')

rows = 2
cols = 5

for r in range(5,rows+1):
    for c in range(6,cols+1):
        value = driver.find_element_by_xpath("/html/body/center/table/tbody/tr["+str(r)+"]/td["+str(c)+"]").text
        print(value)

This is my code. I want to extract the result date for B.Tech - Computer Science and Engineering, 5th Semester. It is in the first row of the table, and the date is 24-02-2020. I want to print the date from that particular cell only.

  • In your for loop over 'r', the range starts at 5 and stops at 3 (rows+1), so the loop body never runs. The 'c' loop has the same problem: it starts at 6 and stops at 6 (cols+1). You need to change these intervals to (rows+1, 6) and (cols+1, 7). – Orhan Solak Feb 26 '20 at 23:45
  • Finding by XPath is a Selenium method. However, the etree module (for example in lxml) provides similar functionality. You can refer to this link. Hope this helps. https://stackoverflow.com/questions/11465555/can-we-use-xpath-with-beautifulsoup – Prakhar Jhudele Feb 28 '20 at 06:04
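
The first comment's point about the loop bounds can be checked without a browser. This is a minimal sketch using the same `rows` and `cols` values as the question; `cells_visited` is a hypothetical helper introduced only for illustration:

```python
def cells_visited(row_start, row_stop, col_start, col_stop):
    """Return the (row, col) pairs that the two nested range() loops visit."""
    return [(r, c)
            for r in range(row_start, row_stop)
            for c in range(col_start, col_stop)]


rows, cols = 2, 5

# Bounds from the question: range(5, 3) and range(6, 6) are both empty,
# so the loop body never runs and nothing is printed.
print(cells_visited(5, rows + 1, 6, cols + 1))

# Bounds that select exactly one cell, here tr[3]/td[5].
print(cells_visited(3, 4, 5, 6))
```

Because `range(start, stop)` yields nothing when `start >= stop`, the original loops finish without ever calling `find_element_by_xpath`.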

2 Answers


The code below works:

import time

from selenium import webdriver

webpage = 'https://www.dcrustedp.in/show_chart.php'
# Selenium 3 style API, as in the question; replace the path with your own
driver = webdriver.Chrome(executable_path='Your/path/to/chromedriver.exe')
driver.get(webpage)
time.sleep(15)  # crude fixed wait for the page to finish loading

# tr[3]/td[5] is the cell with the result date
pagehits = driver.find_element_by_xpath("/html/body/center/table/tbody/tr[3]/td[5]")
print(pagehits.text)

driver.quit()

Without Selenium, we can use the requests library to fetch the page and pandas to read the table and pick out the cell:

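import requests
import pandas as pd

url = 'https://www.dcrustedp.in/show_chart.php'
# verify=False skips TLS certificate verification
html = requests.get(url, verify=False).content
df_list = pd.read_html(html)  # one DataFrame per <table> on the page
df = df_list[-1]              # use the last table on the page
print(df.iat[0, 4])           # first row, fifth column
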
Prakhar Jhudele
  • Thank you, Sir! The code worked perfectly for my requirement. Also, can you please help me find a simple piece of code that just prints the text from a pre-defined XPath? I don't want the Chrome browser to open. – Gaurav Sekhri Feb 28 '20 at 04:58
  • @GauravSekhri Finding by XPath is a Selenium method. However, the etree module (for example in lxml) provides similar functionality. You can refer to this link. Hope this helps. https://stackoverflow.com/questions/11465555/can-we-use-xpath-with-beautifulsoup – Prakhar Jhudele Feb 28 '20 at 06:10
  • Can you please make some changes in your code so that the web browser doesn't open each time I run the code? – Gaurav Sekhri Mar 04 '20 at 13:15
  • @GauravSekhri I have made an edit to the original answer. Also please click upvote if the changes work for you! – Prakhar Jhudele Mar 05 '20 at 04:42
  • Thanks a lot for helping me. Also, when I try to upvote your answer, a pop-up is displayed ("Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score.") – Gaurav Sekhri Mar 19 '20 at 08:24
  • Is it possible to pre-define the text of a cell and have the code print the text of an adjacent cell when it runs? For example, if I pre-define the text "001 : B.Tech - Computer Science and Engineering" using findall or any other method, the code should print "24-02-2020", which is in column 5. If it's possible, please don't edit the code you have already written; just add another set of lines. – Gaurav Sekhri Mar 19 '20 at 08:33
  • Hi @GauravSekhri, I would suggest opening a separate question with your trial code and the required output. – Prakhar Jhudele Mar 19 '20 at 13:51
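
The follow-up request in the comments (given the text of one cell, print another cell from the same row, with no browser window) might be sketched with the standard library alone. This is only an illustration under stated assumptions: `TableRows`, `cell_in_row` and `SAMPLE_HTML` are hypothetical names, and the sample markup, including its column headers, is invented and is not the site's real page source.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def cell_in_row(html, row_text, column):
    """Return the 1-based `column` of the first row that contains `row_text`."""
    parser = TableRows()
    parser.feed(html)
    for row in parser.rows:
        if len(row) >= column and any(row_text in cell for cell in row):
            return row[column - 1]
    return None


# Hypothetical stand-in for the page source; headers and layout are made up.
SAMPLE_HTML = """
<table>
  <tr><th>Sr.</th><th>Prg. Title</th><th>Sem</th><th>Type</th><th>Result Date</th></tr>
  <tr><td>1</td><td>001 : B.Tech - Computer Science and Engineering</td>
      <td>5</td><td>Regular</td><td>24-02-2020</td></tr>
</table>
"""

print(cell_in_row(SAMPLE_HTML, "001 : B.Tech - Computer Science and Engineering", 5))
```

To try this against the live page, `SAMPLE_HTML` could be replaced by the text fetched with `requests.get(url)`, as in the answer above; whether column 5 is the right index there depends on the real table layout.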

To extract the result date of the 5th Semester for any Prg. Title, wait with WebDriverWait until visibility_of_element_located() succeeds, using the following locator strategy:

  • xpath:

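    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    # driver is the webdriver.Chrome instance created as in the question
    driver.get('https://www.dcrustedp.in/show_chart.php')
    prg_title = "B.Tech - Computer Science and Engineering"
    # prg_title = "B.Tech - Electrical Engineering"
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//td[contains(., '"+prg_title+"')]//following-sibling::td[3]"))).get_attribute("innerHTML"))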
    
  • Console Output:

    24-02-2020
    
undetected Selenium