I want to scrape this page: https://www.narendramodi.in/category/text-speeches. Since the page loads its content dynamically, I need to scroll to the bottom and then grab the HTML to parse it. But when the site is opened through the Selenium Chrome WebDriver, no additional content loads as I scroll, whether I scroll manually or from a script. When the site is opened in a normal Chrome window, it works just fine. I also tried the Firefox driver and the result is the same. Here's the code that I have tried:
import time

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome(executable_path=r'C:/tools/drivers/chromedriver.exe')
driver.get('https://www.narendramodi.in/news')

# https://stackoverflow.com/a/27760083
SCROLL_PAUSE_TIME = 2.0

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
print(last_height)

while True:
    # Give the page a moment, then scroll down to the bottom
    time.sleep(SCROLL_PAUSE_TIME)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    print(new_height)
    if new_height == last_height:
        break
    last_height = new_height

# Grab the fully rendered HTML, then close the browser and parse it
res = driver.execute_script("return document.documentElement.outerHTML")
driver.quit()
soup = BeautifulSoup(res, 'lxml')
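
To make the symptom easier to check, the scrolling loop above could be pulled into a small helper that records every height it sees. This is only a rough sketch: the name `scroll_until_stable` and the `max_rounds` cap are my own additions, not part of Selenium, and it assumes nothing about the driver beyond an `execute_script` method.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=50):
    """Scroll to the bottom until the page height stops growing.

    Hypothetical helper, not a Selenium API. Returns the distinct
    heights observed, starting with the initial one.
    """
    heights = [driver.execute_script("return document.body.scrollHeight")]
    for _ in range(max_rounds):  # cap the loop so it cannot run forever
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazy-loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == heights[-1]:
            break  # no growth since the last scroll, so stop
        heights.append(new_height)
    return heights
```

Calling `scroll_until_stable(driver)` in place of the `while` loop and getting back a list with a single entry would mean the page never grew after scrolling.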
How can I scrape this entire page?