I'm analyzing a script written by someone else. This is a snippet of it:
from bs4 import BeautifulSoup
from selenium import webdriver


def chromedriver():
    options = webdriver.ChromeOptions()
    driver = webdriver.Chrome(options=options)
    driver.implicitly_wait(5)
    driver.maximize_window()
    return driver


def pre_scraping(driver, alpha, beta):
    # Parses the rendered DOM fetched via JavaScript rather than driver.page_source
    soup = BeautifulSoup(driver.execute_script('return document.documentElement.outerHTML'), 'html.parser')
    deck = soup.find('div', {'id': 'mainSummary'})
    cards = deck.find_all('div', {'class': 'cp-tile'})[alpha:beta]
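As I read the snippet, the two functions would be wired together roughly like this (the URL and the slice bounds are placeholders of mine, not from the original script):

    driver = chromedriver()
    driver.get('https://example.com/deck')  # placeholder URL; the real script navigates elsewhere
    # Slice bounds 0 and 10 are made up; pre_scraping as quoted does not return anything yet.
    pre_scraping(driver, 0, 10)
    driver.quit()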
Instead of getting the source of the current page with driver.page_source, the script obtains the outer HTML (root tag included) with driver.execute_script('return document.documentElement.outerHTML'). I found posts on Stack Overflow suggesting that for websites whose content changes quickly, driver.execute_script is preferable to driver.page_source. Can driver.execute_script serve as a quick way to re-read specific page content without reloading the page? What is the difference between driver.page_source and driver.execute_script in this context? (A minimal sketch of what I am imagining follows the links below.) I add two links related to the question:
How to get innerHTML of whole page in selenium driver?
How to check if a web page's content has been changed using Selenium's webdriver with Python?
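To make the question concrete, here is a minimal sketch of what I mean by "refreshing" the content: repeatedly re-reading the live DOM without navigating again. The URL, loop count, and sleep interval are placeholders I made up, not part of the original script:

    import time

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get('https://example.com')  # placeholder URL

    # Initial snapshot of the rendered DOM, taken via JavaScript.
    previous = driver.execute_script('return document.documentElement.outerHTML')

    for _ in range(10):
        time.sleep(1)
        # Both of these re-read the current DOM without reloading the page:
        via_script = driver.execute_script('return document.documentElement.outerHTML')
        via_property = driver.page_source
        if via_script != previous:
            print('content changed since the last check')
            previous = via_script

    driver.quit()

Is this roughly how driver.execute_script is meant to be used for pages whose content changes quickly, and would swapping in driver.page_source in the loop behave any differently?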