I'm scraping my own site, which uses a Google Custom Search iframe. I'm using Selenium to switch into the iframe and dump its contents, and BeautifulSoup to parse the data.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Firefox()
driver.get('http://myurl.com')
time.sleep(4)  # crude wait for the page (and hopefully the iframe) to load

# make sure we're at the top-level document, then switch into the first iframe
driver.switch_to.default_content()
iframe = driver.find_elements(By.TAG_NAME, 'iframe')[0]
driver.switch_to.frame(iframe)

output = driver.page_source
soup = BeautifulSoup(output, "html5lib")
print(soup)
I am successfully getting into the iframe and getting some of the data. At the very top of the output there's a message about JavaScript needing to be enabled and the page needing to be reloaded. The part of the page I'm actually looking for (which I can see in the source via the developer tools) isn't there, so clearly some of it isn't loading.
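To show what I mean, this is roughly how I'm checking whether the element I want made it into the soup. The HTML string below is a made-up stand-in for the partial page I get back, and the `gsc-results` class name is just a guess at what the search widget renders:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the partial HTML the iframe returns;
# the real page would have the search widget's markup here.
partial_html = """
<html><body>
  <noscript>Please enable JavaScript and reload the page.</noscript>
</body></html>
"""

soup = BeautifulSoup(partial_html, "html5lib")

# Only the noscript fallback is present; the results container never shows up.
print(soup.find('noscript').get_text())        # the fallback message
print(soup.find('div', class_='gsc-results'))  # None -- results are missing
```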
So, my question: how do you get Selenium to run all of the JavaScript on the page? Or is that done automatically?
I see a lot of posts on SO about executing an individual JavaScript function, etc., but nothing about running all of the JS on a page.
Any help is appreciated.