I am a rookie programmer teaching myself web scraping. I am trying to write a Python program that uses Selenium to scrape a webpage and return the direct video download URL from an embedded player.
Here is the relevant HTML from the page:
<video class="vjs_tech" id="olvideo_html5_api" crossorigin="anonymous"></video>
<button class="vjs-big-play-button" type="button" aria-live="polite" title="Play Video" aria-disabled="false"><span class="vjs-control-text">Play Video</span></button>
The video element initially has no src attribute. When I click the button above in my browser, the page appears to run some JavaScript and the video element gains a src attribute. I want to print the value of this src attribute to the console. This is how I replicated the process in Python:
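from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# 'driver' is an existing Chrome WebDriver with the page already loaded
# Clicking the button (the first <button> element on the page)
playbutton = driver.find_element_by_tag_name('button')
playbutton.send_keys(Keys.RETURN)
# Selecting the video element once it is visible (waits up to 5 seconds)
wait = WebDriverWait(driver, 5)
video = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'video')))
# Printing the details of the video element
print "Class: ", video.get_attribute("class")
print "ID: ", video.get_attribute("id")
print "SRC: ", video.get_attribute("src")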
The output looks like this:
Class: vjs_tech
ID: olvideo_html5_api
SRC:
As you can see, the 'class' and 'id' values come back correctly, but the 'src' attribute is always empty. If I open the site in Chrome and click the button manually, however, I can see the src attribute being populated as expected.
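One possibility worth ruling out, purely as an assumption and not a confirmed cause, is timing: the attribute may be read before the page's scripts have set it. A minimal sketch of a polling check follows. The helper name wait_for_attribute and its timeout and poll_interval parameters are hypothetical; the only Selenium call it relies on is the element's get_attribute method.

```python
import time


def wait_for_attribute(element, name, timeout=10.0, poll_interval=0.25):
    """Poll element.get_attribute(name) until it returns a non-empty value.

    Returns the attribute value, or None if it is still empty
    (or missing) once `timeout` seconds have passed.
    """
    deadline = time.time() + timeout
    while True:
        value = element.get_attribute(name)
        if value:
            # The attribute now has a non-empty value.
            return value
        if time.time() >= deadline:
            # Give up; the caller decides how to handle a missing value.
            return None
        time.sleep(poll_interval)
```

With the video element from the code above, this could be called as wait_for_attribute(video, "src"). If it still returns None after a generous timeout, timing is probably not the explanation, and it would be worth checking whether the click actually reached the play button.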
What am I doing wrong? How can I get the src attribute to show up in my output?
(I'm using Selenium with ChromeDriver on Python 2.7.)