I'm trying to get the URL of a video, but it never shows up in my output. I have tried requests, urllib, and even Selenium, but part of the page's HTML is missing from the result, as if it were blocked.

The URL is https://unitplay.net/tt0089222, and here is my code:

from selenium import webdriver

browser = webdriver.Chrome('path/chromedriver.exe')
type(browser)
browser.get('https://unitplay.net/tt0089222')
elem = browser.page_source
print(elem)
browser.quit()

Here is the part that is missing from the output. I want to get the src from it:

<div class="jw-media jw-reset"><video class="jw-video jw-reset" x-webkit-airplay="allow" webkit-playsinline="" playsinline="" preload="auto" jw-loaded="data" src="https://unitplay.net//file/others/DA6BB292BA130B6A825B62B96BD929F811EBF7BFEC748F8E2609004F5D96D0F5DD7025F4450289E31279E9F621883D048C869F15520DBE571D8FA35EBCCACD75" __idm_id__="64900097" jw-played=""></video></div>
Micheal O'Dwyer
Daniel

1 Answer

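The video element is most likely added by JavaScript after the initial page load, so reading page_source right away misses it. You can use Selenium to wait for the element to appear.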

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Selenium 3 style; Selenium 4 expects the driver path wrapped in a Service object.
browser = webdriver.Chrome('path/chromedriver.exe')

try:
    browser.get('https://unitplay.net/tt0089222')

    # Poll the DOM for up to 10 seconds until a <video> element exists.
    element = WebDriverWait(browser, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "video"))
    )

    # The src attribute holds the video URL.
    print(element.get_attribute("src"))
finally:
    # Close the browser even if the wait times out.
    browser.quit()

This should tell Selenium to wait up to 10 seconds for a video element to appear and then print out its src.
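
To make the waiting behaviour concrete, here is a minimal sketch of the poll-until-timeout pattern that an explicit wait follows. It is not Selenium's implementation; `wait_until`, `fake_find_video`, and the timings are hypothetical names and values chosen for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page whose <video> tag is added by JavaScript
# a few polls after the initial load.
attempts = {"count": 0}


def fake_find_video():
    attempts["count"] += 1
    if attempts["count"] < 3:
        return None  # element not in the DOM yet
    return "video element"


found = wait_until(fake_find_video, timeout=2.0, poll_interval=0.05)
print(found, "after", attempts["count"], "polls")
```

Reading page_source immediately after get() corresponds to calling the condition once and giving up, which would explain why the video tag was missing in the question's output.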

Alex
  • NameError: name 'driver' is not defined – Daniel Jul 18 '19 at 18:21
  • Thanks, I figured out that driver should be browser, and it works. I had tried similar code but did not realize the browser has to wait. Thank you. – Daniel Jul 18 '19 at 18:29
  • Another question: can you do the same thing using requests? – Daniel Jul 18 '19 at 19:36
  • @Daniel sorry about not catching that. I was looking at code where driver was browser. You might be able to do something with [`requests-html`](https://pypi.org/project/requests-html/) however I'm not too familiar with that. [This answer](https://stackoverflow.com/a/54056631/5348961) seems like it might help. – Alex Jul 18 '19 at 21:43
  • I tried requests-html, but it is not working; it does not let the page load. Here is my code: `from requests_html import HTMLSession; session = HTMLSession(); r = session.get('https://unitplay.net/tt0119643'); r.html.render(); pageSource = r.text; print(pageSource)` – Daniel Jul 19 '19 at 21:25
  • @Daniel I unfortunately cannot get it to work either. Looks like you just have to use `Selenium` for now. – Alex Jul 19 '19 at 23:18
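
On the requests question: requests returns only the HTML the server sends and does not run the page's JavaScript, which is likely why the video tag is absent there. If the rendered markup is already in hand (for example, browser.page_source read after the wait has succeeded), the src can be pulled out with the standard library alone. This is a minimal sketch, assuming a shortened version of the markup from the question; `VideoSrcParser`, `rendered_html`, and the placeholder file path are illustrative and not part of any library or of the real page.

```python
from html.parser import HTMLParser


class VideoSrcParser(HTMLParser):
    """Collect the src attribute of every <video> tag in an HTML string."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "video":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


# Shortened, hypothetical stand-in for the rendered markup in the question.
rendered_html = (
    '<div class="jw-media jw-reset">'
    '<video class="jw-video jw-reset" preload="auto" '
    'src="https://unitplay.net//file/others/EXAMPLE"></video>'
    '</div>'
)

parser = VideoSrcParser()
parser.feed(rendered_html)
print(parser.sources)
```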