I am new to bs4 (BeautifulSoup 4)!
I have gone through many tutorials, but none of them fit my case. I want to scrape the mp4 URL from a site, but the embedded player markup looks different from the tutorial examples: the URL sits inside a script block instead of a tag attribute. I have tried find and find_all but can't get them to work. Can anyone help? This is the relevant HTML:
<div class="rmp-playlist-container">
<div class="rmp-playlist-player-wrapper">
<div id="rmpPlayer"></div>
</div>
</div>
<p><script>var playlistData = [{src: {mp4:["https://wantedurl.mp4"]},"contentMetadata": {"title": "video1", "thumbnail":"https://somethumbnail.jpg","poster": [ "https://someposter.jpg"]}
Current code:
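import re
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36',
    'From': 'tikkanenfelix@gmail.com',  # This is another valid field
}
base_url = "url"  # placeholder for the real page URL
r = requests.get(base_url, headers=headers)
patt = re.compile(r'mp4:\s*\["(.+?)"\]')
# BeautifulSoup needs the response body (r.text), not the Response object itself
soup = BeautifulSoup(r.text, 'html.parser')
print(soup)
for e in soup.find_all('script'):
    # e.string is None for empty or external scripts, so fall back to ''
    m = patt.search(e.string or '')
    if m:
        print(m.group(1))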
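
To check the regex separately from the network request, here is a minimal offline sketch that runs the same pattern against the HTML from above. It rests on a few assumptions: SAMPLE_HTML is a shortened copy of the snippet with the script block closed off by hand (the original is cut off), and extract_mp4_urls is just a hypothetical helper name, not part of bs4. The idea is that find_all can only get as far as the script tag; the URL itself is JavaScript text, so it has to be pulled out with a regex.

```python
import re

from bs4 import BeautifulSoup

# Shortened copy of the markup from the post. The closing "}];</script></p>"
# is an assumption added here so that the sample is complete.
SAMPLE_HTML = """
<div class="rmp-playlist-container">
<div class="rmp-playlist-player-wrapper">
<div id="rmpPlayer"></div>
</div>
</div>
<p><script>var playlistData = [{src: {mp4:["https://wantedurl.mp4"]},"contentMetadata": {"title": "video1"}}];</script></p>
"""

# Same pattern as in the post: capture the first quoted string after `mp4:[`.
MP4_PATTERN = re.compile(r'mp4:\s*\["(.+?)"\]')


def extract_mp4_urls(html):
    """Hypothetical helper: collect mp4 URLs from inline <script> blocks."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for script in soup.find_all("script"):
        # .string is None for empty or external scripts, so fall back to "".
        text = script.string or ""
        urls.extend(MP4_PATTERN.findall(text))
    return urls


if __name__ == "__main__":
    print(extract_mp4_urls(SAMPLE_HTML))
```

If this prints the placeholder URL but the live page gives nothing, the problem is probably in the request (for example the page being built by JavaScript after loading) and not in the parsing.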