I have been having a consistent issue during webscraping of receiving an empty string instead of the expected results (based on inspect page html).
My specific goal is to get the link for the top 10 clips from https://www.twitch.tv/directory/game/Overwatch/clips?range=7d.
Here is my code:
# Gathers links of clips to download later
import bs4
import requests
from selenium import webdriver
from pprint import pprint
import time
from selenium.webdriver.common.keys import Keys
# Get links of multiple clips by webscraping main_url
main_url = 'https://www.twitch.tv/directory/game/Overwatch/clips?range=7d'
driver = webdriver.Firefox()
driver.get(main_url)
time.sleep(10)
elements_found = driver.find_elements_by_class_name("tw-interactive tw-link tw-link--hover-underline-none tw-link--inherit")
print(elements_found)
driver.quit()
This is how I decided on the class name
The page uses Javascript and that is the reason why I am using Selenium over the Requests module (which I tried, to no success).
I added the time.sleep(10) so that I have time to scroll through the webpage to activate the java script, to no avail.
I've also tried changing user-agent and using XPaths, neither of which have produced different results.
No matter what I do, it seems that the program only looks at the raw html that is found by right-click -> inspect page source.
Any help and pointers would be greatly appreciated, I feel thoroughly stuck on this problem. I have been having these issues in all projects of "Chapter 11: Webscraping" from Automate the Boring Stuff, and my personal projects.