I'm new to scraping dynamically loaded websites and I'm stuck trying to scrape the team names and odds from this website:
https://www.cashpoint.com/de/fussball/deutschland/bundesliga
I tried it with PyQt5, as described in this post:
PyQt4 to PyQt5 -> mainFrame() deprecated, need fix to load web pages
import sys

import bs4 as bs
from PyQt5.QtCore import QUrl
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from PyQt5.QtWidgets import QApplication


class Page(QWebEnginePage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebEnginePage.__init__(self)
        self.html = ''
        self.loadFinished.connect(self._on_load_finished)
        self.load(QUrl(url))
        self.app.exec_()  # blocks until Callable() quits the event loop

    def _on_load_finished(self):
        # toHtml() is asynchronous and returns None; the HTML arrives in Callable()
        self.toHtml(self.Callable)
        print('Load finished')

    def Callable(self, html_str):
        self.html = html_str
        self.app.quit()


def main():
    page = Page('https://www.cashpoint.com/de/fussball/deutschland/bundesliga')
    soup = bs.BeautifulSoup(page.html, 'html.parser')
    js_test = soup.find('div', class_='game__team game__team__football')
    print(js_test.text)  # fails here: find() returned None


if __name__ == '__main__':
    main()
But it did not work for the website I want to scrape. I'm getting this error:
AttributeError: 'NoneType' object has no attribute 'text'
So this method does not give me the content of the site, although the post above describes it as a method for dynamically loaded websites.

As I have read, the first step when dealing with dynamically loaded websites is to identify how the data is rendered on the page. How do I do that, and why isn't PyQt5 working for this website? Selenium isn't an option for me, since it would be too slow for getting live odds. Can I get the HTML content of the site as it is shown when I inspect the page, so that I can then process it the normal way with BeautifulSoup or Scrapy?

Thank you in advance.
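
One way to start answering the "how is the data rendered" question is to check whether the element seen in the inspector is already present in the HTML a scraper receives. The sketch below is only an illustration using the standard library: `ClassFinder`, `has_class`, `raw_html` and `rendered_html` are hypothetical names and sample strings, not the real markup of the site. It checks whether any tag carries all of the given CSS classes.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any tag carries all of the wanted CSS classes."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted.split())
        self.found = False

    def handle_starttag(self, tag, attrs):
        # A valueless class attribute is reported as None, hence the `or ""`.
        classes = set((dict(attrs).get("class") or "").split())
        if self.wanted <= classes:
            self.found = True


def has_class(html, wanted):
    """Return True if some tag in `html` has every class named in `wanted`."""
    finder = ClassFinder(wanted)
    finder.feed(html)
    return finder.found


# Hypothetical stand-ins: the HTML as delivered, and the DOM after scripts ran.
raw_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'
rendered_html = '<div id="app"><div class="game__team game__team__football">Team A</div></div>'

print(has_class(raw_html, "game__team game__team__football"))       # False
print(has_class(rendered_html, "game__team game__team__football"))  # True
```

Feeding `page.html` from the code above into `has_class` would show whether the element is in the captured HTML at all. If it returns False while the inspector shows the element, the content was probably added by JavaScript after the HTML was captured, which would also explain why `soup.find(...)` returns `None`.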