Browser code and beautifulsoup collection different

Asked Jul 05 '21 at 15:10

Active Jul 05 '21 at 15:10

Viewed 34 times

I am trying to parse the soccer matches on the soccerstand front page, but it fails because the elements I get with BeautifulSoup are very different from what I see in the browser. My code is simple at the moment:

import urllib.request
from bs4 import BeautifulSoup

with urllib.request.urlopen('https://soccerstand.com/') as response:
    url_data = response.read()

soup = BeautifulSoup(url_data, 'html.parser')
# find_all() matches tag names, not CSS selectors; select() takes a CSS selector
print(soup.select('div.event__match'))

This did not work. When I inspected the soup variable, it turned out not to contain any such divs at all, so what I get with BeautifulSoup differs from what I see when inspecting the page in the browser.

What's the reason for that? Is there any workaround?

asked Jul 05 '21 at 15:10

vdmclcv


Most likely some elements are loaded dynamically via JS requests. You would need to send those requests manually with requests or, better, do your scraping with Selenium. – Max Shouman Jul 05 '21 at 15:36
Selenium is something I still have to learn. Thanks for your suggestion, makes much sense. – vdmclcv Jul 05 '21 at 16:33
Does this answer your question? [Web-scraping JavaScript page with Python](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python) – ggorlen Jul 05 '21 at 22:33
I believe it does, for now. – vdmclcv Jul 06 '21 at 23:08
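
To illustrate the point made in the comments, here is a minimal sketch of how one might check whether a class is present in the HTML as served, before any JavaScript runs. It uses only the standard library. The two HTML strings are made-up stand-ins (the real soccerstand markup may look quite different), and `ClassCounter` and `count_class` are hypothetical helper names, not part of BeautifulSoup or any other library.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in the html string carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical markup: what a server might send before any script runs.
served_html = """
<html><body>
  <div id="live-table"></div>
  <script src="/app.js"></script>
</body></html>
"""

# Hypothetical markup: the same page after a script has filled in the table.
rendered_html = """
<html><body>
  <div id="live-table">
    <div class="event__match event__match--live">Team A - Team B</div>
    <div class="event__match">Team C - Team D</div>
  </div>
</body></html>
"""

served_count = count_class(served_html, "event__match")
rendered_count = count_class(rendered_html, "event__match")
print(served_count, rendered_count)
```

Running the same check on the decoded `url_data` from the question would show whether `event__match` appears in the raw response at all. If the count is zero there, no static parser can find those divs, and the options are the ones suggested above: replicate the requests the page makes, or render the page with a browser-driven tool such as Selenium and parse the result.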
