Currently I'm attempting to scrape Metacritic for new video game releases, gathering the titles of games and their respective scores. The problem I am facing involves each score on the website being assigned multiple classes in the HTML. Each score has been assigned 4 different classes, and I only wish to specify 3.
Example: <div class="metascore_w large game positive">80</div>
Elements containing metascore_w, large, and game are what I wish to collect. In particular, game is essential because without this class, it returns unhelpful miscellaneous scores such as movies, tv shows, and music.
The class positive cannot be used because it only matches positive reviews, and I also want to collect mixed and negative reviews (which use the class names mixed and negative, respectively). For simplicity's sake I would prefer not to specify positive, mixed, and negative separately, but if it must be done I will gladly do so.
The specific issue I am facing is a head-scratcher. If I specify the starting class, it outputs just fine:
scores = soup.find_all('div', {'class': 'metascore_w'})
print(scores)
[<div class="metascore_w medium game positive">90</div>, <div class="metascore_w medium movie positive">80</div>] (etc)
If I specify all 4 classes, it outputs just fine as well:
scores = soup.find_all('div', {'class': 'metascore_w large game positive'})
print(scores)
[<div class="metascore_w large game positive">80</div>, <div class="metascore_w large game positive">84</div>] (etc)
But when I specify 3 classes, I receive no output:
scores = soup.find_all('div', {'class': 'metascore_w large game'})
print(scores)
[]
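For reference, here is a minimal, self-contained sketch that reproduces the behaviour on made-up markup. The HTML snippet is hypothetical, modelled on the example div above rather than copied from the real page. It also tries a CSS selector through soup.select, which matches on each listed class independently and so may be one way to get what I am after:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the example div above (not the real page).
html = """
<div class="metascore_w large game positive">80</div>
<div class="metascore_w large game mixed">65</div>
<div class="metascore_w large game negative">40</div>
<div class="metascore_w large movie positive">90</div>
<div class="metascore_w medium game positive">75</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A multi-word class string is compared against the full class attribute,
# so three of the four classes do not match any element here.
exact = soup.find_all("div", {"class": "metascore_w large game"})
print(exact)

# A CSS selector requires each listed class to be present and ignores
# any additional classes such as positive, mixed, or negative.
scores = soup.select("div.metascore_w.large.game")
values = [int(tag.get_text()) for tag in scores]
print(values)
```

On this sample the first query returns an empty list, while the selector picks up the three large game scores regardless of their positive, mixed, or negative class.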
If anyone has any idea how I could solve this problem, I would greatly appreciate it! Thank you for reading!