I'm trying to grab the headline and the article summary from this website. So far I know how to get headlines that sit inside article > h2 > a tags, but I'm not sure how to get the headline when there are multiple div tags inside the article tag. I've left the article's link below so you can hopefully see what I mean. Usually I'd write headline = article.h2.a.text, but this article tag has 2 div tags, and it's very frustrating not knowing how to tackle this. My thought process was to start with the article tag, then pick the div tag I wanted to access, followed by the h1 tag that holds the headline text, but that didn't work. I'd imagine this is the right way of viewing the problem and I'm just not going about it properly. I know I'm missing something, but I don't know what. Any help or resources would be greatly appreciated.
ARTICLE: https://www.huffpost.com/entry/angry-squirrel-attacks-queens_n_5fee30b1c5b6ec8ae0b242d2
Here's my code:
from bs4 import BeautifulSoup
import requests
import csv
source = requests.get('https://www.huffpost.com/entry/angry-squirrel-attacks-queens_n_5fee30b1c5b6ec8ae0b242d2').text
soup = BeautifulSoup(source, 'lxml')
article = soup.find('article')
headline = article.find('div', class_='headline js-headline').h1.text
print(headline)
Error:
Traceback (most recent call last):
  File "C:\Users\Denze\MyPythonScripts\Webscraping learning\Webscrape article.py", line 12, in <module>
    headline = article.find('div', class_='headline__title cc_cursor').h1.text
AttributeError: 'NoneType' object has no attribute 'find'
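To show what I'm after without depending on the live page, here is a minimal sketch that runs against a made-up HTML snippet. The markup, the class names, and the helper name extract_headline are assumptions for illustration, not HuffPost's real structure. It steps through the nested divs with a CSS selector and checks for None, because Beautiful Soup's find and select_one return None when nothing matches. As far as I can tell, that is what the AttributeError above is pointing at: article itself was None.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real page's structure may differ.
SAMPLE_HTML = """
<article>
  <div class="meta">Some byline</div>
  <div class="headline js-headline">
    <h1>Angry Squirrel Attacks</h1>
  </div>
</article>
"""


def extract_headline(html):
    """Return the headline text, or None if the expected tags are missing."""
    # html.parser ships with Python, so this sketch does not need lxml.
    soup = BeautifulSoup(html, "html.parser")
    # Walk article -> div with class "headline" -> h1 in one selector.
    # select_one returns None when no element matches.
    h1 = soup.select_one("article div.headline h1")
    if h1 is None:
        return None
    return h1.get_text(strip=True)


print(extract_headline(SAMPLE_HTML))
print(extract_headline("<p>no article here</p>"))
```

With the None check in place, a missing article or div gives None back instead of raising, which at least makes it easier to see which lookup failed.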