I'm trying to scrape a website, but my BeautifulSoup code fails with: 'NoneType' object has no attribute 'get_text'. However, the element does exist on the page.

Here is my code:

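import requests
from bs4 import BeautifulSoup

# A browser-like User-Agent is sent because the site answers 403 without it
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers)
#time.sleep(60)
soup = BeautifulSoup(response.content, 'html.parser')
NumArticle = url.split('/')[-2]
# find() returns None when nothing matches, so .get_text() raises the AttributeError here
titreArticle = soup.find("h1", {"class":"wi-article-title article-title-main"}).get_text()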

I set headers = {'User-Agent': 'Mozilla/5.0'} and passed it to requests.get(url, headers=headers) because otherwise I get a 403 error. I also tried adding time.sleep(x) because I saw on a forum that it could solve the problem, but in my case it didn't work.

Do you have any idea how I could solve this problem?

1 Answer

I think the issue lies in how you are searching for the two classes. Try referring to this question. Passing the class names as a list matches an element that carries any of them:

soup.find('h1', class_=['wi-article-title', 'article-title-main'])
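To illustrate the difference, here is a minimal sketch on a made-up HTML snippet (the markup is an assumption, the real page may differ). A string containing a space is compared against the whole class attribute, so it only matches when the classes appear in exactly that order. A list matches any one of the classes, and a CSS selector requires all of them in any order.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only: same two classes, reversed order.
html = '<h1 class="article-title-main wi-article-title">Sample title</h1>'
soup = BeautifulSoup(html, "html.parser")

# A space-separated string must equal the full class attribute, so this misses.
exact = soup.find("h1", {"class": "wi-article-title article-title-main"})

# A list matches elements that have at least one of the listed classes.
any_of = soup.find("h1", class_=["wi-article-title", "article-title-main"])

# A CSS selector requires both classes, regardless of their order.
both = soup.select_one("h1.wi-article-title.article-title-main")

print(exact)              # the exact-string lookup finds nothing
print(any_of.get_text())  # text of the h1 found via the list
print(both.get_text())    # text of the h1 found via the selector
```

If you need both classes to be present, the select_one() form is probably the safer choice, since the list form also accepts an h1 that has only one of them.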
daudprobst
  • I also have the problem when I try to scrape a single class, for example `disclosures = soup.find("section", {"class": "sec"}).get_text()`. But your solution for multiple classes seems to work. – user60005003 May 11 '22 at 14:22
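If a single-class lookup also returns None, one possibility (an assumption, not confirmed in this thread) is that the HTML returned to requests does not contain the element at all, for instance because the page adds it with JavaScript. In that case time.sleep() would not help, since the response has already been downloaded. A small sketch of a guard that avoids the AttributeError and makes the missing element visible; text_or_none is a hypothetical helper name and the HTML is a stand-in for response.text:

```python
from bs4 import BeautifulSoup


def text_or_none(soup, tag, classes):
    """Return the stripped text of the first matching tag, or None if absent."""
    element = soup.find(tag, class_=classes)
    return element.get_text(strip=True) if element is not None else None


# Hypothetical stand-in for the downloaded page; it has no <section class="sec">.
html = '<section class="other"><p>Nothing here</p></section>'
soup = BeautifulSoup(html, "html.parser")

disclosures = text_or_none(soup, "section", "sec")
if disclosures is None:
    # Inspect the downloaded HTML itself rather than the page shown in the browser.
    print("No <section class='sec'> in the fetched HTML")
```

Printing response.text, or saving it to a file and searching for the class name, should show whether the element is really in what the server sent.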