Python: Can't extract element from website with bs4

Question

I'm trying to extract an element from this site. More specifically, I am trying to extract the temperature.

This is the element I am attempting to extract using BeautifulSoup4:

<p class="temperature">-1<span>°C</span></p>

The following is my Python code that is supposed to extract the element from the mentioned site:

import requests
from bs4 import BeautifulSoup

url = requests.get('https://www.theweathernetwork.com/ca/weather/ontario/mississauga')

soup = BeautifulSoup(url.content, 'lxml')
 
print(soup.find_all('p', {'class':'temperature'}))

But it just prints an empty list:

[]

I would be really appreciative if anyone could help me with this.

Note: I am new to python

The detail you want is loaded via javascript so python-requests is not enough. It's coming out as empty because it **is** empty. What you're doing is web scraping. http://stackoverflow.com/questions/26393231/using-python-requests-with-javascript-pages — munsu, Mar 17 '17 at 01:56
I see. So what library do you recommend I use to extract the data? — Curious Spider, Mar 17 '17 at 02:04

score 0 · Accepted Answer · answered Mar 17 '17 at 02:13


Okay, so as @RobinAnupol mentioned, you have several options depending on how closely you want to mimic a real browser.

Open the website manually in a browser, observe the API calls the site makes with its JavaScript code, and replicate them using requests in Python.
Use a JavaScript rendering engine like Splash.
Use Selenium with a real browser (there are drivers for Chrome, IE, Firefox, PhantomJS, etc.).
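
As a rough sketch of the Selenium option, assuming Selenium 4 and a Chrome install that Selenium can drive: the function names `fetch_temperature_text` and `parse_celsius`, the 15-second timeout, and the headless flag are illustrative choices, and the `p.temperature` selector is taken from the question, so it may no longer match the live page.

```python
import re


def fetch_temperature_text(url, timeout=15):
    """Render the page in headless Chrome and return the temperature text."""
    # Imported here so parse_celsius() below can be used without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument('--headless')  # assumption: no visible window needed
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until JavaScript has added <p class="temperature"> to the page.
        element = WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, 'p.temperature'))
        )
        return element.text
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


def parse_celsius(text):
    """Pull the integer reading out of text such as '-1°C'."""
    match = re.search(r'-?\d+', text)
    if match is None:
        raise ValueError(f'no temperature found in {text!r}')
    return int(match.group())
```

Passing the URL from the question to `fetch_temperature_text` and the result to `parse_celsius` should then give the reading as an integer, provided the page still uses that markup. The explicit wait matters because the element only exists after the page's scripts have run, which is also why plain requests sees nothing.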


Giannis Spiliopoulos


I just tested it out with Selenium, and it works just as planned. It is slower compared to requests; however, that could be because the text I am trying to extract is in JavaScript and not in the HTML. – Curious Spider Mar 17 '17 at 02:35
That's great. If you want, accept this answer so that the question doesn't appear as unanswered. – Giannis Spiliopoulos Mar 17 '17 at 02:37
