Weather data is not rendered with JavaScript as Kostas Charitidis mentioned.
You don't need to specify the <span> element, and moreover you don't need to use find_all()/findAll()/select(), since you're looking for a single element that doesn't repeat anywhere else. Use select_one() instead:
soup.select_one('#wob_tm').text
# prints temperature
You can also use try/except if you want to return None:
try:
    temperature = soup.select_one('#wob_tm').text
except AttributeError:  # select_one() returned None, so there is no .text
    temperature = None
An if statement always costs you something, whereas a try/except block is nearly free to set up. But when an exception actually occurs, the cost is much higher.
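To make that trade-off concrete, here is a rough sketch that times both styles with the standard-library timeit module. The FakeTag class and the lookup_with_if()/lookup_with_try() helpers are made-up stand-ins for what select_one() returns (they are not part of BeautifulSoup), and the numbers will differ by machine and Python version, so no timings are quoted here.

```python
import timeit


class FakeTag:
    """Hypothetical stand-in for a bs4 Tag: it only has a .text attribute."""
    text = "60"


def lookup_with_if(tag):
    # "Look before you leap": the check is paid for on every call.
    if tag is not None:
        return tag.text
    return None


def lookup_with_try(tag):
    # "Ask forgiveness": cheap while the tag exists, costly once it is None.
    try:
        return tag.text
    except AttributeError:
        return None


found, missing = FakeTag(), None  # missing mimics select_one() finding nothing

for label, tag in (("element found", found), ("element missing", missing)):
    t_if = timeit.timeit(lambda: lookup_with_if(tag), number=200_000)
    t_try = timeit.timeit(lambda: lookup_with_try(tag), number=200_000)
    print(f"{label}: if={t_if:.4f}s  try/except={t_try:.4f}s")
```

If the element is almost always present, try/except is a reasonable default; if it is often missing, the explicit check is likely the better fit.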
The next thing that might cause that error is a missing user-agent. Without one, requests identifies itself as python-requests, so Google will eventually block your request and you'll receive completely different HTML. I have already answered what a user-agent is in another answer.
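One way to see what a script would send, without making any network call, is to inspect the request locally. This is only a sketch and assumes the requests library is installed; the shortened User-Agent string is a placeholder, not a recommended value.

```python
import requests

# What requests sends when no User-Agent is given: "python-requests/<version>".
default_ua = requests.utils.default_user_agent()
print(default_ua)

# Build (but do not send) the request to inspect the final URL and headers.
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}  # placeholder value
params = {"q": "london weather", "hl": "en", "gl": "us"}
prepared = requests.Request(
    "GET", "https://www.google.com/search", headers=headers, params=params
).prepare()

print(prepared.url)                    # query string built from params
print(prepared.headers["User-Agent"])  # the custom value, not python-requests
```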
Code and full example in the online IDE:
from bs4 import BeautifulSoup
import requests, lxml  # lxml is imported only to make sure the parser is installed

headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

params = {
    "q": "london weather",  # search query
    "hl": "en",             # interface language
    "gl": "us"              # country to search from
}

response = requests.get('https://www.google.com/search', headers=headers, params=params)
soup = BeautifulSoup(response.text, 'lxml')

# Each value sits in an element with a unique id, so select_one() is enough
weather_condition = soup.select_one('#wob_dc').text
temperature = soup.select_one('#wob_tm').text
precipitation = soup.select_one('#wob_pp').text
humidity = soup.select_one('#wob_hm').text
wind = soup.select_one('#wob_ws').text
current_time = soup.select_one('#wob_dts').text

print(f'Weather condition: {weather_condition}\n'
      f'Temperature: {temperature}°F\n'
      f'Precipitation: {precipitation}\n'
      f'Humidity: {humidity}\n'
      f'Wind speed: {wind}\n'
      f'Current time: {current_time}\n')
----
'''
Weather condition: Mostly cloudy
Temperature: 60°F
Precipitation: 3%
Humidity: 77%
Wind speed: 3 mph
Current time: Friday 7:00 AM
'''
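Since every select_one(...).text call raises AttributeError when its element is missing, it can help to wrap the lookup in one small helper. The following is a minimal sketch, not part of BeautifulSoup: text_or_none() and SAMPLE_HTML are hypothetical names, and the markup is a trimmed-down stand-in so the snippet runs offline with the built-in html.parser.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup; the real Google page is far larger.
SAMPLE_HTML = """
<div>
  <span id="wob_dc">Mostly cloudy</span>
  <span id="wob_tm">60</span>
  <span id="wob_hm">77%</span>
</div>
"""


def text_or_none(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    element = soup.select_one(selector)
    return element.get_text(strip=True) if element is not None else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")  # built-in parser, no lxml needed

selectors = {
    "weather_condition": "#wob_dc",
    "temperature": "#wob_tm",
    "humidity": "#wob_hm",
    "wind": "#wob_ws",  # deliberately absent from SAMPLE_HTML
}
weather = {name: text_or_none(soup, css) for name, css in selectors.items()}
print(weather)
```

With the sample markup, the missing #wob_ws element comes back as None instead of stopping the script.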
Alternatively, you can use the Google Direct Answer Box API from SerpApi. It's a paid API with a free plan.
The difference in your case is that you don't have to figure out how to extract the elements or maintain a parser over time, since that's already done for you. All you need to do is iterate over the structured JSON and pick out what you're looking for.
Code to integrate:
from serpapi import GoogleSearch
import os
params = {
"engine": "google",
"q": "london weather",
"api_key": os.getenv("API_KEY"),
"hl": "en",
"gl": "us",
}
search = GoogleSearch(params)
results = search.get_dict()
loc = results['answer_box']['location']
weather_date = results['answer_box']['date']
weather = results['answer_box']['weather']
temp = results['answer_box']['temperature']
precipitation = results['answer_box']['precipitation']
humidity = results['answer_box']['humidity']
wind = results['answer_box']['wind']
print(f'{loc}\n{weather_date}\n{weather}\n{temp}°F\n{precipitation}\n{humidity}\n{wind}\n')
-------
'''
District 3
Friday
Mostly sunny
80°F
0%
52%
5 mph
'''
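If the answer box is not returned for a query, indexing results['answer_box'] raises KeyError. Below is a small sketch of a more forgiving lookup using dict.get(). The results dictionary is a hand-written stand-in containing only the fields used above (values copied from the sample output), not a real API response, and pick_weather() is a hypothetical helper name.

```python
# Hand-written stand-in for search.get_dict(); a real response has many more keys.
results = {
    "answer_box": {
        "location": "District 3",
        "date": "Friday",
        "weather": "Mostly sunny",
        "temperature": "80",
        "precipitation": "0%",
        "humidity": "52%",
        "wind": "5 mph",
    }
}

FIELDS = ("location", "date", "weather", "temperature",
          "precipitation", "humidity", "wind")


def pick_weather(results):
    """Collect the weather fields, using None for anything that is missing."""
    answer_box = results.get("answer_box", {})
    return {field: answer_box.get(field) for field in FIELDS}


print(pick_weather(results))
print(pick_weather({}))  # no answer box at all: every field is None
```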
Disclaimer: I work for SerpApi.