Google is not randomizing classes as baduker mentioned. They could change some class
names over time but they're not randomizing them.
One of the reasons why you get an empty result is because you haven't specified HTTP user-agent
aka headers
, thus Google might block your request and headers
might help to avoid it. You can check what is your user-agent
here. Headers will look like this:
headers = {
'User-agent':
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}
requests.get('YOUR URL', headers=headers)
Also, you don't need to use find_all()
/findAll()
or select()
since you're trying to get only one occurrence, not all of them. Use instead:
find('ELEMENT NAME', class_='CLASS NAME')
select_one('.CSS_SELECTORs')
select()
/select_one()
usually faster.
Code and example in the online IDE (note: the number of results will always differ. It just works this way.):
import requests, lxml
from bs4 import BeautifulSoup
headers = {
"User-Agent":
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}
params = {
"q": "fus ro dah defenition",
"gl": "us",
"hl": "en"
}
response = requests.get('https://www.google.com/search', headers=headers, params=params)
soup = BeautifulSoup(response.text, 'lxml')
number_of_results = soup.select_one('#result-stats nobr').previous_sibling
print(number_of_results)
# About 104,000 results
Alternatively, you achieve the same thing using Google Organic Results API from SerpApi, except you don't need to figure out why certain things don't work and instead iterate over structured JSON string and get the data you want.
Code:
import os
from serpapi import GoogleSearch
params = {
"engine": "google",
"q": "fus ro dah defenition",
"api_key": os.getenv("API_KEY"),
}
search = GoogleSearch(params)
results = search.get_dict()
result = results["search_information"]['total_results']
print(result)
# 104000
Disclaimer, I work for SerpApi.