You can use an f-string instead, which in my opinion is a more Pythonic way to do string formatting:
requests.get(f"https://www.google.dz/search?q={url}")
# or
for query in queries:
    html = requests.get(f"https://www.google.dz/search?q={query}")
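One caveat with interpolating the query straight into the URL: characters such as `&` or `#` inside a query would change the meaning of the URL. A minimal sketch of encoding the query first with the standard library's `urlencode`, reusing the `queries` list from the code further down (the `urls` name is only for illustration):

```python
from urllib.parse import urlencode

queries = ["Samsung S9", "Samsung S8", "Samsung Note 9"]

# urlencode escapes spaces and reserved characters in the query value
# before it is placed into the URL.
urls = [
    f"https://www.google.dz/search?{urlencode({'q': query})}"
    for query in queries
]

for url in urls:
    print(url)
```

Passing a `params` dict to `requests.get`, as in the full example, takes care of this encoding for you.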
Note that the next problem you might run into is Google blocking your request because no user-agent is specified. The default requests user-agent is python-requests, so Google can recognize that the request is not a "real" user visit and block it. Check what your user-agent is.
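To see this for yourself without sending anything to Google, here is a small sketch that prints the default user-agent and then prepares (but does not send) a request with a custom one. The `custom_ua` and `prepared` names are only for illustration, and the exact version suffix of the default value depends on the installed requests version:

```python
import requests

# The user-agent that requests sends when none is set explicitly,
# something like "python-requests/<version>".
default_ua = requests.utils.default_user_agent()
print(default_ua)

custom_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
)

# Build the request without sending it, to inspect the headers it would carry.
session = requests.Session()
session.headers.update({"User-Agent": custom_ua})
prepared = session.prepare_request(
    requests.Request("GET", "https://www.google.com/search", params={"q": "Samsung S9"})
)
print(prepared.headers["User-Agent"])
```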
Code:
from bs4 import BeautifulSoup
import requests, lxml  # lxml is the parser passed to BeautifulSoup below

# Send a browser-like user-agent instead of the default python-requests one
headers = {
    "User-agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

queries = ["Samsung S9", "Samsung S8", "Samsung Note 9"]

for query in queries:
    params = {
        "q": query,   # search query
        "gl": "uk",   # country to search from
        "hl": "en"    # interface language
    }

    html = requests.get("https://www.google.com/search", headers=headers, params=params)
    soup = BeautifulSoup(html.text, "lxml")

    # Each organic result sits in a .tF2Cxc container
    for result in soup.select('.tF2Cxc'):
        title = result.select_one('.DKV0Md').text
        link = result.select_one('.yuRUbf a')['href']
        print(f"{title}\n{link}\n")
-------
'''
Samsung Galaxy S9 and S9+ | Buy or See Specs
https://www.samsung.com/uk/smartphones/galaxy-s9/
Samsung Galaxy S9 - Full phone specifications - GSMArena ...
https://www.gsmarena.com/samsung_galaxy_s9-8966.php
...
Samsung Galaxy S8 - Wikipedia
https://en.wikipedia.org/wiki/Samsung_Galaxy_S8
Samsung Galaxy S8 Price in India - Gadgets 360
https://gadgets.ndtv.com/samsung-galaxy-s8-4009
...
Samsung Galaxy Note 9 Cases - Mobile Fun
https://www.mobilefun.co.uk/samsung/galaxy-note-9/cases
Samsung Galaxy Note 9 - Wikipedia
https://en.wikipedia.org/wiki/Samsung_Galaxy_Note_9
'''
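The CSS classes used above come from Google's markup at the time of writing and are not a stable interface, so `select_one` can return `None` and the `.text` access would then raise. A minimal sketch of a more defensive extraction, run against a hand-written HTML snippet rather than a real Google page (`SAMPLE_HTML` and `extract_results` are hypothetical names, and `html.parser` is used here only so the sketch does not depend on lxml):

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for a results page: the second block has no title element.
SAMPLE_HTML = """
<div class="tF2Cxc">
  <div class="yuRUbf"><a href="https://example.com/s9"><h3 class="DKV0Md">Example S9 page</h3></a></div>
</div>
<div class="tF2Cxc">
  <div class="yuRUbf"><a href="https://example.com/no-title"></a></div>
</div>
"""


def extract_results(html):
    """Return (title, link) pairs, skipping results with missing pieces."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for result in soup.select(".tF2Cxc"):
        title = result.select_one(".DKV0Md")
        link = result.select_one(".yuRUbf a")
        # Skip the result instead of crashing when a selector finds nothing.
        if title is None or link is None or not link.has_attr("href"):
            continue
        results.append((title.get_text(strip=True), link["href"]))
    return results


parsed = extract_results(SAMPLE_HTML)
for title, link in parsed:
    print(f"{title}\n{link}\n")
```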
Alternatively, you can achieve the same thing by using the Google Organic Results API from SerpApi. It's a paid API with a free plan.
The difference in your case is that you don't need to think about how to extract certain things or figure out why something isn't working as it should. All you need to do is iterate over structured JSON and pick out the data you want.
Code to integrate:
import os
from serpapi import GoogleSearch

queries = ["Samsung S9", "Samsung S8", "Samsung Note 9"]

for query in queries:
    params = {
        "engine": "google",
        "q": query,
        "hl": "en",
        "gl": "uk",
        "api_key": os.getenv("API_KEY"),
    }

    search = GoogleSearch(params)
    results = search.get_dict()

    for result in results["organic_results"]:
        print(result['title'])
        print(result['link'])
        print()
------
'''
Samsung Galaxy S9 and S9+ | Buy or See Specs
https://www.samsung.com/uk/smartphones/galaxy-s9/
Samsung Galaxy S9 - Full phone specifications - GSMArena ...
https://www.gsmarena.com/samsung_galaxy_s9-8966.php
...
Samsung Galaxy S8 - Wikipedia
https://en.wikipedia.org/wiki/Samsung_Galaxy_S8
Samsung Galaxy S8 Price in India - Gadgets 360
https://gadgets.ndtv.com/samsung-galaxy-s8-4009
...
Samsung Galaxy Note 9 Cases - Mobile Fun
https://www.mobilefun.co.uk/samsung/galaxy-note-9/cases
Samsung Galaxy Note 9 - Wikipedia
https://en.wikipedia.org/wiki/Samsung_Galaxy_Note_9
'''
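If you want the loop to survive a response that has no `organic_results` key (for example, one that only carries an error message), a small sketch like the following may help. It runs over hand-written dictionaries, not real API output; `sample_ok`, `sample_error` and `iter_organic` are hypothetical names, and the shape of the error response is an assumption:

```python
# Hand-written sample data standing in for search.get_dict() results.
sample_ok = {
    "organic_results": [
        {"title": "Example S9 page", "link": "https://example.com/s9"},
        {"title": "Entry without a link"},
    ]
}
sample_error = {"error": "hypothetical error message"}


def iter_organic(results):
    """Yield (title, link) pairs, tolerating missing keys."""
    for result in results.get("organic_results", []):
        title = result.get("title")
        link = result.get("link")
        if title and link:
            yield title, link


ok_pairs = list(iter_organic(sample_ok))
error_pairs = list(iter_organic(sample_error))

for title, link in ok_pairs:
    print(title)
    print(link)
    print()
```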
Disclaimer: I work for SerpApi.