I wrote some code that scrapes Google News search results, but it only ever scrapes the first page. How can I write a loop that scrapes the first 2, 3, ..., n pages?
I know that I need to add a page parameter to the url and put everything in a for loop, but I do not know how.
This code gives me the headlines, snippets and dates from the first page of search results:
from bs4 import BeautifulSoup
import requests
headers = {'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}
term = 'usa'
url = 'https://www.google.com/search?q={0}&source=lnms&tbm=nws'.format(term)  # I need to add a page parameter here, but I do not know how
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
headline_text = soup.find_all('h3', class_='r dO0Ag')
snippet_text = soup.find_all('div', class_='st')
news_date = soup.find_all('div', class_='slp')
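
For reference, the loop described above might look roughly like the sketch below. It rests on the assumption that Google paginates search results with a `start` query parameter holding the result offset (0 for page one, 10 for page two, and so on), at 10 results per page. The names `build_page_urls`, `scrape_pages` and `RESULTS_PER_PAGE` are hypothetical, and the CSS classes are simply the ones from the code above, which may not match the markup Google currently serves.

```python
from urllib.parse import urlencode

# Assumption: 10 results per page, offset passed in the `start` parameter.
RESULTS_PER_PAGE = 10


def build_page_urls(term, n_pages):
    """Return one Google News search URL per page for the first n_pages."""
    urls = []
    for page in range(n_pages):
        params = {
            'q': term,
            'source': 'lnms',
            'tbm': 'nws',
            'start': page * RESULTS_PER_PAGE,  # 0, 10, 20, ...
        }
        urls.append('https://www.google.com/search?' + urlencode(params))
    return urls


def scrape_pages(term, n_pages, headers):
    """Fetch each page and collect headlines, snippets and dates."""
    # Imported here so that build_page_urls works without third-party packages.
    import requests
    from bs4 import BeautifulSoup

    headlines, snippets, dates = [], [], []
    for url in build_page_urls(term, n_pages):
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.text, 'html.parser')
        # Same selectors as in the single-page code above.
        headlines.extend(soup.find_all('h3', class_='r dO0Ag'))
        snippets.extend(soup.find_all('div', class_='st'))
        dates.extend(soup.find_all('div', class_='slp'))
    return headlines, snippets, dates
```

Calling `scrape_pages('usa', 3, headers)` would then request the first three pages one after another. Building the URLs in a separate function keeps the pagination logic easy to check without sending any requests.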
Also, can this logic for Google News pages be applied to, for example, Bing News or Yahoo News? I mean, can I use the same parameter, or is the url different?