I'm trying to extract all the URLs under https://www.scotts.com/en-us/library/lawn-food
However, I have noticed that the result is missing some URLs, such as https://www.scotts.com/en-us/library/lawn-food/when-feed-greener-lawn and a few more.
My code snippet is below:
import time
from random import randint
import requests
from bs4 import BeautifulSoup, SoupStrainer
import re

def scrape_google_summaries(url):
    time.sleep(randint(0, 2))  # pause briefly so the server is not hit too fast
    r = requests.get(url)
    content = r.text
    # Parse only the <a> tags that carry an href attribute
    soup = BeautifulSoup(content, "html.parser", parse_only=SoupStrainer('a', href=True))
    summary = []
    # find_all skips non-tag nodes (such as a doctype) that plain iteration may yield
    for link in soup.find_all('a', href=True):
        summary.append(link.get('href'))
    return summary

output = scrape_google_summaries("https://www.scotts.com/en-us/library/lawn-food")
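To make "URLs under the URL" concrete, here is a minimal sketch of how the returned hrefs could be narrowed to pages below the library path. It is only an illustration: `links_under` and `BASE_URL` are hypothetical names that are not part of my script, and I am assuming the hrefs may be relative, absolute, or carry fragments. It only filters what `requests` already returned, so it does not explain why some links are absent from the output.

```python
from urllib.parse import urldefrag, urljoin, urlparse

# Hypothetical constant: the page whose sub-pages I want to collect
BASE_URL = "https://www.scotts.com/en-us/library/lawn-food"


def links_under(base_url, hrefs):
    """Return absolute, de-duplicated URLs from hrefs located below base_url."""
    base = urlparse(base_url)
    prefix = base.path.rstrip("/") + "/"  # e.g. /en-us/library/lawn-food/
    result = []
    for href in hrefs:
        if not href:
            continue
        # Resolve relative links and drop any #fragment
        absolute = urldefrag(urljoin(base_url, href)).url
        parsed = urlparse(absolute)
        # Keep only same-host links whose path sits below the base path
        if parsed.netloc == base.netloc and parsed.path.startswith(prefix):
            if absolute not in result:
                result.append(absolute)
    return result
```

With the script above, this would be called as `links_under(BASE_URL, output)`.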