
What I am trying to do is get the ingredients section from

https://www.walmart.com/ip/Nature-s-Recipe-Chicken-Wild-Salmon-Recipe-in-Broth-Dog-Food-2-75-oz/34199310

So what I did was:

import requests
from bs4 import BeautifulSoup

x = requests.get("https://www.walmart.com/ip/Nature-s-Recipe-Chicken-Wild-Salmon-Recipe-in-Broth-Dog-Food-2-75-oz/34199310")
soup = BeautifulSoup(x.content, "html.parser")
print(soup.find_all("p", {"class": "Ingredients"})[0])

But it's showing index out of range, i.e. no element found, yet when I inspect the website the element does exist: <p class="Ingredients">

Nimish Bansal

1 Answer


Bad news, looks like those elements are generated via JS. If you "view source" of that page, the elements aren't there, and this is the html that requests is getting.
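One quick way to confirm that from Python rather than "view source": fetch the page with requests and check whether the class name appears anywhere in the raw, un-rendered HTML. This just reuses the URL from the question.

import requests

url = "https://www.walmart.com/ip/Nature-s-Recipe-Chicken-Wild-Salmon-Recipe-in-Broth-Dog-Food-2-75-oz/34199310"
html = requests.get(url).text
# False here means the element is added later by JS and isn't in the raw HTML
print('class="Ingredients"' in html)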

I would use something like selenium to automate a browser to get the fully rendered html, then you can use beautifulsoup to parse out the ingredients.
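Something along these lines, as a rough sketch. It assumes Chrome with a matching chromedriver installed, and that the ingredients end up in a <p class="Ingredients"> element once the JS has run (as described in the question).

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = ("https://www.walmart.com/ip/Nature-s-Recipe-Chicken-Wild-Salmon-"
       "Recipe-in-Broth-Dog-Food-2-75-oz/34199310")

driver = webdriver.Chrome()
try:
    driver.get(url)
    # Wait until the JS has actually rendered the ingredients element
    WebDriverWait(driver, 15).until(
        EC.presence_of_element_located((By.CLASS_NAME, "Ingredients"))
    )
    # Hand the fully rendered HTML to BeautifulSoup and parse as before
    soup = BeautifulSoup(driver.page_source, "html.parser")
    ingredients = soup.find("p", class_="Ingredients")
    print(ingredients.get_text(strip=True) if ingredients else "Not found")
finally:
    driver.quit()

Waiting for the element explicitly tends to be more reliable than a fixed sleep, since how long the JS takes to render varies between loads.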

I personally find it very annoying when websites use JS to generate large amounts of content rather than to make the page more interactive etc. But what are ya gonna do...

SuperStew
  • Yeah, I know selenium, but is there no other way to scrape a website using requests? – Nimish Bansal May 04 '18 at 13:46
  • Like the answers to the question in the duplicate link, you could try to mimic the requests that the JS is making, using requests directly (a rough sketch of that approach is below). But this isn't always possible, as the JS isn't always just making a request. Other than that, none that I know of. – SuperStew May 04 '18 at 13:51
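For completeness, a rough sketch of that "mimic the JS request" idea. The endpoint and parameter names here are hypothetical placeholders; the real ones have to be read out of the browser's Network tab while the page loads.

import requests

# HYPOTHETICAL endpoint -- replace with whatever the Network tab shows
api_url = "https://www.walmart.com/some/json/endpoint"
params = {"itemId": "34199310"}            # hypothetical parameter name
headers = {"User-Agent": "Mozilla/5.0"}    # some endpoints reject the default requests UA

resp = requests.get(api_url, params=params, headers=headers)
resp.raise_for_status()
data = resp.json()  # if the endpoint returns JSON, the ingredients may be a field in here
print(data)

If the page builds the content purely client-side with no separate request to copy, this won't work, and automating a browser (Selenium or similar) is the fallback.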