
I have to fetch data from the following webpage.

https://www.snapdeal.com/product/skycandle-purple-magic-mop/624744850271#bcrumbSearch:magic%20mop

I have also attached a screenshot of the page. My goal is to fetch the reviews entered by buyers in the "Customer Reviews" section, which can be found by scrolling down a little.

Screenshot of the 'Customer Reviews' section

import urllib.request

from bs4 import BeautifulSoup

url = "https://www.snapdeal.com/product/skycandle-purple-magic-mop/624744850271#bcrumbSearch:magic%20mop"

# Download the raw HTML that the server returns
page = urllib.request.urlopen(url)

# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(page, "html.parser")

# Print the review blocks
comm = soup.find_all("div", {"class": "reviewareain clearfix"})

print(comm)

But I do not get any output when I run this program. I found the tag and class name above by using 'Inspect element' in Chrome. I suspect I have selected the wrong class name because of the many nested <div> tags in the HTML structure.
I am new to Python, so a simple and comprehensive answer will be appreciated. Also, please suggest some good online material for learning BeautifulSoup, apart from the official documentation.

Mr Lister
Shantanu
  • This is dynamic content, loaded by JavaScript, and cannot be extracted with urllib. You can test what is dynamic by disabling JavaScript in your browser. You have to use something like Selenium instead. – Teemu Risikko Feb 24 '17 at 12:09
  • Possible duplicate of [Web-scraping JavaScript page with Python](http://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python) – Teemu Risikko Feb 24 '17 at 12:09
  • Thank you! Can you recommend a good tutorial on my area of interest, i.e. web scraping? – Shantanu Feb 24 '17 at 12:44
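The suggestion in the comments (render the page in a browser with Selenium, then parse the result) might look roughly like the sketch below. It is not a tested scraper for this site. It assumes that Selenium and a local Chrome installation with a compatible driver are available, and that the reviews really do sit in elements carrying the `reviewareain` and `clearfix` classes quoted in the question. The helper names `extract_reviews` and `fetch_rendered_html` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Assumption: this selector mirrors the class names quoted in the question;
# the live page may use different ones.
REVIEW_SELECTOR = "div.reviewareain.clearfix"


def extract_reviews(html):
    """Return the text of every element matching REVIEW_SELECTOR."""
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text(" ", strip=True) for div in soup.select(REVIEW_SELECTOR)]


def fetch_rendered_html(url, timeout=15):
    """Load the page in a real browser so its JavaScript runs, then return the HTML."""
    # Imported here so extract_reviews() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until at least one review element has been added to the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, REVIEW_SELECTOR))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling `extract_reviews(fetch_rendered_html(url))` with the product URL would then be expected to return the review texts, provided the selector matches. Running `extract_reviews` on the HTML returned by `urllib.request.urlopen` is a quick way to confirm the diagnosis: if it returns an empty list there but not on the rendered page, the reviews are being loaded by JavaScript.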

0 Answers