I'm trying to scrape the titles of all the products listed on a page of an e-commerce site (in this case, Flipkart). Which products get scraped depends on the keyword entered by the user. A typical URL generated when I enter the keyword 'XYZXYZ' is:
http://www.flipkart.com/search?q=XYZXYZ&as=off&as-show=on&otracker=start
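For illustration, a search URL of this shape could be assembled with proper escaping roughly as in the sketch below. The name `build_search_url` is hypothetical (it is not part of my script), and the query parameters are simply the ones visible in the example URL; I am assuming the site accepts them in this order.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

BASE_URL = "http://www.flipkart.com/search"


def build_search_url(keyword):
    """Build the search URL for a keyword (hypothetical helper).

    A list of pairs keeps the parameters in the same order as the
    example URL, and urlencode escapes spaces and special characters.
    """
    params = [
        ("q", keyword),
        ("as", "off"),
        ("as-show", "on"),
        ("otracker", "start"),
    ]
    return BASE_URL + "?" + urlencode(params)


print(build_search_url("XYZXYZ"))
# http://www.flipkart.com/search?q=XYZXYZ&as=off&as-show=on&otracker=start
```

Unlike plain string concatenation, this also copes with keywords that contain spaces or characters such as `&`.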
Using this link as a template, I wrote the following script to scrape the titles of all the products listed on the results page for a given keyword:

    import requests
    from bs4 import BeautifulSoup

    def flipp(k):
        url = "http://www.flipkart.com/search?q=" + str(k) + "&as=off&as-show=on&otracker=start"
        ss = requests.get(url)
        src = ss.text
        obj = BeautifulSoup(src)
        for e in obj.findAll("a", {'class' : 'lu-title'}):
            title = e.string
            print unicode(title)

    h = raw_input("Enter a keyword:")
    print flipp(h)

However, the above script outputs only None. When I tried to debug each step, I found that the requests module appears unable to get the source code of the webpage. What is happening here?
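A check along the following lines can show what the server actually sent back before any parsing happens. It is only a sketch: `describe_response` and `FakeResponse` are hypothetical names, and the helper relies on nothing more than the `status_code` and `text` attributes that a requests response provides. The stand-in class is there so the example runs without network access.

```python
class FakeResponse(object):
    """Offline stand-in for a requests response (hypothetical, for the demo only)."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text


def describe_response(resp, css_class="lu-title"):
    """Summarise a response: HTTP status, body size, and whether the
    class name the scraper looks for appears anywhere in the raw HTML."""
    body = resp.text
    return {
        "status": resp.status_code,
        "length": len(body),
        "has_class": css_class in body,
    }


# With the real library this would be: describe_response(requests.get(url))
print(describe_response(FakeResponse(200, "<html><body>No results</body></html>")))
```

If the status is 200 but `has_class` is False, the page was fetched and the markup simply does not contain that class in the HTML the server returned, which points at the selector or at content loaded later by JavaScript rather than at the request itself.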