I was trying to scrape the Flipkart website to get product IDs. I used this link to list all the products. The product URL holds the PID, so I was trying to get the URLs. The following is my code:
>>> from bs4 import BeautifulSoup
>>> import requests
>>> url = "https://www.flipkart.com/search?q=samsung%20mobiles&otracker=start&as-show=on&as=off"
>>> data = requests.get(url, headers={
...     "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36"
... }).content
>>> soup = BeautifulSoup(data, "lxml")
>>> soup.find_all('div', 'col zZCdz4')
[]
But it returned an empty list, so I tried using their API to get the product IDs instead. The following is the code:
>>> import requests
>>> headers = ({"x-user-agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.92 Safari/537.36 FKUA/website/41/website/Desktop"})
>>> data = requests.get("https://affiliate-api.flipkart.net/affiliate/1.0/booksApi/jerilwork.json", headers=headers)
It returned some details in JSON format. The following is a single item from the dictionary:
{"name":"Cursive Writing","url":"https://affiliate-api.flipkart.net/affiliate/1.0/booksFeeds/jerilwork/category/bks-fnf-fs6-mak-8lf.json?expiresAt=1479434177786&sig=4710ea4a9633e4e046938c7d47cf53b7","id":"8lf","subCategories":[]}
Their API documentation mentions that the URL above, "url":"https://affiliate-api.flipkart.net/affiliate/1.0/booksFeeds/jerilwork/category/bks-fnf-fs6-mak-8lf.json?expiresAt=1479434177786&sig=4710ea4a9633e4e046938c7d47cf53b7",
can be used to get the product IDs. I tried it, but it returns an empty list.
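The feed URL carries `expiresAt` and `sig` query parameters, so one thing worth ruling out is that the link had already expired when it was requested. Here is a minimal sketch for decoding that value. It assumes `expiresAt` is a Unix timestamp in milliseconds, which is a guess from the size of the number and not something confirmed above; `feed_expiry` is just an illustrative helper name.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs

# Feed URL copied from the JSON item shown above.
feed_url = (
    "https://affiliate-api.flipkart.net/affiliate/1.0/booksFeeds/jerilwork/"
    "category/bks-fnf-fs6-mak-8lf.json"
    "?expiresAt=1479434177786&sig=4710ea4a9633e4e046938c7d47cf53b7"
)


def feed_expiry(url):
    """Return the expiresAt query value as a UTC datetime, or None if missing."""
    params = parse_qs(urlparse(url).query)
    raw = params.get("expiresAt")
    if not raw:
        return None
    # Assumption: expiresAt is a Unix timestamp in milliseconds.
    return datetime.fromtimestamp(int(raw[0]) / 1000, tz=timezone.utc)


expiry = feed_expiry(feed_url)
print("expires at:", expiry)
print("already expired:", expiry < datetime.now(timezone.utc))
```

If the link turns out to be expired, fetching a fresh category listing before requesting the feed might be enough, though that alone may not explain the empty result.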
Can someone help me with this? Where am I making a mistake? How can I get the product IDs for any category (e.g. mobile phones or Samsung mobile phones)? Kindly help.
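To illustrate what "the product URL holds the PID" means, here is a minimal sketch of the extraction step. It assumes the PID is carried in a `pid` query parameter. The URL and PID value below are made-up placeholders, not a real listing, and `extract_pid` is only an illustrative helper name.

```python
from urllib.parse import urlparse, parse_qs


def extract_pid(product_url):
    """Return the value of the `pid` query parameter, or None if it is absent."""
    # Assumption: the PID appears as a `pid` query parameter in the product URL.
    query = parse_qs(urlparse(product_url).query)
    values = query.get("pid")
    return values[0] if values else None


# Made-up placeholder URL used only to exercise the helper.
example_url = (
    "https://www.flipkart.com/some-product/p/itmexample"
    "?pid=EXAMPLEPID123&otracker=search"
)
print(extract_pid(example_url))
```

This only covers pulling the PID out of a URL once the product links are available; it does not solve the empty `find_all` result.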