I am trying to scrape data from the website www.vestiairecollective.com.
While scraping, I have access to only a few of its main pages. For example, my script cannot scrape the data for the URL http://www.vestiairecollective.com/women-bags/handbags/#_=catalog&id_brand%5B%5D=50&material%5B%5D=3&step=180
I have referred to many Stack Overflow questions that show how to do this. Since I am using Python 3.5 on Windows, "mechanize" and "cookielib" don't work. I also saw a few questions pointing to libraries like "robobrowser" that can do the job; I tried that too and got stuck partway.
Then I tried sessions, but when I call requests.Sessions() it raises an AttributeError saying the requests module has no attribute "Sessions".
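For reference, the class on the requests module is named Session (singular, capital S); requests.Sessions does not exist, which is what triggers the AttributeError. A minimal check:

```python
import requests

# The session class is requests.Session; "requests.Sessions"
# is not an attribute of the module, hence the AttributeError.
session = requests.Session()
print(type(session).__name__)  # -> Session
```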
Please help me, either with robobrowser or any other approach, with code for this particular website using the URL mentioned above.
This is what I have tried after referring to the answer:
import urllib.request
from bs4 import BeautifulSoup
import requests

session = requests.Session()
loginUrl = 'http://www.vestiairecollective.com/'
resLogin = session.post(loginUrl, data={'h': '5fcdc0ac04537595a747e2830037cca0',
                                        'email': 'something@gmail.com',
                                        'password': 'somepasswrd',
                                        'ga_client_id': '750706459.1463098234'})
url = 'http://www.vestiairecollective.com/women-bags/handbags/#_=catalog&id_brand%5B%5D=50&material%5B%5D=3'
res = session.get(url)
# The URL below is the one I actually want to scrape from
crl = urllib.request.urlopen("http://www.vestiairecollective.com/women-bags/handbags/#_=catalog&id_brand%5B%5D=50&material%5B%5D=3")
soup = BeautifulSoup(crl.read(), "html.parser")
geturl = soup.find_all("div", {"class": "expand-snippet-container"})
for i in geturl:  # the scraping part
    data1 = i.find_all("p", {"class": "brand"})
    datac1 = [da.contents[0] for da in data1]
    brdata = "\n".join(datac1)
    print(brdata)
Here the scraping should be done from the page fetched into "crl", but the results are coming from the main page itself.
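One detail worth checking (my own assumption about the cause, not something confirmed by the site): everything after "#" in a URL is a fragment, which the client never sends to the server, so both session.get and urlopen request only /women-bags/handbags/ and receive the unfiltered main page back. The standard library makes this visible:

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://www.vestiairecollective.com/women-bags/handbags/"
       "#_=catalog&id_brand%5B%5D=50&material%5B%5D=3")

parts = urlsplit(url)
print(parts.path)      # /women-bags/handbags/  (the only path the server sees)
print(parts.fragment)  # the filters live in the fragment, dropped client-side
print(parse_qs(parts.fragment))
```

If that is indeed the cause, the filters would need to be sent in a form the server actually receives, for example as real query parameters or through whatever AJAX endpoint the page calls (visible in the browser's network tab); the parameter names above come from the fragment itself, not from any documented API.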