I am trying to crawl an authenticated website with the code below. I log in successfully, but when I send another FormRequest, I am redirected to the login page again. It seems that the session/cookies are not kept by Scrapy?
In the Scrapy docs here, if I send another request, is the session not kept? If so, what does the comment "# continue scraping with authenticated session..." mean at all?
Any ideas? Thank you.
import scrapy
from scrapy.utils.response import open_in_browser


class LoginSpider(scrapy.Spider):
    name = 'login_spider'
    start_urls = ['https://example.com/login']

    def parse(self, response):
        yield scrapy.FormRequest.from_response(
            response,
            formdata={'username': 'username', 'password': 'password'},
            callback=self.after_login
        )

    def after_login(self, response):
        if "Authenticated" in response.body.decode("utf-8"):
            # continue scraping with authenticated session...
            url = 'https://example.com/search'
            yield scrapy.FormRequest(
                url,
                formdata={'from': '09/24/2017', 'to': '09/25/2017'},
                callback=self.parse_something
            )
        else:
            self.logger.error("Login failed")
            return

    def parse_something(self, response):
        open_in_browser(response)
        self.logger.error(response.body)
        return
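
One way to check whether the session cookie actually survives the login would be to turn on Scrapy's cookie logging. This is only a minimal sketch, assuming a standard project settings.py (it is not part of the spider above); COOKIES_ENABLED and COOKIES_DEBUG are existing Scrapy settings:

```python
# settings.py (assumed location; a spider's custom_settings dict could be used instead)

# Cookie handling is on by default; set explicitly here to rule out an override.
COOKIES_ENABLED = True

# Log every Cookie header sent and every Set-Cookie header received,
# so the login response and the follow-up search request can be compared.
COOKIES_DEBUG = True
```

With this in place, the log should show whether the cookie set by the login response is sent with the request to https://example.com/search.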