I am writing a spider with Scrapy in Python 3, and I only started using Scrapy recently. I am scraping data from a website, and after a few minutes the site starts responding with a 302
status and redirecting to another URL to verify me. I want to save the original URL to a file.
For example, https://www.test.com/article?id=123
is the URL I want to request; the response is a 302
that redirects to https://www.test.com/vrcode.
In that case I want to save https://www.test.com/article?id=123
to a file. How should I do that?
import scrapy

from myproject.items import LocationItem  # LocationItem lives in my project's items module


class CatchData(scrapy.Spider):
    name = 'test'
    allowed_domains = ['test.com']
    start_urls = [
        'https://www.test.com/article?id=1',
        'https://www.test.com/article?id=2',
        # ...
    ]

    def parse(self, response):
        item = LocationItem()
        item['article'] = response.xpath('...')
        yield item
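From reading the docs, my guess is that setting handle_httpstatus_list on the spider would let the 302 response reach parse instead of being followed by the redirect middleware, so I could write response.url to a file there. Is something like this sketch the right direction? (302_urls.txt is just a placeholder file name I made up):

import scrapy

from myproject.items import LocationItem  # same hypothetical items module as above


class CatchData(scrapy.Spider):
    name = 'test'
    allowed_domains = ['test.com']
    start_urls = ['https://www.test.com/article?id=1']
    # Let 302 responses through to the callback instead of
    # having RedirectMiddleware follow them.
    handle_httpstatus_list = [302]

    def parse(self, response):
        if response.status == 302:
            # The redirect was not followed, so response.url is still
            # the originally requested URL.
            with open('302_urls.txt', 'a') as f:
                f.write(response.url + '\n')
            return
        item = LocationItem()
        item['article'] = response.xpath('...')
        yield item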
I found an answer in How to get the scrapy failure URLs?,
but it is six years old. I want to know whether there is a simpler way to do this now.
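Or, if changing the whole spider is overkill, would the per-request meta keys (dont_redirect and handle_httpstatus_list, which I saw in the RedirectMiddleware docs) do the same thing? A minimal sketch of what I mean:

    def start_requests(self):
        for url in self.start_urls:
            # Disable redirect handling for this request only, so a 302
            # comes back to parse() with the original URL intact.
            yield scrapy.Request(
                url,
                callback=self.parse,
                meta={'dont_redirect': True, 'handle_httpstatus_list': [302]},
            )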