I am facing a problem with my custom retry middleware in Scrapy. My project consists of 6 spiders, launched from a small script containing a CrawlerProcess(), each crawling a different website. They are supposed to run simultaneously, and here is the problem: I implemented a retry middleware to resend requests that were refused with a 429 (Too Many Requests) status code, like this:
```python
def process_response(self, request, response, spider):
    if response.status == 429:
        # Pause the whole engine while we wait, then retry the request
        self.crawler.engine.pause()
        time.sleep(self.RETRY_AFTER.get(spider.name) * (random.random() + 0.5))
        self.crawler.engine.unpause()
        reason = response_status_message(response.status)
        return self._retry(request, reason, spider) or response
    return response
```
But of course, since time.sleep blocks, when I execute all the spiders simultaneously with a CrawlerProcess(), every spider is paused (because the whole process sleeps). Is there a better way to pause only the spider that received the 429, retry its request after a delay, and let the other spiders keep crawling? My application has to scrape several sites at the same time, and each spider should resend a request whenever it gets a 429, so how can I replace that time.sleep() with something that stops only the spider entering this method?
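
The only alternative I can think of is to replace the blocking sleep with a Twisted delayed call, so that only the offending request waits while the reactor keeps serving the other spiders. Below is a rough sketch of what I mean, using `twisted.internet.task.deferLater` to hand the retried request back after the delay. The class name is just a placeholder, `RETRY_AFTER` is the same per-spider delay mapping I already use, and I am not sure a downloader middleware's `process_response` is even allowed to return a Deferred:

```python
import random

from twisted.internet import reactor
from twisted.internet.task import deferLater

from scrapy.downloadermiddlewares.retry import RetryMiddleware
from scrapy.utils.response import response_status_message


class TooManyRequestsRetryMiddleware(RetryMiddleware):
    # Placeholder name; RETRY_AFTER would be the same per-spider delay
    # mapping used in my current middleware.

    def process_response(self, request, response, spider):
        if response.status != 429:
            return response

        reason = response_status_message(response.status)
        delay = self.RETRY_AFTER.get(spider.name) * (random.random() + 0.5)

        def retry_later():
            # Build the retried request, or fall back to the original response
            return self._retry(request, reason, spider) or response

        # Fire after `delay` seconds without blocking the reactor, so only
        # this request waits while the other spiders keep crawling.
        return deferLater(reactor, delay, retry_later)
```

Would something like this work, or is there a cleaner, more idiomatic way to delay a single spider's requests after a 429?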