I have the following spider, which requests the start_urls and, for every URL found there, has to make many sub-requests.
def parse(self, response):
    print(response.request.headers['User-Agent'])

    for info in response.css('div.infolist'):
        item = MasterdataScraperItem()
        info_url = BASE_URL + info.css('a::attr(href)').get()  # URL to subpage
        print('Subpage: ' + info_url)
        item['name'] = info.css('img::attr(alt)').get()
        yield scrapy.Request(info_url, callback=self.parse_info, meta={'item': item})
The for loop in the code above runs around 200 times, and after around 100 iterations I get HTTP status code 429 (Too Many Requests).
My idea was to set DOWNLOAD_DELAY to 3.0, but somehow the delay is not applied to these requests; scrapy.Request is just called directly a few hundred times.
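For reference, this is roughly what I added to settings.py (the exact surrounding settings are omitted here):

```python
# settings.py
# Delay (in seconds) that I expected Scrapy to wait between downloads
DOWNLOAD_DELAY = 3.0
```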
Is there a way to wait n seconds before the next scrapy.Request is issued?