start_requests is a generator: it uses yield to hand requests to Scrapy one at a time, and Scrapy places each yielded request in its scheduler queue. To understand yield fully, read this StackOverflow answer.
Here is an example of how the start_requests method works with start_urls:
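import scrapy

class ExampleSpider(scrapy.Spider):  # hypothetical spider, for illustration only
    name = "example"
    # URLs need a scheme; scrapy.Request rejects a bare "url1.com"
    start_urls = [
        "http://url1.com",
        "http://url2.com",
    ]
    def start_requests(self):
        # Yield one Request per start URL; self.parse handles each response
        for u in self.start_urls:
            yield scrapy.Request(u, callback=self.parse)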
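To see what yield is doing here without running a crawl, the following plain-Python sketch may help. It does not use Scrapy at all: `fake_start_requests`, the `produced` list, and the URLs are made-up names for illustration, and the generator yields plain strings rather than `scrapy.Request` objects. It only shows that calling a generator function runs none of its body until the consumer asks for the next item, which is what lets a consumer pull requests on demand.

```python
produced = []  # records which URLs the generator body has actually reached


def fake_start_requests(urls):
    # Stand-in for a spider's start_requests(); yields strings, not Requests.
    for u in urls:
        produced.append(u)
        yield u


gen = fake_start_requests(["http://url1.com", "http://url2.com"])
after_call = list(produced)    # snapshot: the body has not started running

first = next(gen)              # runs the body up to the first yield
after_first = list(produced)   # snapshot: only the first URL was reached

rest = list(gen)               # drains the remaining items
```

After the call, `after_call` is empty; the first URL is only produced once `next(gen)` is called.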
To control the order in which queued requests are processed, use the priority argument of scrapy.Request.
def start_requests(self):
    # priority=1 outranks the default of 0 when both requests are waiting in the queue
    yield scrapy.Request(self.start_urls[0], callback=self.parse)
    yield scrapy.Request(self.start_urls[1], callback=self.parse, priority=1)
Among the requests waiting in the queue, the one with the higher priority value is processed first. The default priority is 0, and negative values are allowed to mark relatively low-priority requests.
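The ordering rule can be illustrated with a toy priority queue built on the standard library. This is only a sketch of the idea, not Scrapy's scheduler: `ToyScheduler`, its methods, and the third URL are hypothetical names, and breaking ties in insertion order is an assumption of this sketch (how Scrapy orders requests of equal priority depends on its queue settings).

```python
import heapq
from itertools import count


class ToyScheduler:
    """Toy model of priority ordering; NOT Scrapy's actual scheduler."""

    def __init__(self):
        self._heap = []
        self._order = count()  # insertion counter; ties resolve FIFO (assumption)

    def enqueue(self, url, priority=0):
        # heapq is a min-heap, so negate priority to pop the largest value first
        heapq.heappush(self._heap, (-priority, next(self._order), url))

    def next_request(self):
        # Return the URL with the highest priority still in the queue
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


sched = ToyScheduler()
sched.enqueue("http://url1.com")               # default priority 0
sched.enqueue("http://url2.com", priority=1)   # higher value, dequeued first
sched.enqueue("http://url3.com", priority=-1)  # negative value, dequeued last

order = [sched.next_request() for _ in range(len(sched))]
```

Here `order` comes out as url2, url1, url3, even though url1 was enqueued first.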