I have, for example, 400 pages to crawl, and some of them may return 3xx or 4xx responses. I would like the Scrapy job to stop automatically once the number of bad requests reaches a threshold, say 100. Thanks!
You can keep the error count in a few different places:
- A counter stored as a class attribute on the spider (not recommended in general, but probably the simplest solution)
- The database, updated through item pipelines
Once the counter reaches the limit you have configured, you can stop the crawler from a callback with:

from scrapy.exceptions import CloseSpider

if errors > max_number_errors:
    raise CloseSpider('too many bad responses')
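
For completeness, here is a minimal sketch of the class-attribute approach. The spider name, URL pattern, and the threshold of 100 are placeholders; HTTPERROR_ALLOW_ALL and REDIRECT_ENABLED are set so that 3xx/4xx responses actually reach the callback instead of being handled by the default middlewares:

import scrapy
from scrapy.exceptions import CloseSpider

class LimitedErrorsSpider(scrapy.Spider):
    name = "limited_errors"
    # Hypothetical URL pattern standing in for your 400 pages.
    start_urls = ["https://example.com/page/%d" % i for i in range(1, 401)]

    max_errors = 100   # stop once this many bad responses have been seen
    error_count = 0    # simple class-level counter

    custom_settings = {
        # Let non-2xx responses through to parse() instead of dropping them.
        "HTTPERROR_ALLOW_ALL": True,
        # Keep 3xx responses visible by disabling the redirect middleware.
        "REDIRECT_ENABLED": False,
    }

    def parse(self, response):
        if response.status >= 300:
            type(self).error_count += 1
            if type(self).error_count >= self.max_errors:
                raise CloseSpider("too many bad responses")
            return
        # Normal extraction for successful pages goes here.
        yield {"url": response.url, "status": response.status}

Note that raising CloseSpider does not abort requests that are already in flight; those will still finish before the spider closes.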
Alternatively, as suggested in this answer, you can send the shutdown signal directly (note that the scrapy.project module only exists in very old Scrapy versions):

from scrapy.project import crawler
crawler._signal_shutdown(9, 0)
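
On current Scrapy releases the rough equivalent is to go through the crawler object attached to the spider, e.g. from inside a callback:

# self.crawler is set by Scrapy when the spider is created.
self.crawler.engine.close_spider(self, "too many bad responses")

Since engine.close_spider is an internal API, raising CloseSpider as shown above is generally the preferred way.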

You can do it in the crawler class, for instance, but that is the "easy" solution rather than the best one. – ferran87 Jan 14 '20 at 15:05