
So I am writing a validator for my Scrapy data and want the spider to stop crawling if the data is in an incorrect format. I am doing this in Pipeline.py.

I have already tried calling CloseSpider, close_spider and crawler._signal_shutdown(9,0) (all of which are used in other tutorials but for some reason don't work in pipeline.py). I am aware that the spider does not finish straight away, but all of the above methods seem to raise some sort of error. Is there just a straightforward way to kill the crawler?
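For illustration, one of the attempts described above might look roughly like the sketch below (the pipeline name and field check are placeholders, not the asker's actual code). CloseSpider is only handled inside spider callbacks, which would explain why raising it from a pipeline just produces an error:

    from scrapy.exceptions import CloseSpider

    class ValidationPipeline:
        def process_item(self, item, spider):
            # Placeholder format check
            if 'required_field' not in item:
                # CloseSpider is meant to be raised from spider callbacks;
                # raised here it is logged as an item-processing error instead
                raise CloseSpider('data in incorrect format')
            return item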

  • If you get an error, then why didn't you show it? Always put the full error message (starting at the word "Traceback") in the question (not a comment) as text (not a screenshot). It contains other useful information. – furas Jul 30 '19 at 09:23

2 Answers


Your scraper keeps working because it has already scheduled a number of requests, and CloseSpider was created for a graceful shutdown. That means every request that is already in progress will be completed or cancelled before the crawler is closed. How exactly do you call close_spider()?
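For reference, a common way to trigger that graceful shutdown from inside a pipeline is to ask the running engine to close the spider. The sketch below is illustrative rather than this answer's original code; the ValidationPipeline name and is_valid() check are placeholders:

    from scrapy.exceptions import DropItem

    class ValidationPipeline:
        def process_item(self, item, spider):
            # Placeholder validation hook
            if not self.is_valid(item):
                # Ask the engine to close the spider gracefully;
                # in-flight requests still finish before it shuts down
                spider.crawler.engine.close_spider(spider, reason='invalid item format')
                raise DropItem('Item failed validation')
            return item

        def is_valid(self, item):
            # Replace with real format checks
            return True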

– amarynets

Try the code below to kill the spider process:

    raise SystemExit
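In a pipeline this would sit inside process_item, roughly as in the sketch below (the format check is a placeholder, not the answer's original code). Keep in mind that SystemExit ends the whole Python process immediately instead of going through Scrapy's graceful shutdown:

    class ValidationPipeline:
        def process_item(self, item, spider):
            # Placeholder format check
            if 'required_field' not in item:
                # Exits the entire process straight away,
                # skipping Scrapy's normal shutdown sequence
                raise SystemExit
            return item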
– 0x01h