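import scrapy
from scrapy.exceptions import CloseSpider  # only needed if the raise below is re-enabled


class PythonEventsSpider(scrapy.Spider):
    name = 'goodspider'
    start_urls = ['https://www.amazon.com/s?me=A33IZBYF4IBZTP&marketplaceID=ATVPDKIKX0DER']
    details = []

    def parse(self, response):
        # code here
        # Relative link to the next results page; None when there is no "next" link.
        next_href = response.xpath('//li[@class="a-last"]/a/@href').extract_first()
        if next_href is None:
            return
        next_page = response.urljoin(next_href)
        print(next_page)
        if "page=3" not in next_page:
            yield scrapy.Request(url=next_page, callback=self.parse)
        else:
            # raise CloseSpider('bandwidth_exceeded')
            # exit("Done")
            pass  # placeholder: this is where the crawl should stop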

Hello, I would like to stop the program when it reaches page 3. The URL will be as follows: https://www.amazon.com/s?i=merchant-items&me=A33IZBYF4IBZTP&page=3&marketplaceID=ATVPDKIKX0DER&qid=1555628764&ref=sr_pg_3. I have tried some of the answers online, but they didn't work; the program kept running. What I want is to add a line or a function in the `else` statement that ends the run started with `scrapy runspider test.py -o test.csv`.

Granitosaurus
hadesfv
  • The documentation points at raising `CloseSpider`. What is the exact behaviour you see when you comment your `raise CloseSpider` line back in? https://docs.scrapy.org/en/latest/topics/exceptions.html#scrapy.exceptions.CloseSpider – Adam Burke Apr 19 '19 at 01:20
  • See also https://stackoverflow.com/questions/27001586/scrapy-not-responding-to-closespider-exception and https://stackoverflow.com/questions/44566184/scrapy-spider-not-terminating-with-use-of-closespider-extension – Adam Burke Apr 19 '19 at 01:49

2 Answers


Raising `CloseSpider` does not stop the crawl at once: the pending requests are still processed.

So you have to set `CONCURRENT_REQUESTS = 1`.

Umair Ayub

If you really want your script to completely stop at that point, you can terminate your script as you would do for any other Python script: use sys.exit().

However, this means that item processing and other parts of the internal workings of Scrapy won't have a chance to run. If that is a problem for you, the only alternative is Umair's answer.

Gallaecio