I am using scrapy to scrape data from a members-only site. I log in and scrape data successfully.

However, I now need to submit some forms on the site once the scraping is finished, i.e., after all the data has been read, I want to write some data back to the site I am scraping (reading) from.

My question is:

How do I get notified when scrapy has finished processing all URLs, so that I can perform some form submissions?

I noticed a solution (see scrapy: Call a function when a spider quits), but I cannot continue yielding more Requests from the self.spider_closed method as it is used in those examples, which is what I would need to do for the write operations.


1 Answer


You are right: you cannot continue using the spider after the spider_closed signal has been fired. It is too late; the spider is already closed at that point.

A better signal to use would be spider_idle:

Sent when a spider has gone idle, which means the spider has no further:

  • requests waiting to be downloaded
  • requests scheduled
  • items being processed in the item pipeline
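
For illustration, here is a minimal sketch of that approach. The spider name, form URL, form fields, and handler names are placeholders, and depending on your Scrapy version, engine.crawl may also require the spider as a second argument (older versions used self.crawler.engine.crawl(request, self)):

```python
import scrapy
from scrapy import signals
from scrapy.exceptions import DontCloseSpider


class MemberSiteSpider(scrapy.Spider):
    name = "member_site"  # placeholder

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        # Connect our handler to the spider_idle signal
        crawler.signals.connect(spider.handle_idle, signal=signals.spider_idle)
        return spider

    def parse(self, response):
        # ... normal login/read/scrape logic goes here ...
        pass

    def handle_idle(self, spider):
        # Fires when there are no more requests or items in flight.
        # Guard so the forms are only submitted once.
        if getattr(self, "forms_submitted", False):
            return
        self.forms_submitted = True

        # Schedule the write request directly with the engine;
        # plain "yield" has no effect inside a signal handler.
        self.crawler.engine.crawl(
            scrapy.FormRequest(
                "https://example.com/submit",  # placeholder URL
                formdata={"field": "value"},   # placeholder form data
                callback=self.after_submit,
            )
        )
        # Keep the spider alive so the new request gets processed.
        raise DontCloseSpider

    def after_submit(self, response):
        self.logger.info("Form submitted, status: %s", response.status)
```

The forms_submitted flag matters because spider_idle can fire again once the form requests themselves finish; without the guard you would loop forever. Raising DontCloseSpider tells Scrapy not to shut the spider down while the newly scheduled requests run.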