I have been trying to run the code below in Spyder. I found it in this question:

import scrapy
import scrapy.crawler as crawler
from multiprocessing import Process, Queue
from twisted.internet import reactor

# your spider
class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ['http://quotes.toscrape.com/tag/humor/']

    def parse(self, response):
        for quote in response.css('div.quote'):
            print(quote.css('span.text::text').extract_first())


# the wrapper to make it run more times
def run_spider(spider):
    def f(q):
        try:
            runner = crawler.CrawlerRunner()
            deferred = runner.crawl(spider)
            deferred.addBoth(lambda _: reactor.stop())
            reactor.run()
            q.put(None)
        except Exception as e:
            q.put(e)

    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    result = q.get()
    p.join()

    if result is not None:
        raise result
        
        
print('first run:')
run_spider(QuotesSpider)

print('\nsecond run:')
run_spider(QuotesSpider)

However, when I run it I get the following error:

AttributeError: Can't pickle local object 'run_spider.<locals>.f'

I have seen that one answer suggested

Had small issue regarding "AttributeError: Can't pickle local object 'run_spider.<locals>.f'", but moving function called f outside resolved my issue, and I could run the code

I tried this by moving the function f outside of run_spider, and even into a different file, but it still does not work.

Any help would be appreciated. Thank you

colla

1 Answer

I tried doing so by placing the function f outside of the run_spider function or even in a different file. But still not working.

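You were close. Define f at module level, pass the spider to it as an argument, and put the calls that start the crawl under an if __name__ == "__main__": guard: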

import scrapy
import scrapy.crawler as crawler
from multiprocessing import Process, Queue
from twisted.internet import reactor


# your spider
class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ['http://quotes.toscrape.com/tag/humor/']

    def parse(self, response):
        for quote in response.css('div.quote'):
            print(quote.css('span.text::text').extract_first())


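# module-level function, so multiprocessing can pickle it by name
def f(q, spider):
    try:
        runner = crawler.CrawlerRunner()
        deferred = runner.crawl(spider)
        # stop the reactor once the crawl finishes or fails
        deferred.addBoth(lambda _: reactor.stop())
        reactor.run()  # blocks until reactor.stop() is called
        q.put(None)  # signal success to the parent process
    except Exception as e:
        q.put(e)  # send the exception back to the parent process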


# the wrapper to make it run more times
def run_spider(spider):
    q = Queue()
    p = Process(target=f, args=(q, spider))
    p.start()
    result = q.get()
    p.join()

    if result is not None:
        raise result


if __name__ == "__main__":
    print('first run:')
    run_spider(QuotesSpider)

    print('\nsecond run:')
    run_spider(QuotesSpider)

Output:

first run:
“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
“A day without sunshine is like, you know, night.”
“Anyone who thinks sitting in church can make you a Christian must also think that sitting in a garage can make you a car.”
“Beauty is in the eye of the beholder and it may be necessary from time to time to give a stupid or misinformed beholder a black eye.”
“All you need is love. But a little chocolate now and then doesn't hurt.”
“Remember, we're madly in love, so it's all right to kiss me anytime you feel like it.”
“Some people never go crazy. What truly horrible lives they must lead.”
“The trouble with having an open mind, of course, is that people will insist on coming along and trying to put things in it.”
“Think left and think right and think low and think high. Oh, the thinks you can think up if only you try!”
“The reason I talk to myself is because I’m the only one whose answers I accept.”

second run:
“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
“A day without sunshine is like, you know, night.”
“Anyone who thinks sitting in church can make you a Christian must also think that sitting in a garage can make you a car.”
“Beauty is in the eye of the beholder and it may be necessary from time to time to give a stupid or misinformed beholder a black eye.”
“All you need is love. But a little chocolate now and then doesn't hurt.”
“Remember, we're madly in love, so it's all right to kiss me anytime you feel like it.”
“Some people never go crazy. What truly horrible lives they must lead.”
“The trouble with having an open mind, of course, is that people will insist on coming along and trying to put things in it.”
“Think left and think right and think low and think high. Oh, the thinks you can think up if only you try!”
“The reason I talk to myself is because I’m the only one whose answers I accept.”
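
As for why moving f out of run_spider matters: as far as I understand it, pickle stores a function as a reference to its importable name, and a function defined inside another function has no such name. Here is a minimal sketch without Scrapy that shows the difference. The names module_level, make_local and can_pickle are made up for illustration and are not part of the code above.

```python
import pickle


def module_level(x):
    """Defined at module top level, so pickle can refer to it by name."""
    return x + 1


def make_local():
    def local(x):
        """Defined inside another function; it has no importable name."""
        return x + 1
    return local


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # The exact exception type varies between Python versions.
        return False


if __name__ == "__main__":
    print("module-level function picklable:", can_pickle(module_level))
    print("nested function picklable:", can_pickle(make_local()))
```

When multiprocessing has to pickle the Process target, the nested version fails in the same way as the f inside run_spider in the question.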
SuperUser
  • Thank you for your answer. Unfortunately I get this error after: AttributeError: Can't get attribute 'f' on – colla Jul 28 '21 at 11:58
  • @colla Did you copy the exact code? Are you using Jupyter? – SuperUser Jul 28 '21 at 13:50
  • No, I am using Spyder, and yes, I copied the exact code – colla Jul 28 '21 at 15:07
  • @colla I tried to run it in Spyder, and although I don't get an error, it doesn't work as expected. I'm not sure what the problem is, but it does work for me in a different IDE. – SuperUser Jul 28 '21 at 16:37
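
One thing that might help narrow this down, although nobody in the thread confirmed it as the cause: whether the Process target gets pickled at all depends on the multiprocessing start method. With "fork" the child inherits the parent's memory, while "spawn" and "forkserver" send the target to the child via pickle, so the child must be able to import f. The helper name target_must_be_picklable below is hypothetical; it is only a quick check you could run in each environment.

```python
import multiprocessing as mp


def target_must_be_picklable():
    """Return (start_method, needs_pickling) for this interpreter."""
    method = mp.get_start_method()
    # With "fork" the child inherits the parent's memory, so the target
    # is not pickled. "spawn" and "forkserver" send it via pickle.
    return method, method != "fork"


if __name__ == "__main__":
    print(target_must_be_picklable())
```

If the two environments report different start methods, that would be consistent with the code behaving differently in each, but treat it as a hint and not a diagnosis.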