import scrapy
from scrapy.crawler import CrawlerRunner

class Livescores2(scrapy.Spider):

    name = 'Home'

    def start_requests(self):
        yield scrapy.Request('https://www.livescores.com/football/turkey/super-lig/?tz=3&table=league-home')

    def parse(self, response):
        for total in response.css('td'):
            yield {
                'total': total.css('::text').get()
            }

runner2 = CrawlerRunner()
runner2.crawl(Livescores2)

When I adjust the settings as below, I can save the data as JSON without a problem.

runner2 = CrawlerRunner(settings={
    "FEEDS": {
        "Home.json": {"format": "json", "overwrite": True},
    },
})

I want to assign the scraped data to a variable so I can work on it. I don't want any JSON output!

I tried:

import scrapy
from scrapy.crawler import CrawlerRunner

class Livescores2(scrapy.Spider):

    name = 'Home'

    def start_requests(self):
        yield scrapy.Request('https://www.livescores.com/football/turkey/super-lig/?tz=3&table=league-home')

    def parse(self, response):
        for total in response.css('td'):
            yield {
                'total': total.css('::text').get()
            }

runner2 = CrawlerRunner()
a = runner2.crawl(Livescores2)

print(a)

The result is: <Deferred at 0x65cbfb6d0>

How can I access the data from a variable? I am developing an Android app, so I don't need a JSON file. I don't know how to use return in this code.

Thanks very much

  • See [this](https://stackoverflow.com/questions/70564312/python-scrapy-how-to-pass-the-response-to-the-main-function-from-the-spider/70566579#70566579). – SuperUser Sep 12 '22 at 16:55
  • I reviewed it. I have a code [link](https://pastecode.io/s/mnrwwui8). It has a loop and 3 spiders, so I could not fix the problem with that solution. I am a newbie in Python, so should I use BeautifulSoup or MechanicalSoup for this? Scrapy always exports a JSON file. Is there a basic method for this? Too complex, I think. –  Sep 12 '22 at 18:59

1 Answer


You can simply create a class attribute that stores the data, and then access it once the spider has finished processing all of the requests. This isn't really the workflow that the Scrapy framework targets, though, and there are likely other web-scraping tools that could handle this more intuitively.

for example:

import scrapy
from scrapy.crawler import CrawlerRunner

class Livescores2(scrapy.Spider):
    name = 'Home'
    data = []   # data attribute

    def start_requests(self):
        yield scrapy.Request('https://www.livescores.com/football/turkey/super-lig/?tz=3&table=league-home')

    def parse(self, response):
        for total in response.css('td'):
            item = {'total': total.css('::text').get()}
            self.data.append(item)  # append item to data list
            yield item

runner2 = CrawlerRunner()
a = runner2.crawl(Livescores2)

print(Livescores2.data)  # print the collected data
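
Note that runner2.crawl() only schedules the crawl and returns a Deferred; Livescores2.data is only filled once the Twisted reactor has actually run the crawl to completion. A minimal sketch of that, following Scrapy's "run from a script" pattern with the same spider and runner names as above (replace the last three lines of the example with something like this):

from twisted.internet import reactor

runner2 = CrawlerRunner()
d = runner2.crawl(Livescores2)       # schedules the crawl and returns a Deferred
d.addBoth(lambda _: reactor.stop())  # stop the reactor once the crawl finishes (or fails)
reactor.run()                        # blocks here until reactor.stop() is called

print(Livescores2.data)              # the list has been populated by this point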
Alexander
  • Thanks very much. One more question: when I run this code, no results appear in the terminal until I press CTRL+C; otherwise it stays on hold like this: C:\Users\Messi\Desktop\Bot>c:/Users/Messi/Desktop/Bot/Scripts/python.exe "c:/Users/Messi/Desktop/Bot/whiskyscraper/whiskyscraper/spiders/Home copy.py" When I press CTRL+C the whole list comes to the screen. What is the reason for this? –  Sep 13 '22 at 12:15
  • @MecraYavçın That isn't the case when I run it this way... – Alexander Sep 13 '22 at 16:40
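
Regarding the "stays on hold until CTRL+C" behaviour in the comment above: if the Twisted reactor is started but never stopped, the process keeps running after the crawl has finished. One common workaround, assuming the same spider class as in the answer, is CrawlerProcess, which starts and stops the reactor itself and blocks only until the spider has closed:

from scrapy.crawler import CrawlerProcess

process = CrawlerProcess()
process.crawl(Livescores2)
process.start()              # blocks until the crawl finishes, then shuts the reactor down

print(Livescores2.data)      # printed as soon as the spider closes, no CTRL+C needed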