I have a scraper that works correctly, and I can easily export its output to a CSV file, but the columns always come out in a weird order.

I checked that the fields in items.py were in the right order and tried rearranging the fields in the spider, but I can't figure out why they are yielded in this order.

import scrapy
from scrapy.spiders import CrawlSpider
from scrapy import Selector
from scrapy.loader import ItemLoader
from scrapy.spiders import Rule
from scrapy.linkextractors import LinkExtractor
from sofifa_scraper.items import Player


class FifaInfoScraper(scrapy.Spider):
    name = "player2_scraper"
    start_urls = ["https://www.futhead.com/19/players/?level=all_nif&bin_platform=ps"]


    def parse(self,response):
        for href in response.css("li.list-group-item > div.content > a::attr(href)"):
            yield response.follow(href, callback = self.parse_name)



    def parse_name(self,response):
        item = Player()

        # Get the player name
        item['name'] = response.css("div[itemprop = 'child'] > span[itemprop = 'title']::text").get()

        # Club, league, and nation all share the same selector, so pull them all at once
        club_league_nation = response.css("div.col-xs-5 > a::text").getall()

        # Split the selected values into three separate fields
        item['club'], item['league'], item['nation'] = club_league_nation
        yield item

I'd like the scraper to return the player name in the first column; I'm not too concerned with the order after that. The player name always ends up in another column, though, and this happens even when I'm only pulling the name and one other value.

B. Adams

1 Answer

Just add FEED_EXPORT_FIELDS to your settings.py (see the Scrapy feed exports documentation):

FEED_EXPORT_FIELDS = ["name", "club", "league", "nation"]
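
To see what a fixed field list does to column order, here is a minimal standard-library sketch. It is not Scrapy's exporter code, only an analogy: `EXPORT_FIELDS`, `rows_to_csv`, and the sample values are hypothetical names and data made up for illustration.

```python
import csv
import io

# Desired column order, mirroring the FEED_EXPORT_FIELDS list above.
EXPORT_FIELDS = ["name", "club", "league", "nation"]


def rows_to_csv(rows, fields=EXPORT_FIELDS):
    """Serialize dict rows to CSV text with columns in the order of `fields`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical item whose keys are deliberately out of order.
sample = [
    {
        "club": "Example FC",
        "nation": "Exampleland",
        "name": "A. Player",
        "league": "Example League",
    }
]

csv_text = rows_to_csv(sample)
print(csv_text)
```

The columns follow the explicit field list rather than the order in which the keys were set. In Scrapy itself, the same list can also be set for a single spider through its `custom_settings` class attribute instead of the project-wide settings.py.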
gangabass
  • You're always on here beating me to them. Why don't you have a look at this one that I haven't been able to resolve? https://stackoverflow.com/questions/56816617/issue-using-scrapy-spider-output-in-python-script/56819174?noredirect=1#comment100559190_56819174 – ThePyGuy Jul 13 '19 at 01:51