I am using the Scrapinghub API and deploying my project with shub. However, the fields in the items result do not come out in the order I need.
I need them in the following order: Title, Publish Date, Description, Link. How can I get the output in exactly that order for every item class?
Below is a short sample of my spider:
import scrapy
from scrapy.spiders import XMLFeedSpider
from tickers.items import tickersItem


class Spider(XMLFeedSpider):
    name = "Scraper"
    allowed_domains = ["yahoo.com"]
    start_urls = ('https://feeds.finance.yahoo.com/rss/2.0/headline?s=ABIO,ACFN,AEMD,AEZS,AITB,AJX,AU,AKERMN,AUPH,AVL,AXPW
                  'https://feeds.finance.yahoo.com/rss/2.0/headline?s=DRIO
                  'https://feeds.finance.yahoo.com/rss/2.0/headline?s=IDXG,IMMU,IMRN,IMUC,INNV,INVT,IPCI,INPX,JAGX,KDMN,KTOV,LQMT
                  )
    itertag = 'item'

    def parse_node(self, response, node):
        item = {}
        item['Title'] = node.xpath('title/text()').extract_first()
        item['Description'] = node.xpath('description/text()').extract_first()
        item['Link'] = node.xpath('link/text()').extract_first()
        item['PublishDate'] = node.xpath('pubDate/text()').extract_first()
        return item
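To make the target order concrete, here is a minimal sketch in plain Python, independent of Scrapy. The helper `reorder_fields`, the `FIELD_ORDER` list, and the sample values are hypothetical and only illustrate the ordering asked for above. Whether the Scrapinghub items view keeps this order is an assumption that has not been verified.

```python
from collections import OrderedDict

# Desired output order, taken from the question text (hypothetical constant).
FIELD_ORDER = ["Title", "PublishDate", "Description", "Link"]


def reorder_fields(item, order=FIELD_ORDER):
    """Return a new OrderedDict whose keys follow `order`.

    Keys missing from `item` are filled with None.
    """
    return OrderedDict((key, item.get(key)) for key in order)


# Fields assigned in the same order as in parse_node (sample values only).
scraped = {}
scraped["Title"] = "Example headline"
scraped["Description"] = "Example description"
scraped["Link"] = "https://example.com/story"
scraped["PublishDate"] = "Mon, 01 Jan 2018 00:00:00 GMT"

ordered = reorder_fields(scraped)
print(list(ordered.keys()))
```

In the spider, the same idea would amount to returning `reorder_fields(item)` instead of `item` from `parse_node`.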
Additionally, here is my items.py file. Its fields are in the same order as in my spider, so I have no idea why the output is not in that order.
items.py:
import scrapy


class tickersItem(scrapy.Item):
    Title = scrapy.Field()
    Description = scrapy.Field()
    Link = scrapy.Field()
    PublishDate = scrapy.Field()
As far as I can tell, the field order is consistent between the items file and the spider, and I have no idea how to fix this. I am new to Python.
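For reference, Scrapy documents a `FEED_EXPORT_FIELDS` setting that fixes which fields are written to feed exports (for example CSV) and in what order. A minimal sketch of how it might look in the project's `settings.py` follows. That this setting also affects the order shown in the Scrapinghub items view is an assumption, not something confirmed here.

```python
# settings.py (sketch): field names copied from the item definition above.
# FEED_EXPORT_FIELDS controls the field order of Scrapy feed exports;
# its effect on the Scrapinghub items view is an unverified assumption.
FEED_EXPORT_FIELDS = ["Title", "PublishDate", "Description", "Link"]
```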