I have two spiders in my spider.py file, and I want to run them together and generate a CSV file. Below is the structure of my spider.py:
```python
import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from twisted.internet import defer, reactor

class tmallSpider(scrapy.Spider):
    name = 'tspider'
    ...

class jdSpider(scrapy.Spider):
    name = 'jspider'
    ...

configure_logging()
runner = CrawlerRunner()

@defer.inlineCallbacks
def crawl():
    yield runner.crawl(tmallSpider)
    yield runner.crawl(jdSpider)
    reactor.stop()

crawl()
reactor.run()
```
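Since the script above starts the crawls itself (so the `-o` flag of `scrapy crawl` never applies), one option I'm considering is passing the feed-export settings directly to `CrawlerRunner`. This is only a sketch, assuming Scrapy 2.1+ where the `FEEDS` setting is available; `FEED_EXPORT_FIELDS` pins the column order so the exporter doesn't derive the header from whichever item is scraped first:

```python
# Sketch of settings to pass as CrawlerRunner(settings).
# Assumption: Scrapy >= 2.1, which introduced the FEEDS setting.
settings = {
    "FEEDS": {
        "prices.csv": {"format": "csv"},
    },
    # Without this, the CSV exporter takes its columns from the first
    # item it sees, so the other spider's fields would be dropped.
    "FEED_EXPORT_FIELDS": [
        "product_name_tmall", "product_price_tmall",
        "product_name_jd", "product_price_jd",
    ],
}
```

With this, both spiders would append to the same `prices.csv`, leaving blank cells in the columns the other spider doesn't fill.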
Below is the structure of my items.py:
```python
import scrapy

class TmallspiderItem(scrapy.Item):
    # define the fields for your item here like:
    product_name_tmall = scrapy.Field()
    product_price_tmall = scrapy.Field()

class JdspiderItem(scrapy.Item):
    product_name_jd = scrapy.Field()
    product_price_jd = scrapy.Field()
```
I want to generate a CSV file with four columns:
product_name_tmall | product_price_tmall | product_name_jd | product_price_jd
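As a fallback, I could run the spiders separately and stitch the two outputs into this four-column layout afterwards. A minimal sketch, assuming each spider writes its own CSV (the filenames `tmall.csv` and `jd.csv` below are hypothetical) and that rows should be paired positionally:

```python
import csv
from itertools import zip_longest

def merge_csv(tmall_path, jd_path, out_path):
    """Merge two per-spider CSVs side by side into one four-column CSV."""
    with open(tmall_path, newline="") as f:
        tmall_rows = list(csv.DictReader(f))
    with open(jd_path, newline="") as f:
        jd_rows = list(csv.DictReader(f))
    fields = ["product_name_tmall", "product_price_tmall",
              "product_name_jd", "product_price_jd"]
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        # zip_longest pads the shorter file with blanks instead of
        # truncating rows from the longer one.
        for t, j in zip_longest(tmall_rows, jd_rows, fillvalue={}):
            writer.writerow({k: t.get(k, "") for k in fields[:2]}
                            | {k: j.get(k, "") for k in fields[2:]})
```

This sidesteps the multi-spider export problem entirely, at the cost of an extra post-processing step.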
I ran `scrapy crawl -o prices.csv` in PyCharm's terminal, but nothing was generated.
I scrolled up and found that only the jd items are printed in the terminal; I don't see any tmall items printed. However, if I add an `open_in_browser` call for the tmall spider, the browser DOES open. I guess the code was executed, but somehow the data is not recorded?
If I run `scrapy crawl tspider` and `scrapy crawl jspider` individually, everything works and the CSV file is generated.
Is this a problem with how I ran the program or is there a problem with my code? Any ideas how to fix it?