I'm trying to write a script that runs many spiders, but I'm getting ImportError: No module named project_name.settings

My script looks like this:

import os
os.system("scrapy crawl spider1")
os.system("scrapy crawl spider2")
....
os.system("scrapy crawl spiderN")

My settings.py:

# -*- coding: utf-8 -*-

# Scrapy settings for project_name
#
# For simplicity, this file contains only the most important settings by
# default. All the other settings are documented here:
#
#     http://doc.scrapy.org/en/latest/topics/settings.html
#

BOT_NAME = 'project_name'

ITEM_PIPELINES = {
    'project_name.pipelines.project_namePipelineToJSON': 300,
    'project_name.pipelines.project_namePipelineToDB': 800
}

SPIDER_MODULES = ['project_name.spiders']
NEWSPIDER_MODULE = 'project_name.spiders'

# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'project_name (+http://www.yourdomain.com)'

My spiders look like any normal spider, quite simple ones actually:

import scrapy
from scrapy.crawler import CrawlerProcess
from Projectname.items import ProjectnameItem

class ProjectnameSpiderClass(scrapy.Spider):
    name = "Projectname"
    allowed_domains = ["Projectname.com"]

    start_urls = ["...urls..."]


    def parse(self, response):
        item = ProjectnameItem()

I gave them generic names, but you get the idea. Is there a way to solve this error?

Computer's Guy
  • Please see this relevant [post](http://stackoverflow.com/a/27744766/771848) providing a detailed explanation on how to run Scrapy from script. – alecxe Jun 28 '15 at 19:12
  • @alecxe Okay, I'm using that template from the gist. Is there a way to run multiple spiders from that script? – Computer's Guy Jun 28 '15 at 19:53

1 Answer

Edit 2018:

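You need to run the spider from the project folder. That is, os.system("scrapy crawl spider1") has to be executed with the working directory set to the Scrapy project that contains spider1 (the folder where scrapy.cfg lives), not from wherever the script happens to be.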
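
One way to make the script independent of where it is launched from is to pass the project folder explicitly as the working directory of each command. This is only a minimal sketch: the function name crawl_all, its run parameter, and the example path are made up for illustration, and subprocess.run is swapped in for os.system so that a working directory can be given per call.

```python
import subprocess
from pathlib import Path


def crawl_all(spider_names, project_dir, run=subprocess.run):
    """Run `scrapy crawl <name>` once per spider, from inside project_dir.

    crawl_all and its `run` parameter are hypothetical names for this sketch.
    Returns a dict mapping each spider name to the command's return code.
    """
    project_dir = Path(project_dir)
    # Fail early with a clear message instead of an ImportError from Scrapy.
    if not (project_dir / "scrapy.cfg").is_file():
        raise FileNotFoundError(f"no scrapy.cfg found in {project_dir}")

    return_codes = {}
    for name in spider_names:
        # cwd= starts the command inside the project folder, so it no longer
        # matters which directory this script itself was started from.
        result = run(["scrapy", "crawl", name], cwd=project_dir)
        return_codes[name] = result.returncode
    return return_codes


# Example call (the path is a placeholder, not taken from the question):
# crawl_all(["spider1", "spider2"], "/path/to/project_name")
```

The spiders still run one after another, exactly as with the os.system calls in the question; only the working directory changes.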

Or you can do as I did in the past and put all the code in a single file. (This is my old answer. I no longer recommend it, but it is still a useful and decent solution.)

Well, in case someone comes across this question: I finally used a heavily modified version of this gist, https://gist.github.com/alecxe/fc1527d6d9492b59c610, provided by alecxe in another question. Hope this helps.
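
For reference, running several spiders from one Python file can be sketched with Scrapy's CrawlerProcess. This is not the modified gist mentioned above, just an illustration under a few assumptions: run_spiders and its process parameter are hypothetical names, and the spider names are placeholders.

```python
def run_spiders(spider_names, process=None):
    """Schedule every named spider on one process, then wait for all of them.

    run_spiders and the `process` parameter are hypothetical names for this
    sketch; pass a ready-made process to override the default.
    """
    if process is None:
        # Imported here so this file can be loaded without Scrapy installed.
        from scrapy.crawler import CrawlerProcess
        from scrapy.utils.project import get_project_settings

        # get_project_settings() has to locate the project's settings, so the
        # script should still be started from inside the project folder.
        process = CrawlerProcess(get_project_settings())

    for name in spider_names:
        # Spider names are looked up in the project's SPIDER_MODULES.
        process.crawl(name)

    # Blocks until every scheduled crawl has finished.
    process.start()
    return process


# Example call (spider names are placeholders):
# run_spiders(["spider1", "spider2"])
```

Unlike the os.system version, the spiders scheduled this way run concurrently inside a single process.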

Computer's Guy
  • Thanks for sharing! Could you please make a gist of your modified version? I think the sample I've provided needs an improvement. – alecxe Jul 22 '15 at 00:20