
I have multiple spiders in my project, so I decided to run them by uploading the project to a Scrapyd server. I deployed my project successfully, and I can see all the spiders when I run the command

curl http://localhost:6800/listspiders.json?project=myproject
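
This returns a JSON response listing the deployed spiders, something like the following (the spider names here are illustrative):

{"status": "ok", "spiders": ["spider1", "spider2", "spider3"]}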

When I run the following command,

curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider2

only one spider runs, because only one spider name is given. But I want to run multiple spiders here, so is the following command right for running multiple spiders in Scrapyd?

curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider1,spider2,spider3........

Later, I will run this command via a cron job; that is, I will schedule it to run frequently.

Shiva Krishna Bavandla

1 Answer


If you want to run multiple spiders using Scrapyd, schedule them one by one. Scrapyd will run them in the order they were scheduled, but not at the same time.
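
For example, scheduling each of the spiders from the question in turn (spider names are illustrative):

curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider1
curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider2
curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider3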

See also: Scrapy's Scrapyd too slow with scheduling spiders

warvariuc
  • Yes, I mean to run all the spiders with one command, but not all concurrently. After deploying a project with multiple spiders, how can I schedule them using Scrapyd? Is the above command useful? – Shiva Krishna Bavandla Jul 09 '12 at 08:37
  • Your command is invalid. http://doc.scrapy.org/en/latest/topics/scrapyd.html#schedule-json says that the `spider` argument should contain a single spider name, but you provided a list of spider names delimited by commas. Instead of doing `http://localhost:6800/schedule.json -d project=myproject -d spider=spider1,spider2`, do `http://localhost:6800/schedule.json -d project=myproject -d spider=spider1`, then `http://localhost:6800/schedule.json -d project=myproject -d spider=spider2`, and so on. – warvariuc Jul 09 '12 at 09:25
  • If we do so, I expect this will be the same as the "scrapy crawl spider_name" command; then why did we upload this to the Scrapyd server? And suppose I want to run all of these through cron jobs, I need to write all the commands on more than one line, right? – Shiva Krishna Bavandla Jul 09 '12 at 10:36
  • Actually, when Scrapyd runs a spider, it uses almost the [same command](http://doc.scrapy.org/en/latest/topics/scrapyd.html#how-scrapyd-works) as "scrapy crawl spider_name". – warvariuc Jul 09 '12 at 12:38
  • Oh, thanks warwaruk, but how can we run the spiders one by one then? Right now I am trying to run all the spiders (for example, 4 spiders) through cron jobs. Is there any way to run all the spiders one by one and schedule them to run every 2 or more hours? – Shiva Krishna Bavandla Jul 09 '12 at 12:50
  • Make a bash script which issues the commands for scheduling all your spiders, and put that script in cron (a sketch is shown below). Alternatively, it can be a Python script which does the same without calling curl. An example is [here](http://stackoverflow.com/questions/10801093/run-multiple-scrapy-spiders-at-once-using-scrapyd). – warvariuc Jul 09 '12 at 13:00
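
A minimal sketch of such a cron-friendly bash script, assuming illustrative spider names and Scrapyd listening on localhost:6800:

#!/bin/bash
# schedule_spiders.sh: schedule each spider on the Scrapyd server,
# one schedule.json request per spider (spider names are hypothetical).
for spider in spider1 spider2 spider3 spider4
do
    curl http://localhost:6800/schedule.json -d project=myproject -d spider="$spider"
done

To run it every 2 hours, a crontab entry along these lines would work (the script path is hypothetical):

0 */2 * * * /path/to/schedule_spiders.sh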