
I have scrapy and scrapyd installed on a Debian machine. I log in to this server over an SSH tunnel and then start scrapyd by running: scrapyd

Scrapyd starts up fine. I then open another SSH tunnel to the server and schedule my spider with: curl localhost:6800/schedule.json -d project=myproject -d spider=myspider

The spider runs nicely and everything is fine.

The problem is that scrapyd stops running when I quit the session in which I started it. This prevents me from using cron to schedule spiders with scrapyd, since scrapyd isn't running when the cron job is launched.

My simple question is: how do I keep scrapyd running so that it doesn't shut down when I quit the SSH session?

user1009453

3 Answers


Run it in a screen session:

$ screen
$ scrapyd

# hit ctrl-a, then d to detach from that screen

$ screen -r # to re-attach to your scrapyd process
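If you keep several long-running jobs in screen, naming the session makes it easier to find again later (plain screen usage, nothing scrapyd-specific):

$ screen -S scrapyd   # start a session named "scrapyd"
$ screen -r scrapyd   # re-attach to that named session later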
user2926055
  • scrapyd is designed to run as a daemon. Screen is not an appropriate solution. – Frederic Bazin Apr 01 '14 at 12:20
  • I agree with @FredericBazin. If scrapyd shuts down or crashes it will not restart. This "hack" is OK for the short term, but I would not use this method in a production environment. – Sam Texas Nov 12 '15 at 00:15

You might consider launching scrapyd with supervisor.

There is a ready-made supervisord .conf for scrapyd available here: https://github.com/JallyHe/scrapyd/blob/master/supervisord.conf
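For reference, a minimal program section looks roughly like the sketch below. The scrapyd path, working directory, user and log file are assumptions for illustration, not copied from that file:

; /etc/supervisor/conf.d/scrapyd.conf -- minimal sketch, adjust paths and user
[program:scrapyd]
command=/usr/local/bin/scrapyd
; working directory where scrapyd keeps its eggs, logs and dbs
directory=/var/lib/scrapyd
user=scrapy
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/scrapyd.log

After saving the file, run sudo supervisorctl reread and then sudo supervisorctl update; supervisord will start scrapyd, keep it running after you log out, and restart it if it crashes.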

Sam Texas

How about:

$ sudo service scrapyd start
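That works when scrapyd was installed from a distro package that ships an init script. With a plain pip install there is no such script, but on a systemd-based Debian you can get the same effect with a small unit file. This is only a sketch; the binary path and the scrapy user are assumptions:

# /etc/systemd/system/scrapyd.service -- sketch, adjust ExecStart path and User
[Unit]
Description=Scrapyd
After=network.target

[Service]
User=scrapy
ExecStart=/usr/local/bin/scrapyd
Restart=on-failure

[Install]
WantedBy=multi-user.target

Then enable and start it:

$ sudo systemctl enable scrapyd
$ sudo systemctl start scrapyd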

Frederic Bazin