I have a Scrapy crawler on an Elastic Beanstalk app that I can run over SSH like this:

```
source /opt/python/run/venv/bin/activate
source /opt/python/current/env
cd /opt/python/current/app
scrapy crawl spidername
```
I want to set up a cron job to run this for me, so I followed the suggestions here.
My `setup.config` file looks like this:

```
container_commands:
  01_cron_hemnet:
    command: "cat .ebextensions/spider_cron.txt > /etc/cron.d/crawl_spidername && chmod 644 /etc/cron.d/crawl_spidername"
    leader_only: true
```
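To sanity-check the copy-and-chmod step in isolation, the same command can be simulated locally with stand-in paths (`/tmp/cron.d` and `/tmp/spider_cron.txt` below are placeholders, not the actual Elastic Beanstalk locations):

```shell
# Stand-in paths: /tmp/cron.d substitutes for /etc/cron.d, and
# /tmp/spider_cron.txt for .ebextensions/spider_cron.txt.
mkdir -p /tmp/cron.d
printf '* * * * * root sh /opt/python/current/app/runcrawler.sh\n' > /tmp/spider_cron.txt

# The same copy + chmod as in the container command.
cat /tmp/spider_cron.txt > /tmp/cron.d/crawl_spidername && chmod 644 /tmp/cron.d/crawl_spidername

# cron ignores /etc/cron.d files that are group- or world-writable,
# so 644 is the permission we expect to see here.
stat -c '%a' /tmp/cron.d/crawl_spidername
```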
My `spider_cron.txt` file looks like this:

```
# The newline at the end of this file is extremely important. Cron won't run without it.
* * * * * root sh /opt/python/current/app/runcrawler.sh &>/tmp/mycommand.log
# There is a newline here.
```
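Unlike a per-user crontab, a file in `/etc/cron.d` uses the system-crontab format: five schedule fields, then a user field, then the command. A quick sketch (the entry above copied into a shell variable) to confirm the sixth token is the intended user:

```shell
# The cron.d entry from above, copied into a variable for inspection.
line='* * * * * root sh /opt/python/current/app/runcrawler.sh &>/tmp/mycommand.log'

# In system-crontab format the sixth whitespace-separated field is the
# user the job runs as; here it should be "root".
echo "$line" | awk '{print $6}'
```

One detail worth noting: `&>` is bash-specific redirection, and cron runs commands with `/bin/sh` unless `SHELL` is set in the cron file.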
My `runcrawler.sh` file is located at `/opt/python/current/app/runcrawler.sh` and looks like this:

```
#!/bin/bash
cd /opt/python/current/app/
PATH=$PATH:/usr/local/bin
export PATH
scrapy crawl spidername
```
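The `PATH` lines are there because cron starts jobs with a minimal environment rather than a login shell. A rough simulation (`env -i` clears the environment, which only approximates how cron launches jobs) shows what the script's `PATH` line produces from a cold start:

```shell
# env -i clears the environment, roughly mimicking cron; the script's
# "PATH=$PATH:/usr/local/bin" then appends /usr/local/bin to whatever
# (possibly empty) default PATH the shell starts with.
env -i /bin/sh -c 'PATH=$PATH:/usr/local/bin; export PATH; echo "$PATH"'
```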
I can navigate to `/etc/cron.d/` and see that `crawl_spidername` exists there. But when I run `crontab -l` or `crontab -u root -l`, it says that no crontab exists.
I get no log errors and no deployment errors, and the `/tmp/mycommand.log` file that I try to redirect the cron output to is never created. It seems the cron job never starts.

Ideas?