I have a Python web-scraping program that needs to be restarted automatically whenever it terminates, so that it scrapes continuously. My approach is as follows.

crontab -e (settings)

* * * * * /home/ahmed/Desktop/run.sh

run.sh

    TMP_FILE=/tmp/i_am_running
    [ -f $TMP_FILE ] && exit
    touch $TMP_FILE
    /usr/bin/python /home/ahmed/Desktop/python.py
    rm $TMP_FILE

Either the bash script has a problem or my crontab entry is wrong, because the program is not running. Please advise.


After Mark's suggestions, I modified the script like this:

#!/bin/bash
PATH=$PATH:/bin:/usr/bin

date +'%H:%M:%S Started' >> /home/ahmed/Desktop/log.txt

TMP_FILE=/tmp/i_am_running
[ -f $TMP_FILE ] && exit
touch $TMP_FILE

date +'%H:%M:%S Starting Python' >> /home/ahmed/Desktop/log.txt
/usr/bin/python /home/ahmed/Desktop/python.py
rm $TMP_FILE

date +'%H:%M:%S Ended' >> /home/ahmed/Desktop/log.txt

The cron entry I am using is * * * * * /home/ahmed/Desktop/run.sh

The resulting log file is this:

15:21:01 Started
15:21:02 Starting Python
15:22:02 Started
15:23:01 Started
15:24:01 Started
15:24:30 Ended
15:25:01 Started
15:25:01 Starting Python
15:26:01 Started
15:27:18 Started
15:28:01 Started
15:29:01 Started
15:30:01 Started
15:31:01 Started
15:31:16 Ended
15:32:01 Started
15:32:01 Starting Python
15:33:01 Started
15:34:01 Started

It seems like the program is restarted before it has ended. The log file should read: starting program, started, ended, starting program, started, ended, and so on.

Can someone guide me please?

user3265370

1 Answer

Have you made your script executable?

chmod +x /home/ahmed/Desktop/run.sh

Put a proper shebang and PATH in your script so it starts like this:

 #!/bin/bash
 PATH=$PATH:/bin:/usr/bin

Try your script on its own from the command line

/home/ahmed/Desktop/run.sh

If that doesn't work, change the shebang line to add -xv at the end

#!/bin/bash -xv 

Check to see if /tmp/i_am_running exists

Check your cron log

grep CRON /var/log/syslog
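
If grep only reports that a binary file matches instead of printing the lines, the log probably contains some non-text bytes. A minimal sketch of a workaround, assuming a grep that supports the `-a` (`--text`) option, as GNU grep does; `cron_lines` is a hypothetical helper name, and the default path is simply the one used above:

```shell
#!/bin/bash
# Hypothetical helper: print the cron entries from a log file,
# even when the file contains bytes that make grep treat it as binary.
cron_lines() {
    # -a tells grep to process the file as text, so matching lines are
    # printed instead of a "Binary file ... matches" notice.
    grep -a CRON "${1:-/var/log/syslog}"
}
```

Calling `cron_lines` with no argument searches /var/log/syslog; pass another path if your cron is configured to log elsewhere.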

Consider changing your script so you can see when it started and/or if it actually ran your python:

#!/bin/bash
PATH=$PATH:/bin:/usr/bin

date +'%H:%M:%S Started' >> /home/ahmed/Desktop/log.txt

TMP_FILE=/tmp/i_am_running
[ -f $TMP_FILE ] && exit
touch $TMP_FILE

date +'%H:%M:%S Starting Python' >> /home/ahmed/Desktop/log.txt
/usr/bin/python /home/ahmed/Desktop/python.py
rm $TMP_FILE

date +'%H:%M:%S Ended' >> /home/ahmed/Desktop/log.txt
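
One caveat with the script above: the `[ -f ... ]` test and the `touch` are two separate steps, so two overlapping runs could in principle both pass the test. A possible alternative, sketched here under the assumption that a lock directory is acceptable, relies on `mkdir` either creating the directory or failing in a single step. `LOCK_DIR` and `run_exclusive` are hypothetical names, not part of the script above:

```shell
#!/bin/bash
# Sketch only: run a command unless another instance already holds the lock.
# LOCK_DIR and run_exclusive are illustrative names.
LOCK_DIR=${LOCK_DIR:-/tmp/i_am_running.lock}

run_exclusive() {
    # mkdir succeeds for exactly one caller; everyone else gets an error,
    # so the check and the lock creation cannot be interleaved.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "already running"
        return 1
    fi
    # Run the job, then release the lock whether the job succeeded or failed.
    "$@"
    local rc=$?
    rmdir "$LOCK_DIR"
    return "$rc"
}

# Example call, using the paths from the question:
# run_exclusive /usr/bin/python /home/ahmed/Desktop/python.py
```

Note that if the shell itself is killed while the job runs, the lock directory is left behind and has to be removed by hand, just like the temp file in the original script.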

By the way, I am not sure how running once at 18:01 constitutes "continuous scraping"?

Mark Setchell
  • I will change the timing to * * * * *. The timing was just for testing purposes. Let me follow the steps you mentioned. Thanks – user3265370 Mar 24 '14 at 11:08
  • I changed the code as you suggested. When I type grep CRON /var/log/syslog on the command line, it says "binary file matches" – user3265370 Mar 24 '14 at 11:17
  • "cd /var/log" and see if there is a cron.log or other logfile that your cron has been configured to use. – Mark Setchell Mar 24 '14 at 11:21
  • I think cron is working too, because the Python program is supposed to save data in the database, and it is saving. To be honest, I have no idea how the bash script is running and accomplishing the task. Can you help me find out how to view the log files? I want to learn the method, please – user3265370 Mar 24 '14 at 11:24
  • Get the script running correctly first by checking the `log.txt` on your Desktop and seeing if the time is in there. Once the script works, then introduce it to cron. – Mark Setchell Mar 24 '14 at 11:29
  • It's fetching data!! I am excited. My concern is: what if the Python program crashes? And can I do this on the server so that the program runs continuously? – user3265370 Mar 24 '14 at 11:33
  • Have a look at the accepted answer for this question: http://stackoverflow.com/questions/696839/how-do-i-write-a-bash-script-to-restart-a-process-if-it-dies – Mark Setchell Mar 24 '14 at 11:42
  • It seems like the program is restarted before it has ended. The log file should read: starting program, started, ended, starting program, started, ended, and so on. – user3265370 Mar 24 '14 at 12:26
  • It looks correct. Sometimes cron starts the job before the last one has finished, so it says "Started" but then it doesn't say "Starting Python" because Python is already running and the script exits before it says "Starting Python". – Mark Setchell Mar 24 '14 at 13:01
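
On the crash concern raised in the comments, a restart loop is one common pattern. The following is only a rough sketch, not the code from the linked answer; `restart_until_ok` and the one-second pause are illustrative assumptions:

```shell
#!/bin/bash
# Sketch only: rerun a command until it exits with status 0.
# restart_until_ok is a hypothetical name; the 1-second pause is arbitrary.
restart_until_ok() {
    until "$@"; do
        # At this point $? holds the exit status of the failed run.
        echo "'$1' exited with status $?; restarting" >&2
        sleep 1
    done
}

# Example call, using the paths from the question:
# restart_until_ok /usr/bin/python /home/ahmed/Desktop/python.py
```

The loop stops once the command exits cleanly, so a scraper that should also be restarted after a normal exit would need a different loop condition or the cron approach shown in the answer.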