
I have a fairly light script that I want to run periodically in the background every 5 hours or so. The script runs through a few different websites, scans them for new material, and either grabs .mp3 files from them or likes songs on YouTube based on their content. There are a few things I want to achieve with this program that I am unsure how to do:

  • Have the program run every 5 hours -- I'm not too familiar with system-level timing operations.
  • Have the program efficiently run in the background -- I want these 'updates' to occur without the user knowing.
  • Have the program activate on startup -- I know how I would set this up as a user, but I'm not sure how to add such a configuration to the Python file, if that's even possible. Keep in mind that this is going to be a simple .py script -- I'm not compiling it into an executable.

The program is designed mainly with OS X and other Unix-based systems in mind. Any advice on achieving some of these goals?

Oerd
user1427661

2 Answers


If your script doesn't need to be constantly in execution, and it sounds like it doesn't, I'd suggest you set up a cron job.

On a typical Linux box, you can edit your crontab file via:

$ crontab -e

This will open your crontab in your standard editor and install the crontab file for you once you're done (i.e. when you save it).

A typical crontab command looks like:

# m h  dom mon dow   command
 15  0   *   *   *    /bin/bash /home/yourself/bin/dump_my_tables.sh

This line executes dump_my_tables.sh every day at 00:15. Your script will need a line like the following:

0  */5  *  *  *   /usr/bin/python /home/yourself/bin/scrape_the_web.py
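To make the layout of such a line concrete, here is a minimal sketch that splits a crontab entry into its five time fields and the command. The names `CronEntry` and `parse_cron_line` are hypothetical helpers made up for illustration; they are not part of cron or of any library.

```python
from typing import NamedTuple


class CronEntry(NamedTuple):
    """The six parts of a crontab line, in the order cron reads them."""
    minute: str
    hour: str
    day_of_month: str
    month: str
    day_of_week: str
    command: str


def parse_cron_line(line: str) -> CronEntry:
    """Split a crontab line into five time fields plus the command.

    Illustrative only: it does not validate the field values and does
    not handle comments, blank lines, or environment assignments.
    """
    # Split on runs of whitespace at most five times, so the command
    # (which may itself contain spaces) stays in one piece.
    parts = line.strip().split(None, 5)
    if len(parts) != 6:
        raise ValueError("expected 5 time fields and a command: %r" % line)
    return CronEntry(*parts)
```

For the line above, `parse_cron_line(...)` would give minute `0`, hour `*/5`, and a command consisting of the interpreter path followed by the script path.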

nb:

  • Times are in the machine's local time (!)
  • Some cron versions don't accept the */5 syntax; in that case you have to list the hours explicitly, i.e. 0,5,10,15,20. Either way the runs fall at 00:00, 05:00, 10:00, 15:00 and 20:00, so the gap from 20:00 to midnight is only four hours.
  • You might want to redirect the output, but that's outside this answer's scope
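If your cron rejects the step syntax, the explicit hour list can be generated rather than typed by hand. This is only a sketch: `expand_hour_step` and `cron_line` are hypothetical helper names, and the script path is the placeholder used earlier in this answer.

```python
def expand_hour_step(step: int) -> str:
    """Return the explicit hour list equivalent to '*/step' in the hour field."""
    if not 1 <= step <= 23:
        raise ValueError("step must be between 1 and 23")
    # A step in the hour field counts from hour 0 up to hour 23.
    return ",".join(str(hour) for hour in range(0, 24, step))


def cron_line(step: int, command: str) -> str:
    """Build a crontab line that runs `command` at minute 0 of each listed hour."""
    return "0 {} * * * {}".format(expand_hour_step(step), command)


# Placeholder paths, as in the example above.
print(cron_line(5, "/usr/bin/python /home/yourself/bin/scrape_the_web.py"))
```

The printed line can then be pasted into the editor opened by crontab -e.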
Oerd

Have the program run every 5 hours -- I'm not too familiar with system-level timing operations.

On *nix systems, cron is the default solution for this.

Have the program efficiently run in the background -- I want these 'updates' to occur without the user knowing.

With cron, the program runs in the background on your server, and the user shouldn't be adversely affected by it. One caveat: if the user loads a page listing the mp3s you have scraped and then hits refresh while your script is still running and saving data to the database, the new mp3s might show up. I don't know whether that fits what you had in mind by "without the user knowing".

Have the program activate on startup -- I know how I would set this up as a user, but I'm not sure how to add such a configuration to the python file, if that's even possible. Keep in mind that this is going to be a simple .py script -- I'm not compiling it into an executable.

I'm pretty sure cron entries persist across reboots (though I'm not 100% certain); just make sure the cron daemon is started at boot.
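As for adding the configuration from the Python file itself, one possible approach is to have the script install its own crontab entry on first run. The sketch below is an assumption-laden illustration, not a tested recipe: `with_entry` and `install_entry` are hypothetical names, and it assumes your crontab command accepts `-` to read the new table from standard input, which common implementations do but which is worth checking on your system.

```python
import subprocess


def with_entry(crontab_text: str, entry: str) -> str:
    """Return crontab_text with `entry` appended, unless it is already present."""
    entry = entry.strip()
    lines = crontab_text.splitlines()
    if entry in (line.strip() for line in lines):
        return crontab_text  # already installed, nothing to change
    lines.append(entry)
    return "\n".join(lines) + "\n"


def install_entry(entry: str) -> None:
    """Add `entry` to the current user's crontab if it is missing.

    Assumes `crontab -l` prints the current table (and exits non-zero
    when there is none) and that `crontab -` installs from stdin.
    """
    current = subprocess.run(["crontab", "-l"], capture_output=True, text=True)
    existing = current.stdout if current.returncode == 0 else ""
    updated = with_entry(existing, entry)
    if updated != existing:
        subprocess.run(["crontab", "-"], input=updated, text=True, check=True)
```

Calling `install_entry("0 */5 * * * /usr/bin/python /home/yourself/bin/scrape_the_web.py")` at the top of the script would then be harmless on later runs, since the entry is only added once.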

dm03514
  • Cron entries *do* persist at reboot and cron is also "auto-started". Our answers are complementary since I focused just on the cron part :) +1 – Oerd Jan 30 '13 at 22:39