
I have a list of websites from which I need to extract certain values in order to keep a local .txt file up to date. Since the websites need to be checked at different intervals, I would prefer not to use Windows Task Scheduler. Instead, I would like a single script that runs continuously in the background, extracts the information from each website at its own specified frequency (so the frequency for each website would be an input parameter), and keeps the file updated.

I know how to extract the information from the websites, but I don't know how to schedule the checks in an automated fashion and have the script run continuously in the background. Knowing how to stop it would be useful too. (I have Anaconda Python installed on Windows 7.)

What is an efficient way of coding that? Thanks.

PS clarification: The script just needs to run as a background job once started and harvest some text from a number of predefined urls. So my questions are: a) How do I set it to run as a background job? A while loop? Something else? b) How do I make it return to a url to harvest the text at pre-specified intervals?

Hooloovoo
  • To have it "hidden" in the background, save it as a .pyw file, i.e. one that runs without a console. For the other question, you need to describe in more detail what you are actually doing. – NegativeFeedbackLoop Jun 15 '15 at 08:03
  • Hi, I don't need to have it hidden. It just needs to run as a background job once started and harvest some text from a number of predefined urls. So my questions are: a) How do I set it to run as a background job? A while loop? Something else? b) How do I make it return to a url to harvest the text at pre-specified intervals? – Hooloovoo Jun 15 '15 at 09:01
  • A while loop that sleeps for a number of seconds between checks would suffice. – Prof. Falken Jun 15 '15 at 09:11
  • Ok thanks, is there a more efficient way of doing it? Something like a linux cronjob but run from within the script itself? – Hooloovoo Jun 15 '15 at 09:28
  • You should implement your code as a Windows service. Run your code on a second worker thread. See http://stackoverflow.com/a/32440/291641 for an example. This way you can manage the background process using the Windows service control applet, and it will run without needing any user to log in. – patthoyts Jun 15 '15 at 09:34
  • Windows has a system task scheduler (like cron): http://stackoverflow.com/questions/7195503/setting-up-a-cron-job-in-windows – lucasg Jun 15 '15 at 09:39
  • I want to avoid using Task Scheduler. The reason is that each webpage updates at different times (I know the exact times) and I only want to update the information at those specific times. – Hooloovoo Jun 15 '15 at 11:02
  • Task Scheduler is easily configured to trigger at specific times. Are there other reasons you are avoiding it? – rhashimoto Jun 15 '15 at 19:02
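The while-loop-plus-sleep idea from the comments can be sketched with the standard library alone. This is only a rough illustration, not a tested solution for the asker's setup: the URLs, the intervals, and the `harvest` and `run` names are hypothetical placeholders, and `harvest` stands in for the extraction code the asker already has. The loop tracks when each URL is next due, sleeps until the earliest one, and exits cleanly on Ctrl+C.

```python
import time

# Hypothetical inputs: URL -> polling interval in seconds.
INTERVALS = {
    "http://example.com/a": 60,
    "http://example.com/b": 300,
}


def harvest(url):
    """Placeholder for the existing extraction code."""
    print("harvesting", url)


def run(intervals, job, max_runs=None, clock=time.monotonic, sleep=time.sleep):
    """Call job(url) for each URL at its own interval until interrupted.

    max_runs, clock and sleep exist only to make the loop easy to test;
    leave them at their defaults for normal use.
    """
    # Every URL is due immediately at start-up.
    start = clock()
    next_due = {url: start for url in intervals}
    runs = 0
    try:
        while max_runs is None or runs < max_runs:
            # Pick the URL whose next check is soonest and wait for it.
            url = min(next_due, key=next_due.get)
            delay = next_due[url] - clock()
            if delay > 0:
                sleep(delay)
            job(url)
            # Schedule this URL's next check one interval later.
            next_due[url] += intervals[url]
            runs += 1
    except KeyboardInterrupt:
        # Ctrl+C in the console stops the loop.
        print("stopped")
```

Calling `run(INTERVALS, harvest)` from a console keeps the script running until it is interrupted with Ctrl+C. Sleeping until the next due time, rather than polling in a tight loop, keeps CPU use negligible.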

1 Answer


Given that it doesn't need to be a hidden process and that the Windows Task Scheduler is unsuitable (as you need to pick different recurrences), it sounds like you just want a simple Python process that will call your extraction function on an irregular but predetermined basis.

This sounds like a job for APScheduler (https://pypi.python.org/pypi/APScheduler/). I've used it a lot on Linux, where it has worked like a charm for cron-like features. The package docs say it is cross-platform, so it might fit the bill.
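As a rough sketch of what that might look like, assuming the APScheduler 3.x API (`BlockingScheduler` and `add_job` with a `"cron"` trigger): the URLs, the times in `SCHEDULE`, and the `harvest`, `build_scheduler` and `main` names are made up for illustration, and `harvest` stands in for the existing extraction code.

```python
# Hypothetical schedule: URL -> cron-style trigger fields (times are made up).
SCHEDULE = {
    "http://example.com/a": {"hour": 9, "minute": 30},      # daily at 09:30
    "http://example.com/b": {"hour": "8-18", "minute": 0},  # hourly, 08:00-18:00
}


def harvest(url):
    """Placeholder for the existing extraction code."""
    print("harvesting", url)


def build_scheduler(schedule, job):
    """Register one cron-style job per URL and return the scheduler."""
    # Third-party import kept local so the rest of the module loads
    # even where APScheduler is not installed.
    from apscheduler.schedulers.blocking import BlockingScheduler

    scheduler = BlockingScheduler()
    for url, when in schedule.items():
        # Extra keyword arguments are passed on to the cron trigger.
        scheduler.add_job(job, "cron", args=[url], **when)
    return scheduler


def main():
    scheduler = build_scheduler(SCHEDULE, harvest)
    try:
        scheduler.start()  # blocks until the process is interrupted
    except (KeyboardInterrupt, SystemExit):
        pass  # Ctrl+C in the console stops the script
```

Running `main()` from a console keeps the process alive and fires each job at its configured times. For fixed frequencies rather than clock times, the `"interval"` trigger (for example `scheduler.add_job(job, "interval", minutes=5, args=[url])`) is the analogous option.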

Peter Brittain