
I have been looking for a way to get the return values of a function that runs at a scheduled time. Here is a sample function.

import pandas as pd

def get_data(dir):
    df = pd.DataFrame()
    file_name = "nofile"
    # The rest of the body of the code executes and updates the two variables df and file_name
    return df, file_name

What I want to do is schedule this function at 6:00 AM. If _file_name_ is still "nofile" after "get_data()" runs at 6:00 AM, the function should be run again at 6:00 PM. Something like this:

def sched():
    # Schedule at 6 AM
    # Here I cannot get the file_name as return value of get_data()
    if file_name == "nofile":
        # Schedule at 6 PM
        file_name = "nofile"

I have no idea how to do this with APScheduler. Please help.

alan
  • It would be helpful if you could provide some of the code you have written for APScheduler. [This](https://stackoverflow.com/questions/29223222/how-do-i-schedule-an-interval-job-with-apscheduler) is a good example of how someone can ask for help with their question. You can reference their code and try to get it executing. – Enthus3d Sep 19 '19 at 14:56
  • What do you mean when you say "# Here I cannot get the file_name as return value of get_data()"? Why can't you call `df, file_name = get_data(...)` here? – SyntaxVoid Sep 19 '19 at 15:07

2 Answers


The best solution I can think of is using a global variable. Schedule your function to always run at 18:00 (6 PM) and use the value of the global variable to check if it should do anything.
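
A minimal sketch of that idea, assuming APScheduler 3.x. The names `last_file_name`, `morning_job`, `evening_job` and `build_scheduler` are made up for illustration, the `get_data` here is only a stand-in for the real one, and the `fetch` parameter exists just so the jobs can be tried without a real download:

```python
# Module-level state shared by both jobs (the "global variable").
last_file_name = "nofile"


def get_data(directory):
    """Stand-in for the get_data() from the question; returns (df, file_name)."""
    return None, "nofile"


def morning_job(directory="D:/random_data", fetch=get_data):
    """06:00 run: always fetch and remember the resulting file name."""
    global last_file_name
    _, last_file_name = fetch(directory)
    return last_file_name


def evening_job(directory="D:/random_data", fetch=get_data):
    """18:00 run: fetch again only if the morning run found nothing."""
    global last_file_name
    if last_file_name != "nofile":
        return False  # the morning run already found a file; do nothing
    _, last_file_name = fetch(directory)
    return True


def build_scheduler():
    """Register both jobs; APScheduler is imported lazily so the jobs above work without it."""
    from apscheduler.schedulers.blocking import BlockingScheduler

    sched = BlockingScheduler(timezone="Europe/Paris")
    sched.add_job(morning_job, "cron", hour=6, minute=0)
    sched.add_job(evening_job, "cron", hour=18, minute=0)
    return sched
```

Calling `build_scheduler().start()` would then block and run both jobs every day. Because the morning job overwrites `last_file_name` on each run, the state resets itself daily.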

Alex Grönholm

After much thinking, and keeping Alex's suggestion in mind, I sorted this out. Here is the solution to my problem. The function I wanted to schedule looks something like this:

def get_data(dir):
    df = pd.DataFrame()
    file_name = "nofile"
    # The rest of the body of the code executes and updates the two variables df and file_name
    return df, file_name

The solution is:

import os
import time

from apscheduler.schedulers.background import BackgroundScheduler

if __name__ == "__main__":
    sched = BackgroundScheduler()

    @sched.scheduled_job('cron', hour=6, minute=0, second=0, timezone='Europe/Paris')
    def run_db1():
        download_directory = "D:/random_data"
        # makedirs returns None, so its result must not be assigned back to the path
        os.makedirs(download_directory, exist_ok=True)

        df, flnom = get_data(download_directory)
        print("1st check")
        if flnom == 'nofile':
            # No file yet: register the 6 PM re-check
            @sched.scheduled_job('cron', hour=18, minute=0, second=0, timezone='Europe/Paris')
            def run_db2():
                download_directory = "D:/random_data"
                os.makedirs(download_directory, exist_ok=True)
                df, flnom = get_data(download_directory)
                print("2nd check")

                return df, flnom

        return df, flnom

    sched.start()

    # BackgroundScheduler runs in a daemon thread, so keep the main thread alive
    try:
        while True:
            time.sleep(60)
    except (KeyboardInterrupt, SystemExit):
        sched.shutdown()

My original problem was to download data from an FTP server if a new file has been uploaded to the server TODAY. If the file exists, file_name will be updated; otherwise it will remain "nofile" and will be checked again later the same day. Thanks to all who contributed.
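
One caveat with the nested `@sched.scheduled_job('cron', hour=18, ...)` above: a cron job repeats every day, and a new one is registered each morning that no file is found. A possible variation, sketched here assuming APScheduler 3.x, is to add a one-off `'date'` job for 18:00 of the same day. The names `morning_check` and `evening_check`, the job id `evening_retry`, and the `fetch` and `today` parameters are illustrative only, and `get_data` is a stand-in:

```python
from datetime import datetime, time


def get_data(directory):
    """Stand-in for the get_data() from the post; returns (df, file_name)."""
    return None, "nofile"


def evening_check(directory, fetch=get_data):
    """One-off 18:00 re-check."""
    df, file_name = fetch(directory)
    print("2nd check")
    return df, file_name


def morning_check(sched, directory, fetch=get_data, today=None):
    """06:00 check; schedules a single retry at 18:00 today if no file was found."""
    df, file_name = fetch(directory)
    print("1st check")
    if file_name == "nofile":
        if today is None:
            today = datetime.now(sched.timezone).date()
        sched.add_job(
            evening_check,
            "date",  # fires once, unlike 'cron'
            # naive datetime: interpreted in the scheduler's own timezone
            run_date=datetime.combine(today, time(18, 0)),
            args=[directory],
            id="evening_retry",
            replace_existing=True,  # never keep more than one pending retry
        )
    return df, file_name
```

With the default in-memory job store, the morning job could be registered as `sched.add_job(morning_check, 'cron', hour=6, minute=0, args=[sched, "D:/random_data"])` on a scheduler created with `timezone='Europe/Paris'`.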