
Basically, I run a shop where once a day we update an Excel file that pulls in external data, and then we send the updated file via email to a group of people. We do this for quite a lot of reports, so I want to write a script that does it automatically.

The external data arrives once a day. Sometimes it comes at 1 in the morning, sometimes a bit later, but mostly during the night or early morning, so by the time I get to work the external dataset should be updated.

My question is:

Loading the Excel file and sending it via email seems pretty straightforward, but I don't want the email to be sent if, say, the external dataset has not been updated.

How do I compare yesterday's dataset with today's dataset without saving yesterday's dataset as a separate file on my computer, as this would build up to quite a lot of files?

eyllanesc
  • you can save the previous day's file to your computer, but then later delete it. what OS are you on? the cross-platform `schedule` module might be usable to get your Python script scheduled, though the way it would be triggered is OS-dependent. – mechanical_meat Dec 27 '21 at 15:13
  • you could MD5-checksum the file every day and check if the new file has a different sum next morning => then send https://stackoverflow.com/questions/3431825/generating-an-md5-checksum-of-a-file – seldomspeechless Dec 27 '21 at 15:14
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jan 04 '22 at 13:12

1 Answer


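Based on the linked question "Generating an MD5 checksum of a file" (https://stackoverflow.com/questions/3431825/generating-an-md5-checksum-of-a-file), you only need to keep yesterday's checksum, not yesterday's file.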

Code could look something like this:

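import hashlib
from os.path import exists

CHECKSUM_FILE = "lastsum"  # stores the checksum from the previous run


def get_oldsum():
    """Return the checksum saved by the previous run, or "" if there is none."""
    if exists(CHECKSUM_FILE):
        with open(CHECKSUM_FILE, "r") as file:
            # strip() guards against a trailing newline in the stored value
            return file.readline().strip()
    return ""


def save_newsum(newsum):
    """Overwrite the stored checksum with the current one."""
    with open(CHECKSUM_FILE, "w") as file:
        file.write(newsum)


def make_new_hash(path):
    """Return the MD5 hex digest of the file, read in 8 KiB chunks."""
    file_hash = hashlib.md5()
    with open(path, "rb") as file:
        # The walrus operator (:=) requires Python 3.8 or later
        while chunk := file.read(8192):
            file_hash.update(chunk)
    return file_hash.hexdigest()


if __name__ == "__main__":
    last_checksum = get_oldsum()
    new_hash = make_new_hash("shared_reports.xlsx")
    print("This MD5 Hash:", new_hash)
    if new_hash != last_checksum:
        print("New Hash! We should trigger our mail-function asap...")
        # Only the checksum is kept between runs, never yesterday's file
        save_newsum(new_hash)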
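
One caveat with hashing the file itself: an .xlsx file is a zip archive that also stores metadata such as the last-modified timestamp, so its bytes (and therefore the MD5) can change whenever the workbook is re-saved, even if the data did not. A minimal sketch of hashing the cell values instead is below. It assumes the sheet has already been read into rows of values (for example with openpyxl or pandas, not shown here). `hash_rows` and the sample rows are hypothetical and only for illustration.

```python
import hashlib


def hash_rows(rows):
    """Return an MD5 hex digest of the cell values only (hypothetical helper).

    `rows` is assumed to be an iterable of rows, each an iterable of cell values.
    """
    digest = hashlib.md5()
    for row in rows:
        for cell in row:
            # repr() keeps 1 and "1" distinct; the separator marks cell boundaries
            digest.update(repr(cell).encode("utf-8"))
            digest.update(b"\x1f")
        digest.update(b"\x1e")  # marks the end of a row
    return digest.hexdigest()


# Made-up sample data standing in for yesterday's and today's sheets
yesterday = [["date", "sales"], ["2021-12-26", 100]]
today_same = [["date", "sales"], ["2021-12-26", 100]]
today_new = [["date", "sales"], ["2021-12-27", 120]]

print(hash_rows(yesterday) == hash_rows(today_same))  # True
print(hash_rows(yesterday) == hash_rows(today_new))   # False
```

The resulting digest could be written to the `lastsum` file and compared in the same way as the file hash above.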