I am writing a backup script for an SQLite database that changes only intermittently. Here's what I have now:
from bz2 import BZ2File
from datetime import datetime
from os.path import dirname, abspath, join
from hashlib import sha512
def backup_target_database(target):
    backup_dir = dirname(abspath(target))
    hash_file = join(backup_dir, 'last_hash')
    # Hash the current database contents and compare with the hash from the last run.
    new_hash = sha512(open(target, 'rb').read()).digest()
    if new_hash != open(hash_file, 'rb').read():
        # Contents changed: write a timestamped, bzip2-compressed snapshot.
        fmt = '%Y%m%d-%H%M.sqlite3.bz2'
        snapshot_file = join(backup_dir, datetime.now().strftime(fmt))
        BZ2File(snapshot_file, 'wb').write(open(target, 'rb').read())
        open(hash_file, 'wb').write(new_hash)
Currently the database weighs just shy of 20 MB, so it's not that taxing when the script runs and reads the whole file into memory (and does so twice when a change is detected), but I don't want to wait until this becomes a problem.
What is the proper way to do this sort of (to use Bash terminology) stream piping?
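My rough idea, and this is only a sketch assuming Python 3, is to feed the hash in fixed-size chunks via hashlib's incremental update() and to let shutil.copyfileobj stream the copy into the BZ2File, so only one chunk is ever in memory; the CHUNK_SIZE value below is an arbitrary pick, and I've also guarded the first run where last_hash doesn't exist yet:

from bz2 import BZ2File
from datetime import datetime
from hashlib import sha512
from os.path import abspath, dirname, join
from shutil import copyfileobj

CHUNK_SIZE = 64 * 1024  # read 64 KiB at a time instead of the whole file

def file_hash(path):
    # Feed the hash chunk by chunk so only CHUNK_SIZE bytes are in memory.
    digest = sha512()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b''):
            digest.update(chunk)
    return digest.digest()

def backup_target_database(target):
    backup_dir = dirname(abspath(target))
    hash_file = join(backup_dir, 'last_hash')
    new_hash = file_hash(target)
    try:
        with open(hash_file, 'rb') as f:
            old_hash = f.read()
    except FileNotFoundError:
        old_hash = None  # first run: no previous hash, so always snapshot
    if new_hash != old_hash:
        fmt = '%Y%m%d-%H%M.sqlite3.bz2'
        snapshot_file = join(backup_dir, datetime.now().strftime(fmt))
        # copyfileobj streams the data through the compressor in chunks.
        with open(target, 'rb') as src, BZ2File(snapshot_file, 'wb') as dst:
            copyfileobj(src, dst, CHUNK_SIZE)
        with open(hash_file, 'wb') as f:
            f.write(new_hash)

Is something along these lines the idiomatic approach, or is there a more direct equivalent of a shell pipeline?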