
I've got a Python 3 script I use to back up and encrypt mysqldump files, and I'm having a particular issue with one database that is 67 GB after encryption and compression. mysqldump is exiting with error code 3, so I'd like to catch the actual error message, as this could mean a couple of things. The strange thing is that the backup file is the right size, so I'm not sure what the error means. It worked once on this database...

The code looks like the below, and I'd really appreciate some help on how to add non-blocking capture of stderr when the return code is anything but 0, for both p1 and p2.

Also, if I'm doing anything glaringly obvious wrong, please do let me know, as I'd like to make sure this is a reliable process. It has been working fine on my databases under 15 GB compressed.

import configparser
import datetime
import os
import queue
import subprocess
import syslog
import threading


def dbbackup():
    while True:
        item = q.get()
        #build up folder structure, daily, weekly, monthly & project
        genfile = config[item]['DBName'] + '-' + dateyymmdd + '-'
        genfile += config[item]['PubKey'] + '.sql.gpg'
        if os.path.isfile(genfile):
            syslog.syslog(item + ' ' + genfile + ' exists, removing')
            os.remove(genfile)
        syslog.syslog(item + ' will be backed up as ' + genfile)
        args = ['mysqldump', '-u', config[item]['UserNm'],
                '-p' + config[item]['Passwd'], '-P', config[item]['Portnu'],
                '-h', config[item]['Server']]
        args.extend(config[item]['MyParm'].split())
        args.append(config[item]['DBName'])
        p1 = subprocess.Popen(args, stdout=subprocess.PIPE)
        p2 = subprocess.Popen(['gpg', '-o', genfile, '-r',
                               config[item]['PubKey'], '-z', '9', '--encrypt'], stdin=p1.stdout)
        p2.wait()
        if p2.returncode == 0:
            syslog.syslog(item + ' encryption successful')
        else:
            syslog.syslog(syslog.LOG_CRIT, item + ' encryption failed '+str(p2.returncode))
            p1.terminate()
        p1.wait()
        if p1.returncode == 0:
            pass  # does some uploads of the file etc.
        else:
            syslog.syslog(syslog.LOG_CRIT, item + ' extract failed '+str(p1.returncode))
        q.task_done()


def main():
    db2backup = []
    for settingtest in config:
        db2backup.append(settingtest)
    if len(db2backup) >= 1:
        syslog.syslog('Backups started')
        for database in db2backup:
            q.put(database)
            syslog.syslog(database + ' added to backup queue')
        q.join()
        syslog.syslog('Backups finished')


q = queue.Queue()
config = configparser.ConfigParser()
config.read('backup.cfg')
backuptype = 'daily'
dateyymmdd = datetime.datetime.now().strftime('%Y%m%d')


for i in range(2):
    t = threading.Thread(target=dbbackup)
    t.daemon = True
    t.start()

if __name__ == '__main__':
    main()
  • Why are you waiting on p1 after p2? Also, p1.stdout should be closed. I also don't see you redirecting stderr anywhere. – Padraic Cunningham May 11 '15 at 09:03
  • The p1.wait is there to ensure reading p1.returncode doesn't block or raise an error. I've not added stderr yet as I couldn't get it right, so I was hoping someone could advise. – Alan May 11 '15 at 09:21
  • *The p1.stdout.close() call after starting p2 is important in order for p1 to receive a SIGPIPE if p2 exits before p1.* (See the sketch after these comments.) – Padraic Cunningham May 11 '15 at 09:22
  • [http://stackoverflow.com/questions/17889465/python-subprocess-and-mysqldump] – I'm taking the method used in the top answer; the wait should be doing a close, but I'm interested to hear how this relates to that line you copied from the Python documentation. – Alan May 11 '15 at 10:11
  • related: [Non-blocking read on a subprocess.PIPE in python](http://stackoverflow.com/q/375427/4279) – jfs May 16 '15 at 19:07
  • see also [How do I use subprocess.Popen to connect multiple processes by pipes?](http://stackoverflow.com/q/295459/4279) – jfs May 16 '15 at 19:12
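For illustration, a minimal sketch of the p1.stdout.close() placement discussed in the comments above, using the question's variable names (this only wires up the SIGPIPE behaviour; it does not yet capture stderr):

p1 = subprocess.Popen(args, stdout=subprocess.PIPE)
p2 = subprocess.Popen(['gpg', '-o', genfile, '-r', config[item]['PubKey'],
                       '-z', '9', '--encrypt'], stdin=p1.stdout)
p1.stdout.close()  # let p1 receive SIGPIPE if p2 exits before p1
p2.wait()
p1.wait()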

1 Answer


Simplify your code:

  • avoid unnecessary globals, pass parameters to the corresponding functions instead
  • avoid reimplementing a thread pool (it hurts readability and it misses convenience features accumulated over the years).

The simplest way to capture stderr is to use stderr=PIPE and .communicate() (blocking call):

#!/usr/bin/env python3
from configparser import ConfigParser
from datetime import datetime
from multiprocessing.dummy import Pool
from subprocess import Popen, PIPE

def backup_db(item, conf): # config[item] == conf
    """Run `mysqldump ... | gpg ...` command."""
    genfile = '{conf[DBName]}-{now:%Y%m%d}-{conf[PubKey]}.sql.gpg'.format(
                conf=conf, now=datetime.now())
    # ...
    args = ['mysqldump', '-u', conf['UserNm'], ...]
    with Popen(['gpg', ...], stdin=PIPE) as gpg, \
         Popen(args, stdout=gpg.stdin, stderr=PIPE) as db_dump:
        gpg.communicate() 
        error = db_dump.communicate()[1]
    if gpg.returncode or db_dump.returncode:
        raise RuntimeError(error)  # surface the captured stderr to the caller

def main():
    config = ConfigParser()
    with open('backup.cfg') as file: # raise exception if config is unavailable
        config.read_file(file)
    with Pool(2) as pool:
        pool.starmap(backup_db, config.items())

if __name__ == "__main__":
    main()

NOTE: no need to call db_dump.terminate() if gpg dies prematurely: mysqldump dies when it tries to write something to the closed gpg.stdin.
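This behaviour is easy to demonstrate in isolation. A minimal sketch, assuming a POSIX system where the yes and head utilities are available (not part of the answer's code):

from subprocess import Popen, PIPE

reader = Popen(['head', '-c', '1'], stdin=PIPE)  # exits after reading one byte
writer = Popen(['yes'], stdout=reader.stdin)     # writes to the pipe forever
reader.communicate()  # close the parent's copy of the pipe, wait for head
writer.wait()
print(writer.returncode)  # -13 on Linux: yes was killed by SIGPIPE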

If there is a huge number of items in the config then you could use pool.imap() instead of pool.starmap() (the call has to be modified slightly, as sketched below).
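One way the modified call could look, assuming a small single-argument wrapper (pool.imap() passes one item at a time, so the (item, conf) pair has to be unpacked inside the function; the name backup_db_pair is made up for illustration):

def backup_db_pair(pair):
    item, conf = pair  # unpack the (section name, section proxy) pair
    return backup_db(item, conf)

with Pool(2) as pool:
    for _ in pool.imap(backup_db_pair, config.items()):
        pass  # consume the iterator so every backup actually runs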

For robustness, wrap the backup_db() function to catch and log all exceptions; a sketch follows.
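A minimal sketch of such a wrapper, assuming syslog is the desired log target (the name safe_backup_db is made up for illustration):

import syslog

def safe_backup_db(item, conf):
    try:
        backup_db(item, conf)
    except Exception as e:  # catch everything so one failure doesn't kill the pool
        syslog.syslog(syslog.LOG_CRIT, '{} backup failed: {}'.format(item, e))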

– jfs