You can replace everything you're doing with Python code, except for the external utility. That way your program stays portable as long as the external utility is portable. You could also consider turning the C++ program into a library and using Cython to interface with it.

As Messa showed, `date` is replaced with `time.strftime`, globbing is done with `glob.glob`, and `cat` can be replaced by reading all the files in the list and writing them to the standard input of your program. The call to `bzip2` can be replaced with the `bz2` module, but that complicates your program because you have to read and write simultaneously. To do that, use either `p.communicate`, or a thread if the data is huge (`select.select` would be a better choice, but on Windows it only works on sockets, not pipes).
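
    import bz2
    import glob
    import time
    import threading
    import subprocess

    output_filename = '../whatever.bz2'
    # %F is a platform-specific shortcut for %Y-%m-%d, so spell it out for portability.
    # sorted() mimics the shell, which expands globs in sorted order; glob.glob does not.
    input_filenames = sorted(glob.glob(time.strftime("xyz_%Y-%m-%d_*.log")))

    # 'filter' and 'args' stand in for the external utility and its arguments.
    p = subprocess.Popen(['filter', 'args'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    output = open(output_filename, 'wb')
    output_compressor = bz2.BZ2Compressor()

    def data_reader():
        # Feed every input file to the filter's stdin, then close it to signal EOF.
        for filename in input_filenames:
            with open(filename, 'rb') as f:
                # The sentinel must be bytes (b'') because the file is opened in binary mode.
                p.stdin.writelines(iter(lambda: f.read(8192), b''))
        p.stdin.close()

    input_thread = threading.Thread(target=data_reader)
    input_thread.start()

    # Meanwhile, compress the filter's output as it arrives.
    with output:
        for chunk in iter(lambda: p.stdout.read(8192), b''):
            output.write(output_compressor.compress(chunk))
        output.write(output_compressor.flush())

    input_thread.join()
    p.wait()
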
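
For inputs that fit comfortably in memory, the `p.communicate` route mentioned above avoids the thread altogether. Here is a minimal sketch; the helper name `filter_and_compress` and passing the filter command as an argument list are my own assumptions, not something from the question:

```python
import bz2
import subprocess

def filter_and_compress(filter_cmd, input_filenames, output_filename):
    """Concatenate the inputs, pipe them through filter_cmd, and bzip2 the result.

    filter_cmd is an argument list such as ['filter', 'args'] (placeholder names).
    """
    chunks = []
    for filename in input_filenames:
        with open(filename, 'rb') as f:
            chunks.append(f.read())

    p = subprocess.Popen(filter_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() writes to stdin and drains stdout without deadlocking,
    # but it keeps the complete input and output in memory.
    filtered, _ = p.communicate(b''.join(chunks))
    if p.returncode != 0:
        raise subprocess.CalledProcessError(p.returncode, filter_cmd)

    with open(output_filename, 'wb') as out:
        out.write(bz2.compress(filtered))
```

Because `communicate` buffers everything, this only suits modest log sizes; the threaded version streams the data in 8 KB chunks instead.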
Addition: How to detect file input type
You can use either the file extension or the Python bindings for libmagic to detect how a file is compressed. Here's a code example that does both and automatically uses `magic` if it is available. Take the part that suits your needs and adapt it. `open_autodecompress` detects the MIME type and opens the file with the matching decompressor if one is available.
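
    import os
    import gzip
    import bz2

    try:
        import magic
    except ImportError:
        has_magic = False
    else:
        has_magic = True

    # Depending on the libmagic release, gzip shows up under either MIME type.
    mime_openers = {
        'application/x-bzip2': bz2.BZ2File,
        'application/x-gzip': gzip.GzipFile,
        'application/gzip': gzip.GzipFile,
    }

    ext_openers = {
        '.bz2': bz2.BZ2File,
        '.gz': gzip.GzipFile,
    }

    # Binary mode by default, so plain and compressed files both yield bytes.
    def open_autodecompress(filename, mode='rb'):
        if has_magic:
            # This is the API of the bindings that ship with libmagic itself.
            ms = magic.open(magic.MAGIC_MIME_TYPE)
            ms.load()
            mimetype = ms.file(filename)
            opener = mime_openers.get(mimetype, open)
        else:
            # Fall back to guessing from the file extension.
            basepart, ext = os.path.splitext(filename)
            opener = ext_openers.get(ext, open)
        return opener(filename, mode)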
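
If libmagic isn't installed and the extensions can't be trusted, a third option (not part of the code above) is to look at the first few bytes yourself: gzip streams begin with the bytes 1f 8b and bzip2 streams with the ASCII characters BZh. A rough sketch, where `open_by_signature` and `SIGNATURE_OPENERS` are made-up names:

```python
import bz2
import gzip

# Leading bytes that identify each format, paired with the matching opener.
SIGNATURE_OPENERS = [
    (b'\x1f\x8b', gzip.GzipFile),
    (b'BZh', bz2.BZ2File),
]

def open_by_signature(filename):
    """Open filename for binary reading, decompressing if a known signature matches."""
    with open(filename, 'rb') as f:
        head = f.read(4)
    for signature, opener in SIGNATURE_OPENERS:
        if head.startswith(signature):
            return opener(filename, 'rb')
    # Unknown signature: treat the file as uncompressed.
    return open(filename, 'rb')
```

This only covers the two formats handled above, and a plain file that happens to start with one of these byte sequences would be misdetected, so treat it as a fallback rather than a replacement for libmagic.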