I expect this is really simple but I can't work this out.
I am trying to write to a log file in real time the output from a DD
imaging subprocess - I'm using DD v 8.25
from which you can get regular progress updates using the 'status=progress'
option which writes to stderr
.
I can get it to log the full output real time by passing the file object to the stderr
i.e
log_file = open('mylog.log', 'a')
p = subprocess.Popen['dd command...'], stdout=None, stderr=log_file)
...but I would prefer to intercept the string from stderr
first so I can parse it before writing to file.
I have tried threading but I can't seem to get it to write, or if it does, it only does it at the end of the process and not during.
I'm a python noob so example code would be appreciated. Thanks!
UPDATE - NOW WORKING (ISH)
I had a look at the link J.F. Sebastian suggested and found posts about using threads, so after that I used the "kill -USR1" trick to get DD to post progress to stderr which I could then pick up:
#! /usr/bin/env python
from subprocess import PIPE, Popen
from threading import Thread
from queue import Queue, Empty
import time
q = Queue()
def parsestring(mystr):
newstring = mystr[0:mystr.find('bytes')]
return newstring
def enqueue(out, q):
for line in proc1.stderr:
q.put(line)
out.close()
def getstatus():
while proc1.poll() == None:
proc2 = Popen(["kill -USR1 $(pgrep ^dd)"], bufsize=1, shell=True)
time.sleep(2)
with open("log_file.log", mode="a") as log_fh:
start_time = time.time()
#start the imaging
proc1 = Popen(["dd if=/dev/sda1 of=image.dd bs=524288 count=3000"], bufsize=1, stderr=PIPE, shell=True)
#define and start the queue function thread
t = Thread(target=enqueue, args=(proc1.stderr, q))
t.daemon = True
t.start()
#define and start the getstatus function thread
t_getstatus = Thread(target=getstatus, args=())
t_getstatus.daemon
t_getstatus.start()
#get the string from the queue
while proc1.poll() == None:
try: nline = q.get_nowait()
except Empty:
continue
else:
mystr = nline.decode('utf-8')
if mystr.find('bytes') > 0:
log_fh.write(str(time.time()) + ' - ' + parsestring(mystr))
log_fh.flush()
#put in a delay
#time.sleep(2)
#print duration
end_time=time.time()
duration=end_time-start_time
print('Took ' + str(duration) + ' seconds')
The only issue is I can't work out how to improve performance. I only need it to report status every 2 seconds or so but increasing the time delay increases the time of the imaging, which I don't want. That's a question for another post though...
Thanks to both J.F. Sebastian and Ali.