I'm parsing date/time/measurement info out of some text files that look similar to this:
[Sun Jul 15 09:05:56.724 2018] *000129.32347
[Sun Jul 15 09:05:57.722 2018] *000129.32352
[Sun Jul 15 09:05:58.721 2018] *000129.32342
[Sun Jul 15 09:05:59.719 2018] *000129.32338
[Sun Jul 15 09:06:00.733 2018] *000129.32338
[Sun Jul 15 09:06:01.732 2018] *000129.32352
The results go into an output file like this:
07-15-2018 09:05:56.724, 29.32347
07-15-2018 09:05:57.722, 29.32352
07-15-2018 09:05:58.721, 29.32342
07-15-2018 09:05:59.719, 29.32338
07-15-2018 09:06:00.733, 29.32338
07-15-2018 09:06:01.732, 29.32352
The code that I'm using looks like this:
import os
import datetime

with open('dq_barorun_20180715_calibtest.log', 'r') as fh, open('output.txt', 'w') as fh2:
    for line in fh:
        line = line.split()
        # line looks like: ['[Sun', 'Jul', '15', '09:05:56.724', '2018]', '*000129.32347']
        monthalpha = line[1]
        month = datetime.datetime.strptime(monthalpha, '%b').strftime('%m')
        day = line[2]
        time = line[3]
        yearbracket = line[4]
        year = yearbracket[0:4]          # strip the trailing ']'
        pressfull = line[5]
        press = pressfull[5:13]          # drop the '*0001' prefix
        timestamp = month + "-" + day + "-" + year + " " + time
        fh2.write(timestamp + ", " + press + "\n")
This code works fine and accomplishes what I need, but I'm trying to learn more efficient methods of parsing files in Python. It takes about 30 seconds to process a 100 MB file, and I have several files that are 1-2 GB in size. Is there a faster way to parse through these files?
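One idea I've been toying with (I'm not sure it's the right direction) is avoiding the strptime/strftime call on every line, since the month abbreviation only has twelve possible values and could just be looked up in a dict. A rough sketch of that, assuming the log format is exactly as shown above (the file names are just the ones from my script):

MONTHS = {'Jan': '01', 'Feb': '02', 'Mar': '03', 'Apr': '04',
          'May': '05', 'Jun': '06', 'Jul': '07', 'Aug': '08',
          'Sep': '09', 'Oct': '10', 'Nov': '11', 'Dec': '12'}

with open('dq_barorun_20180715_calibtest.log', 'r') as fh, open('output.txt', 'w') as fh2:
    for line in fh:
        parts = line.split()
        # parts looks like: ['[Sun', 'Jul', '15', '09:05:56.724', '2018]', '*000129.32347']
        month = MONTHS[parts[1]]      # plain dict lookup instead of strptime/strftime
        day = parts[2]
        time = parts[3]
        year = parts[4][0:4]          # strip the trailing ']'
        press = parts[5][5:13]        # drop the '*0001' prefix
        fh2.write(month + "-" + day + "-" + year + " " + time + ", " + press + "\n")

But I don't know whether the per-line strptime call is really the bottleneck, or whether there's a better overall approach (regex, pandas, reading/writing in bigger chunks, etc.).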