I'm trying to extract and modify a specific line of text in many files within a folder, but I'm having some trouble.
For instance, the first file might read:
    To: Bob
    From: Bill
    <Message> The eagle flies at midnight. <End Message>
The second file contains a different message in the same format, and so on. I'd like to extract the third line, pass 'The eagle flies at midnight.' through a function (such as base64 encoding), and then put the result back on the line between '<Message>' and '<End Message>', so that the final output reads:
    To: Bob
    From: Bill
    <Message> VGhlIGVhZ2xlIGZsaWVzIGF0IG1pZG5pZ2h0Lg== <End Message>
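To make the desired transformation concrete, here is a minimal sketch of the per-line step on its own, before any file handling. It assumes Python 3 (where `base64.b64encode` takes and returns bytes) and UTF-8 text; the names `MESSAGE_RE` and `encode_message_line` are made up for illustration.

```python
import base64
import re

# The tag text is taken from the sample file above; the payload is
# everything between the two tags that is not a '<' character.
MESSAGE_RE = re.compile(r"<Message>(?P<text>[^<]*)<End Message>")


def encode_message_line(line):
    """Return the line with its message payload base64-encoded.

    Lines that do not contain the tags are returned unchanged.
    """
    def _encode(match):
        payload = match.group("text").strip()
        # b64encode works on bytes in Python 3, so encode and decode around it.
        encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")
        return "<Message> {} <End Message>".format(encoded)

    return MESSAGE_RE.sub(_encode, line)


print(encode_message_line("<Message> The eagle flies at midnight. <End Message>"))
# prints: <Message> VGhlIGVhZ2xlIGZsaWVzIGF0IG1pZG5pZ2h0Lg== <End Message>
```

Because only the matched span is replaced, header lines such as `To: Bob` and any trailing newline pass through untouched.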
This is what I am trying (and adjusting) so far.
    import base64
    import os
    import io

    #ask user where his stuff is / is going
    directory = raw_input("INPUT Folder:")
    output = raw_input("OUTPUT Folder:")

    #get that stuff
    myfilepath = os.path.join(directory, '*.txt')
    with open('*.txt', 'r') as file:
        data = file.readlines()

    #Go to line 3 and take out non encoded text.
    data[3] = X
    X.strip("<Message>")
    X.strip("<End Message>")
    coded_string = X

    #Encode line 3
    base64.b64encode(coded_string)
    data[3] = '<Message> %s <End Message>' % (coded_string)

    # and write everything back
    with open('*.txt', 'w') as file:
        file.writelines(data)
I'm sure there are numerous problems, particularly with how I am opening the files and writing them back. Bonus points: 99% of the messages in this folder are in this exact format, but 1% are junk messages (they don't need to be encoded, and their line 3 is something different). I'm not too worried about them, but if they could be left unharmed in the process, that'd be nifty. Maybe line 3 should be line 2 if the count starts at 0 ...
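For comparison with the first attempt, here is a hedged sketch of the same line-index approach with the likely trouble spots addressed. It assumes Python 3 and UTF-8-compatible text, and the names `encode_third_line` and `encode_folder` are hypothetical. The points it illustrates: `open()` does not expand wildcards, so `glob.glob` is used to list the `.txt` files; the third line is index 2 because lists are zero-based; `str.strip("<Message>")` removes any of those characters from both ends rather than the substring, so the tags are sliced off instead; and `base64.b64encode` returns its result rather than modifying its argument.

```python
import base64
import glob
import os

PREFIX = "<Message>"
SUFFIX = "<End Message>"


def encode_third_line(lines):
    """Return a copy of lines with the third line's payload base64-encoded.

    If the third line is missing or lacks the expected tags (a junk
    message), the input list is returned unchanged.
    """
    if len(lines) < 3:
        return lines
    body = lines[2].strip()
    if not (body.startswith(PREFIX) and body.endswith(SUFFIX)):
        return lines
    # Slice the tags off by length instead of using str.strip on them.
    text = body[len(PREFIX):-len(SUFFIX)].strip()
    # Keep the return value of b64encode; it does not work in place.
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    newline = "\n" if lines[2].endswith("\n") else ""
    result = list(lines)
    result[2] = "{} {} {}{}".format(PREFIX, encoded, SUFFIX, newline)
    return result


def encode_folder(input_dir, output_dir):
    """Write a processed copy of every .txt file in input_dir to output_dir."""
    for path in glob.glob(os.path.join(input_dir, "*.txt")):
        with open(path, "r") as src:
            lines = src.readlines()
        target = os.path.join(output_dir, os.path.basename(path))
        with open(target, "w") as dst:
            dst.writelines(encode_third_line(lines))
```

Calling `encode_folder(directory, output)` with the two folder names from the prompts would then leave the input files alone and write the results to the output folder, which must already exist.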
Edit: I'm now trying this:
    import re, base64
    import os

    folder = 'C:/Users/xxx/Desktop/input'
    matcher = re.compile("<Message>(?P<text>[^<]*)<End Message>")

    for filename in os.listdir(folder):
        infilename = os.path.join(folder, filename)
        if not os.path.isfile(infilename): continue

        base, extension = os.path.splitext(filename)

        filein = open(infilename, 'r')
        fileout = open(os.path.join(folder, '{}_edit.{}'.format(base, extension)), 'w')

    for line in filein:
        match = matcher.search(line)
        if match:
            fileout.write("<message> " + base64.b64encode(match.group('text').strip()) + " <End message>\n")
        else:
            fileout.write(line)

    filein.close()
    fileout.close()
Ultimately, this gives me a bunch of blank files, except for the last one, which is encoded properly.
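That symptom (every output file empty except the last) is what one would expect if the `for line in filein:` loop runs after the `for filename in ...` loop has finished rather than inside it: each pass opens and truncates an output file, but only the final pair of file handles is ever read and written. The sketch below is one possible arrangement that keeps all per-file work inside the outer loop. It assumes Python 3 and UTF-8-compatible text; the function name `encode_folder_copies` and the rule for skipping `_edit` files are assumptions made for illustration.

```python
import base64
import os
import re

MATCHER = re.compile(r"<Message>(?P<text>[^<]*)<End Message>")


def encode_folder_copies(folder):
    """For each file in folder, write an encoded copy named <base>_edit<ext>."""
    for filename in os.listdir(folder):
        infilename = os.path.join(folder, filename)
        if not os.path.isfile(infilename):
            continue
        # splitext keeps the dot in the extension, so no extra '.' is needed.
        base, extension = os.path.splitext(filename)
        if base.endswith("_edit"):
            continue  # assumed: skip copies left over from an earlier run
        outfilename = os.path.join(folder, "{}_edit{}".format(base, extension))
        # Reading and writing happen here, inside the per-file loop body.
        # The with statement closes both files even if an error occurs.
        with open(infilename, "r") as filein, open(outfilename, "w") as fileout:
            for line in filein:
                match = MATCHER.search(line)
                if match:
                    text = match.group("text").strip()
                    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
                    fileout.write("<Message> {} <End Message>\n".format(encoded))
                else:
                    # Headers and junk lines are copied through unchanged.
                    fileout.write(line)
```

Two smaller details differ from the attempt above: the output tags keep the original capitalization (`<Message>`, `<End Message>`), and the output name is built as `'{}_edit{}'` because `os.path.splitext` already returns the extension with its leading dot.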