I have a number of files where I want to replace all instances of a specific string with another one.
I currently have this code:
mappings = {'original-1': 'replace-1', 'original-2': 'replace-2'}

# Open the file for in-place substitution (closed automatically on exit)
with open('file', 'r+') as replaceFile:
    # Read in all the lines
    lines = replaceFile.readlines()
    # Seek to the start of the file and truncate
    # (this is because I want to do an "inline" replace)
    replaceFile.seek(0)
    replaceFile.truncate()
    # Loop through each line from the file
    for line in lines:
        # Loop through each key in the mappings dict
        for i in mappings.keys():
            # If the key appears in the line
            if i in line:
                # Do the replacement
                line = line.replace(i, mappings[i])
        # Write the line to the file and move on to the next line
        replaceFile.write(line)
This works ok, but it is very slow for the size of the mappings and the size of the files I am dealing with.
For instance, the "mappings" dict contains 60,728 key/value pairs. I need to process up to 50 files, replacing every instance of each key with its corresponding value, and each file is approximately 250,000 lines long.
Many lines also contain several different keys that need replacing, so I can't just find the first match and move on.
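To put the slowness in perspective, here is a rough count of how many `key in line` checks the nested loops above perform. The figures are the approximate ones quoted above (50 is the upper bound on the number of files), so treat the result as an order-of-magnitude estimate only.

```python
# Rough count of substring checks performed by the nested loops.
num_keys = 60728          # key/value pairs in the mappings dict
lines_per_file = 250000   # approximate lines per file
num_files = 50            # upper bound on the number of files

# Every line is tested against every key, whether or not the key occurs in it.
checks_per_file = num_keys * lines_per_file
total_checks = checks_per_file * num_files

print(f"{checks_per_file:,} checks per file")  # 15,182,000,000
print(f"{total_checks:,} checks in total")     # 759,100,000,000
```

Almost all of those checks presumably find nothing, since any one line contains only a handful of keys.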
So my question is:
Is there a faster way to do the above? I have thought about using a regex, but I am not sure how to craft one that will do multiple in-line replaces using key/value pairs from a dict.
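A minimal sketch of what the regex idea might look like, under a few assumptions that are not confirmed above: keys are matched as literal text (hence `re.escape`), longer keys should win when one key is a prefix of another, and the dict is non-empty. The helper name `build_replacer` is made up for illustration. Whether this is actually faster with tens of thousands of alternatives is untested.

```python
import re

def build_replacer(mappings):
    """Return a function that applies every mapping in one pass over a string."""
    # Longest keys first, so 'original-10' is preferred over 'original-1'
    # when both could match at the same position.
    keys = sorted(mappings, key=len, reverse=True)
    # One alternation of all keys, each escaped so it matches literally.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The callback looks up the replacement for whichever key matched.
    return lambda text: pattern.sub(lambda m: mappings[m.group(0)], text)

mappings = {'original-1': 'replace-1', 'original-2': 'replace-2'}
replace_all = build_replacer(mappings)
print(replace_all("original-1 and original-2 on one line"))
# prints: replace-1 and replace-2 on one line
```

One difference from the loop version: this makes a single pass, so text produced by one replacement is never replaced again by another key, whereas chained `str.replace` calls can do that.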
If you need more info, let me know.