I am working with a text file that has text laid out like below:
SCN DD1251
UPSTREAM DOWNSTREAM FILTER
NODE LINK NODE LINK LINK
DD1271 C DD1271 R
DD1351 D DD1351 B
E
SCN DD1271
UPSTREAM DOWNSTREAM FILTER
NODE LINK NODE LINK LINK
DD1301 T DD1301 A
DD1251 R DD1251 C
SCN DD1301
UPSTREAM DOWNSTREAM FILTER
NODE LINK NODE LINK LINK
DD1271 A DD1271 T
B
C
D
SCN DD1351
UPSTREAM DOWNSTREAM FILTER
NODE LINK NODE LINK LINK
A DD1251 D
DD1251 B
C
I am currently using the following regex pattern to match a node followed by the five-space gap and the link letter, for example:
DD1251 B
[A-Z]{2}[0-9]{3}[0-9A-Z] [A-Z]
My goal is to replace the 5 wide space with an underscore to look like so:
DD1251_B
I am trying to achieve this using the following code:
import re

def RemoveLinkSpace(input_file, output_file, pattern):
    with open(str(input_file) + ".txt", "r") as file_input:
        with open(str(output_file) + ".txt", "w") as output:
            for line in file_input:
                # sub() swaps the entire match (node, gap and link letter) for "_"
                line = pattern.sub("_", line)
                output.write(line)

upstream_pattern = re.compile(r"[A-Z]{2}[0-9]{3}[0-9A-Z] [A-Z]")
RemoveLinkSpace("File1", "File2", upstream_pattern)
However, this produces a text file that looks like this:
SCN DD1251
UPSTREAM DOWNSTREAM FILTER
NODE LINK NODE LINK LINK
_ C DD1271 R
_ D DD1351 B
E
SCN DD1271
UPSTREAM DOWNSTREAM FILTER
NODE LINK NODE LINK LINK
_ T DD1301 A
_ R DD1251 C
My question is: is there a way to still match on the entire regex, but replace only the spaces contained within it?
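For reference, one approach that might do this is to put the node and the link letter in capture groups and refer back to them in the replacement string, so only the gap between them changes. This is a minimal sketch, not a tested fix for the real file: the names `link_pattern` and `join_node_and_link` are made up for illustration, the gap is assumed to be one or more plain spaces, and the trailing `\b` is an assumption to stop the link letter from matching the first character of a following node.

```python
import re

# Group 1 is the node, group 2 is the link letter; the spaces between them
# are matched but not captured, so they can be swapped for "_".
# Assumption: the gap is one or more spaces. Use " {5}" if it is always exactly five.
link_pattern = re.compile(r"([A-Z]{2}[0-9]{3}[0-9A-Z]) +([A-Z])\b")

def join_node_and_link(line):
    # \1 and \2 put the captured node and link letter back around the underscore
    return link_pattern.sub(r"\1_\2", line)

sample = "DD1271     C     DD1271     R\n"
print(join_node_and_link(sample), end="")
```

In the function from the question, the equivalent change would be passing a pattern with groups like this one and calling `pattern.sub(r"\1_\2", line)` instead of `pattern.sub("_", line)`.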