I have two folders of RTF files; the filenames are the same in both folders.
Each RTF file has a header and a footer on every page.
I need to ignore the header and footer on every page and compare only the body content.
This is what I have so far. I'm not sure whether it is the right approach, but it works (it reads each pair of files and compares their exact content):
import os

txt_1 = 'D:\\files\\1'
txt_2 = 'D:\\files\\2'

# File names present in each folder
fol_1 = os.listdir(txt_1)
fol_2 = set(os.listdir(txt_2))

for name in fol_1:
    # Only compare files that exist in both folders
    if name in fol_2:
        with open(os.path.join(txt_1, name)) as f1:
            file_1 = f1.read()
        with open(os.path.join(txt_2, name)) as f2:
            file_2 = f2.read()
        if file_1 == file_2:
            print('Matches')
        else:
            print("Files didn't match: " + name)
Is there any way to ignore the header and footer on every page?
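To make the question concrete, here is a minimal sketch of one possible direction. It assumes the header and footer are stored in the raw RTF as `{\header ...}` and `{\footer ...}` groups (including the `l`, `r` and `f` variants such as `\headerl`), and it removes those groups by matching braces before the comparison. The function name `strip_header_footer_groups` is hypothetical, and the sketch does not try to cover every RTF edge case (for example embedded binary data).

```python
import re

# Start of an RTF header/footer group, e.g. {\header, {\footerr, {\headerf.
# The lookahead keeps it from matching longer control words.
_GROUP_START = re.compile(r"\{\\(?:header|footer)[lrf]?(?![a-z])")


def strip_header_footer_groups(rtf):
    """Return the RTF text with every header/footer group removed."""
    out = []
    pos = 0
    while True:
        match = _GROUP_START.search(rtf, pos)
        if match is None:
            out.append(rtf[pos:])
            break
        out.append(rtf[pos:match.start()])
        # Walk forward to the brace that closes this group.
        depth = 0
        i = match.start()
        while i < len(rtf):
            ch = rtf[i]
            if ch == "\\":
                i += 2  # skip the backslash and the character it escapes
                continue
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    break
            i += 1
        pos = i + 1
    return "".join(out)


# Made-up sample documents: same body, different header/footer text.
doc_a = r"{\rtf1{\header Page {\b 1}}Body text{\footer Draft A}}"
doc_b = r"{\rtf1{\header Page {\b 7}}Body text{\footer Draft B}}"
print(strip_header_footer_groups(doc_a))  # {\rtf1Body text}
print(strip_header_footer_groups(doc_a) == strip_header_footer_groups(doc_b))  # True
```

With something like this, `file_1` and `file_2` could each be passed through the function before the `==` check, so that only the remaining content is compared.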
Or is there a way to ignore a specific string or line in the header/footer?
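As an illustration of the string/line idea, here is a minimal sketch that drops every line containing a known marker and compares what is left. It assumes the header/footer text sits on its own lines and contains a recognisable substring; the `body_lines` helper and the marker strings are hypothetical placeholders, not taken from my real files.

```python
def body_lines(text, ignored_markers):
    """Return the lines of text that contain none of the ignored markers."""
    return [
        line.strip()
        for line in text.splitlines()
        if not any(marker in line for marker in ignored_markers)
    ]


# Hypothetical markers that identify header/footer lines.
IGNORED = ("Page ", "Generated on")

a = "Page 1 of 2\nResult: 42\nGenerated on 2020-01-01"
b = "Page 1 of 3\nResult: 42\nGenerated on 2021-05-05"
print(body_lines(a, IGNORED) == body_lines(b, IGNORED))  # True
```

The downside of this approach is that any body line that happens to contain one of the markers would also be ignored.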
Are there any Python modules I should look into?
Any suggestions would be appreciated, thank you!