I want to delete any text from a string in Python that matches something like "\nPage 10 of 12\n", where 10 and 12 can be any numbers (I am looping through 300+ documents that all have different page counts). Below is an example of the text in my string, followed by the output I want:
thisisaboutthen\n\n\nPage 2 of 12\n\nnowwearegoing\n\nPage 3 of 12\n\n\n\
Output -> thisisaboutthennnowwearegoing
This is the code I am trying (inside a function, with re imported):
page = r'\nPage \b\d+\b of \b\d+\b\n+'
return re.sub(page, '', string)
But I can't get it to work. I looked at the question "Python: Extract numbers from a string" for help, but I can't work out how to combine numbers and letters in one pattern.
I'm new to regex in Python, and any help would be great. I have been able to get a regex to work when the pattern is only letters or only numbers, but I run into problems when combining the two.
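In case it helps, here is a minimal, self-contained sketch that reproduces my attempt. It assumes the text contains real newline characters (not a literal backslash followed by the letter n). The names ATTEMPT, VARIANT, strip_page_markers and the sample string are only for illustration, and the second pattern is just a guess at a variant that also consumes the newlines in front of the marker; I am not sure it is the right approach.

```python
import re

# Pattern from my attempt: one newline, the marker, then one or more newlines.
ATTEMPT = re.compile(r'\nPage \b\d+\b of \b\d+\b\n+')

# Hypothetical variant (an assumption, not a confirmed fix):
# consume any run of newlines on both sides of the marker.
VARIANT = re.compile(r'\n*Page \d+ of \d+\n*')


def strip_page_markers(text, pattern=ATTEMPT):
    """Remove every match of `pattern` from `text`."""
    return pattern.sub('', text)


sample = "thisisaboutthen\n\n\nPage 2 of 12\n\nnowwearegoing\n\nPage 3 of 12\n\n\n"

# repr() makes any newlines that survive the substitution visible.
print(repr(strip_page_markers(sample)))
print(repr(strip_page_markers(sample, VARIANT)))
```

With real newlines in the sample, the first call still leaves the newlines that come before each marker, while the second removes them as well. The variant would also match a marker that appears in the middle of a line, which may or may not be acceptable for these documents.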
Thanks in advance