I'm trying to search a PDF using PyPDF2 and return the page number on which the search term is found, using re.search. However, when the term contains a hyphen, the search fails. For example, searching for "abc-123" returns nothing. The code below works for a search of "123" or "abc" but will not find "abc-123". The code is from this thread.
import re
import PyPDF2

# Open the pdf file
pdfFileObj = open('example.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
# Number of pages to loop over
NumPages = pdfReader.numPages
String = 'abc-123'
# Extract text and do the search
for i in range(0, NumPages):
    PageObj = pdfReader.getPage(i)
    Text = PageObj.extractText()
    # i is zero-based, so the first page is reported as 0
    if re.search(String, Text):
        print("Pattern Found on Page: " + str(i))
pdfFileObj.close()
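One possibility I have not ruled out is that the PDF stores a Unicode hyphen or dash rather than the ASCII "-", so the extracted text never contains the exact string I search for. Below is a minimal sketch of a check for that, assuming this is the cause. The names normalize_dashes and pages_containing are hypothetical helpers, not part of PyPDF2, and page_texts stands for the per-page strings returned by extractText().

```python
import re

# Hyphen-like characters that text extraction might return instead of "-".
# This list is an assumption, not something confirmed for example.pdf:
# hyphen, non-breaking hyphen, figure dash, en dash, em dash, minus sign.
DASHES = "\u2010\u2011\u2012\u2013\u2014\u2212"
DASH_TABLE = str.maketrans({ch: "-" for ch in DASHES})


def normalize_dashes(text):
    """Map Unicode hyphen/dash variants to the ASCII hyphen-minus."""
    return text.translate(DASH_TABLE)


def pages_containing(page_texts, term):
    """Return zero-based indices of pages whose normalized text contains term."""
    # re.escape makes the term match literally, whatever characters it holds.
    pattern = re.compile(re.escape(normalize_dashes(term)))
    return [i for i, text in enumerate(page_texts)
            if pattern.search(normalize_dashes(text))]
```

Printing repr(Text) for a page that is known to contain the term should show which character is actually there, and whether the term is split by a line break instead.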
Appreciate any help. Thanks in advance!