I have a set of keywords collected beforehand, and I want to search for them in a PDF document with Python and highlight them. Is this feasible with a library such as pdfMiner?
-
If you're on a mac, it may be better to use AppleScript via osascript – scohe001 Sep 09 '13 at 01:38
-
nope, on linux machine – erogol Sep 09 '13 at 13:17
-
Possible duplicate of [read, highlight, save PDF programmatically](https://stackoverflow.com/questions/7605577/read-highlight-save-pdf-programmatically) – Martin Thoma Jul 13 '17 at 13:29
1 Answer
Yes, you can use the PyMuPDF library. Install it with `pip install PyMuPDF`, then use the following code:
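import fitz  # PyMuPDF

### READ IN PDF
doc = fitz.open(r"D:\XXXX\XXX.pdf")
page = doc[0]  # first page only
text = "Amey"
# search_for returns one rectangle per match (older releases: page.searchFor)
text_instances = page.search_for(text)

### HIGHLIGHT
for inst in text_instances:
    print(inst, type(inst))
    # older releases: page.addHighlightAnnot
    highlight = page.add_highlight_annot(inst)

### OUTPUT
# Save under a new name (placeholder); saving over the opened file fails unless incremental=True
doc.save(r"D:\XXXX\XXX_highlighted.pdf", garbage=4, deflate=True, clean=True)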
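
The snippet above handles one keyword on the first page only, while the question asks about several keywords. A minimal sketch of how it might be extended to every page and a list of keywords follows. The helper names `highlight_keywords` and `highlight_pdf` and the paths in the usage note are hypothetical, and it assumes a recent PyMuPDF release in which `search_for` and `add_highlight_annot` are available.

```python
from typing import Iterable


def highlight_keywords(doc, keywords: Iterable[str]) -> int:
    """Highlight every match of every keyword on every page of `doc`.

    `doc` is assumed to be an open PyMuPDF document (anything that yields
    pages offering search_for / add_highlight_annot will work).
    Returns the number of highlights added.
    """
    keywords = list(keywords)  # allow generators to be reused per page
    count = 0
    for page in doc:
        for keyword in keywords:
            # search_for returns one rectangle per match on this page
            for rect in page.search_for(keyword):
                page.add_highlight_annot(rect)
                count += 1
    return count


def highlight_pdf(input_path: str, output_path: str, keywords: Iterable[str]) -> int:
    """Open input_path, highlight the keywords, and save to output_path.

    output_path should differ from input_path, because a plain save()
    over the opened file is rejected unless it is incremental.
    """
    import fitz  # PyMuPDF; imported here so the helper above has no hard dependency

    doc = fitz.open(input_path)
    try:
        count = highlight_keywords(doc, keywords)
        doc.save(output_path, garbage=4, deflate=True, clean=True)
    finally:
        doc.close()
    return count
```

A call such as `highlight_pdf("input.pdf", "output_highlighted.pdf", ["Amey", "PyMuPDF"])` (placeholder file names) would then write a highlighted copy and return how many matches were marked.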

Amey P Naik