I need help writing a Python script that reads a PDF file, copies every word in it into a new .txt file (one word per line), then removes the repeated words, counts the words that remain, and prints that count on the last line.
2 Answers
Did you search Stack Overflow for answers?
Here you can find some pretty good answers about how to extract text from a PDF file (look at Jakobovski's answer): How to extract text from a PDF file?
Here you can find information about writing/editing/creating .txt files: https://www.guru99.com/reading-and-writing-files-in-python.html
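Once the text has been extracted, the remaining steps in the question (one word per line, drop the repeats, write the count last) could be sketched roughly as below. This is only a minimal sketch: the function name `write_unique_words` is made up for illustration, words are lowercased and split on non-word characters (so "It's" becomes "it" and "s"), and "count" is assumed to mean the number of distinct words.

```python
import re


def write_unique_words(text, out_path):
    """Write each distinct word of `text` on its own line, then the count."""
    # Assumption: a "word" is any run of letters, digits, or underscores,
    # compared case-insensitively.
    words = re.findall(r"\w+", text.lower())

    # dict.fromkeys removes repeats while keeping first-seen order.
    unique_words = list(dict.fromkeys(words))

    with open(out_path, "w", encoding="utf-8") as out_file:
        for word in unique_words:
            out_file.write(word + "\n")
        # Last line: how many distinct words were written.
        out_file.write(str(len(unique_words)) + "\n")

    return unique_words
```

For example, `write_unique_words("The cat and the hat", "words.txt")` would write `the`, `cat`, `and`, `hat` on separate lines, followed by `4`.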

I didn't find what I want. If you know how to write the script, can you write it please? – AbdulRhman Fawzy Apr 26 '19 at 01:49
Install these libraries.
PyPDF2 (To convert simple, text-based PDF files into text readable by Python)
textract (To convert non-trivial, scanned PDF files into text readable by Python)
nltk (To clean and convert phrases into keywords)
Each of these libraries can be installed by running the following command in a terminal (on macOS):
pip install Libraryname
See this tutorial: https://medium.com/@rqaiserr/how-to-convert-pdfs-into-searchable-key-words-with-python-85aab86c544f
Use textract; it supports many file types, including PDF, so textract is the better choice.
Follow these links.
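For a simple, text-based PDF, the PyPDF2 route listed above might look something like the following. Treat it as a sketch under assumptions: it assumes PyPDF2 2.0 or later, which provides `PdfReader` (older releases used `PdfFileReader` instead), and the helper names `pages_to_text` and `pdf_to_text` are invented for this example. Scanned PDFs will usually come back empty this way, which is where textract comes in.

```python
def pages_to_text(pages):
    """Join the text of each page object, one page per line break."""
    # extract_text() may return None or "" for pages with no text layer.
    return "\n".join((page.extract_text() or "") for page in pages)


def pdf_to_text(pdf_path):
    """Return all extractable text from a text-based PDF."""
    # Imported here so the helper above works even without PyPDF2 installed.
    # Assumption: PyPDF2 >= 2.0 is installed (pip install PyPDF2).
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    return pages_to_text(reader.pages)
```

The returned string can then be split into words and written to a .txt file with ordinary file handling.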

Abdul Rhaman, simply open a command prompt, type cd followed by the path of the script folder, then type pip install textract and press Enter; the textract library will start installing. – MIH Apr 26 '19 at 15:07
Study the links that I provided above; they will solve your problem, Inshallah. – MIH Apr 26 '19 at 15:10