English letters are part of the basic ASCII character set, so English output usually causes no problems. Other languages that use accented or entirely different letters, e.g. Arabic, Cyrillic, Greek, etc., use characters outside that basic set.
Make sure all three sources use the same encoding:
- all the PHP scripts generating the output
- the HTML encoding meta tag
- the output file as well
ad 1
Check how your editor saves the PHP scripts to the file system. How to configure this differs from editor to editor.
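If you are not sure what your editor actually wrote to disk, you can let PHP inspect the bytes of the script. This is only a rough sketch: the function name describeScriptEncoding is made up for illustration, and it assumes the mbstring extension is available. Note that a pure ASCII file also counts as valid UTF-8.

```php
<?php
// Hypothetical helper: classify a byte string as UTF-8 with BOM,
// valid UTF-8, or not valid UTF-8. Assumes the mbstring extension.
function describeScriptEncoding(string $bytes): string
{
    // A UTF-8 byte order mark is the three bytes EF BB BF.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 with BOM';
    }
    return mb_check_encoding($bytes, 'UTF-8') ? 'valid UTF-8' : 'not valid UTF-8';
}

// Inspect this very script, if it is readable as a regular file.
if (is_file(__FILE__)) {
    echo describeScriptEncoding(file_get_contents(__FILE__)), "\n";
}
```

If it reports a BOM, consider saving the script without one, because the BOM bytes in front of the opening PHP tag are sent to the browser as output.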
ad 2
Use the HTML meta tag <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
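It can also help to send the same charset in the HTTP header, since browsers give the header priority over the meta tag. A minimal sketch, assuming the script itself is saved as UTF-8; renderPage is a hypothetical helper, not part of any library:

```php
<?php
// Hypothetical helper: wrap some text in a minimal page that declares UTF-8.
function renderPage(string $text): string
{
    return '<html><head>'
        . '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
        . '</head><body>'
        // Escape the text and tell htmlspecialchars that it is UTF-8.
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8')
        . '</body></html>';
}

// Declare the charset in the HTTP header too, before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Sample text; it only displays correctly if this file is saved as UTF-8.
echo renderPage('Ελληνικά, кириллица, العربية');
```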
ad 3
Tell the tool that produces the output to use UTF-8, for example:
pdftotext -enc UTF-8 your.pdf
According to the documentation, the PdfToText class generates UTF-8-encoded text.
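If you call the pdftotext command-line tool from PHP, something along these lines may work. It is a sketch under assumptions: pdftotext is installed and on the PATH, shell_exec is not disabled, the mbstring extension is loaded, and both helper names are invented for this example. Passing "-" as the text file makes pdftotext write to standard output.

```php
<?php
// Hypothetical helper: build the pdftotext command line with UTF-8 output.
// The trailing "-" tells pdftotext to write the text to stdout.
function buildPdftotextCommand(string $pdfPath): string
{
    return 'pdftotext -enc UTF-8 ' . escapeshellarg($pdfPath) . ' -';
}

// Hypothetical helper: true when $text is a valid UTF-8 byte sequence.
function isValidUtf8(string $text): bool
{
    return mb_check_encoding($text, 'UTF-8');
}

$pdf = 'your.pdf'; // placeholder path taken from the example above
if (is_file($pdf)) {
    $text = shell_exec(buildPdftotextCommand($pdf));
    if (is_string($text) && isValidUtf8($text)) {
        echo $text;
    } else {
        error_log('Extraction failed or the output is not valid UTF-8');
    }
}
```

Checking the result before printing it makes it easier to tell whether a garbled page comes from the extraction step or from the script and HTML encoding.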