Take a look at my answer to "Divide page in 2 parts so we can fill each with different source" (this is yet another question answered in The Best iText Questions on StackOverflow). In that example, we read a series of text files that are stored in UTF-8. To achieve this, we use this method:
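public Phrase createPhrase(String path) throws IOException {
    Phrase p = new Phrase();
    // Decode the file explicitly as UTF-8; try-with-resources closes the reader even if reading fails.
    try (BufferedReader in = new BufferedReader(
            new InputStreamReader(new FileInputStream(path), "UTF8"))) {
        String str;
        // readLine() strips the line terminator, so the lines are added back to back.
        while ((str = in.readLine()) != null) {
            p.add(str);
        }
    }
    return p;
}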
If you remove the "UTF8" argument, the file is decoded with the platform's default charset. If that default is a single-byte encoding, so that the text is read as if it were ASCII, then you'd get the same behavior you are describing in your question: each byte would be treated as a single character, whereas you have characters that require two bytes.
This is not really an iText question. This is a pure encoding question.
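To see the effect without iText at all, here is a minimal sketch in plain Java. The class name EncodingDemo and the helper decode are made up for this illustration, and ISO-8859-1 only stands in for "some single-byte default charset"; your platform default may be different.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of iText or of the original example.
public class EncodingDemo {

    // Decodes the given bytes with the charset chosen by the caller.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "é" (U+00E9) is one character, but UTF-8 stores it as two bytes: 0xC3 0xA9.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Correct: the two bytes are decoded back into one character.
        String right = decode(utf8, StandardCharsets.UTF_8);

        // Wrong: a single-byte charset turns each byte into its own character.
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length);    // 2
        System.out.println(right.length()); // 1
        System.out.println(wrong.length()); // 2
    }
}
```

The sketch prints lengths rather than the strings themselves so that the result does not depend on the encoding of your console. The two characters in the wrongly decoded string are the kind of garbage you see when a UTF-8 file is read with the wrong charset.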