
I need to write a PDF file, and I am using this sample (http://www.vogella.com/tutorials/JavaPDF/article.html) with the library version "itextpdf-5.4.1.jar".

The PDF file is created, but when a word contains an accented character, such as "você", it is written as "vocÃª".

I found this code, but it does not work:

Document document;
...
...
document.addLanguage("pt-BR");

How do I set the encoding or language to Brazilian Portuguese?

Thanks!

Paulo Kussler
  • As your single letter 'ê' becomes two characters, could it be that your source file is UTF-8 encoded but your compiler assumes some single-byte encoding? – mkl Aug 23 '15 at 07:43
  • This is explained in great detail in [The Best iText Questions on StackOverflow](http://pages.itextpdf.com/ebook-stackoverflow-questions.html); this is a free ebook with answers to questions such as [Can't get Czech characters while generating a PDF](http://stackoverflow.com/questions/26631815/cant-get-czech-characters-while-generating-a-pdf). In your case, the encoding problem probably arises as soon as your source file is read. – Bruno Lowagie Aug 23 '15 at 07:44
  • Thanks to all! My problem really was the Android Studio file encoding. As @mkl and Bruno Lowagie explained: "This is not really an iText question. This is a pure encoding question." Thanks – Paulo Kussler Aug 23 '15 at 13:33
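
The effect described in the comments can be reproduced without iText. The following is only a minimal sketch: the class name `MojibakeDemo` and the method `misread` are hypothetical, and ISO-8859-1 is assumed as the single-byte encoding (the actual default depends on the platform and IDE settings). The string is written with a `\u00ea` escape so that the demo itself does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes those bytes with a
    // single-byte charset (ISO-8859-1 is an assumption for this demo).
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "voc\u00ea";      // "você", 4 characters
        String garbled = misread(original); // 'ê' turns into two characters

        System.out.println(original.length() + " -> " + garbled.length());
        // Print code points in hex so the console encoding does not matter.
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The two UTF-8 bytes of 'ê' (0xC3 0xAA) are each decoded as a separate character, U+00C3 and U+00AA, which display as "Ãª".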

1 Answer

Take a look at my answer to "Divide page in 2 parts so we can fill each with different source" (this is yet another question answered in The Best iText Questions on StackOverflow). In that example, we read a series of text files that are stored in UTF-8. To achieve this, we use this method:

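import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import com.itextpdf.text.Phrase;

// Reads the file at 'path' as UTF-8 and adds each line to an iText Phrase.
public Phrase createPhrase(String path) throws IOException {
    Phrase p = new Phrase();
    // Name the charset explicitly; otherwise the platform default is used.
    // try-with-resources closes the reader even if reading fails.
    try (BufferedReader in = new BufferedReader(
            new InputStreamReader(new FileInputStream(path), "UTF8"))) {
        String str;
        while ((str = in.readLine()) != null) {
            p.add(str);
        }
    }
    return p;
}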

If you remove the "UTF8" argument, the reader falls back to the platform's default charset. If that default is a single-byte encoding, you get the same behavior you describe in your question: each byte is treated as a single character, whereas your text contains characters that require two bytes.
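
To see the difference the charset argument makes when reading a file, here is a small sketch that runs without iText (the `Phrase` is replaced by a plain `String`). The names `CharsetReadDemo`, `readAll`, and `demo` are hypothetical, and ISO-8859-1 merely stands in for whatever single-byte default a platform might have.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class CharsetReadDemo {

    // Reads all lines of the file using the given charset and concatenates them.
    static String readAll(Path path, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(path), charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    // Writes "você" as UTF-8 to a temporary file, then reads it back twice:
    // index 0 with the correct charset, index 1 with a single-byte charset.
    static String[] demo() {
        try {
            Path tmp = Files.createTempFile("charset-demo", ".txt");
            try {
                Files.write(tmp, "voc\u00ea".getBytes(StandardCharsets.UTF_8));
                return new String[] {
                    readAll(tmp, StandardCharsets.UTF_8),
                    readAll(tmp, StandardCharsets.ISO_8859_1)
                };
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        // prints "4 vs 5": the misread text has one extra character
        System.out.println(result[0].length() + " vs " + result[1].length());
    }
}
```

The file on disk is identical in both reads; only the charset handed to `InputStreamReader` changes the resulting text.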

This is not really an iText question. This is a pure encoding question.

Bruno Lowagie