2

I have a problem encoding Polish characters with the PoDoFo library.

This code produces an incorrectly encoded version of the word 'Łódź':

#include <podofo/podofo.h>

using namespace PoDoFo;

int main(int argc, char *argv[], char *env[]) {
    PdfStreamedDocument document("polish.pdf");
    PdfPainter painter;
    PdfPage* pPage;


    pPage = document.CreatePage( PdfPage::CreateStandardPageSize( ePdfPageSize_A4 ) );
    painter.SetPage( pPage );
    PdfFont* pFont = document.CreateFont("Helvetica");
//  PdfFont* pFont = document.CreateFont("Helvetica", new PdfIdentityEncoding(0, 0xffff, true) );
    PdfString pString("Polish word: Łódź");
//  PdfString pString(reinterpret_cast<const pdf_utf8*>("Polish word: Łódź"));

    painter.SetFont( pFont );
    painter.DrawText( 100.0, pPage->GetPageSize().GetHeight()-100.0, pString );
    painter.FinishPage();
    document.Close();

    return 0;
}

The output PDF contains:

Polish word: ņÃ3dÅo

I've tried changing the encoding of the source string (using the commented-out lines in the sample code), but every attempt failed.

Can somebody explain how to create a PDF document containing non-ASCII characters using the PoDoFo library?

Markoj
  • Two obvious errors. You are expecting the PDF renderer to be able to parse UTF8 (which you are using, whether you knew it or not). Second, you are expecting the built-in Helvetica to contain Polish characters -- see http://stackoverflow.com/questions/26631815/cant-get-czech-characters-while-generating-a-pdf for more on that. – Jongware Jan 13 '15 at 18:14
  • @Jongware I've modified my code and tried an embedded font that contains Polish characters, Liberation Serif. Here's the code: PdfFont* pFont = document.CreateFontSubset("LiberationSerif", false, false, PdfEncodingFactory::GlobalWinAnsiEncodingInstance(), "fonts/LiberationSerif-Regular.ttf"); but it still doesn't work... – Markoj Jan 19 '15 at 12:20

3 Answers

1
#include <podofo/podofo.h>

using namespace PoDoFo;

int main(int argc, char *argv[], char *env[]) {
    PdfStreamedDocument document("polish.pdf");
    PdfPainter painter;
    PdfPage* pPage;


    pPage = document.CreatePage( PdfPage::CreateStandardPageSize( ePdfPageSize_A4 ) );
    painter.SetPage( pPage );
    const PdfEncoding* pEncoding = new PdfIdentityEncoding(); // required for UTF8 characters
    PdfFont *pFont = document.CreateFont("LiberationSerif", false, false, pEncoding ); // LiberationSerif has Polish characters
    PdfString pString(reinterpret_cast<const pdf_utf8*>("Polish word: Łódź")); // Need to cast input string into pdf_utf8
    painter.SetFont( pFont );
    painter.DrawText( 100.0, pPage->GetPageSize().GetHeight()-100.0, pString );
    painter.FinishPage();
    document.Close();

    return 0;
}

This code worked for me.
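
For illustration only, here is a small PoDoFo-free sketch of why the cast to pdf_utf8 seems to matter: in a UTF-8 source file, 'Łódź' is stored as seven bytes rather than four characters, so any code that treats each byte as one character will print garbage similar to the output in the question. The names kLodz, hexBytes and countCodePoints are hypothetical helpers made up for this example, not part of PoDoFo, and the word is spelled out with explicit byte escapes on the assumption that the source file would otherwise be saved as UTF-8.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// "Łódź" as explicit UTF-8 bytes (assumption: this is what a UTF-8 source file contains).
// The literal is split around "d" so the hex escape \xB3 does not swallow the 'd'.
const std::string kLodz = "\xC5\x81\xC3\xB3" "d" "\xC5\xBA";

// Hypothetical helper: returns the raw bytes of s as space-separated lowercase hex.
std::string hexBytes(const std::string& s) {
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i != 0) {
            out << ' ';
        }
        out << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(s[i]));
    }
    return out.str();
}

// Hypothetical helper: counts UTF-8 code points by skipping
// continuation bytes, which all have the bit pattern 10xxxxxx.
std::size_t countCodePoints(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

Here hexBytes(kLodz) returns "c5 81 c3 b3 64 c5 ba", kLodz.size() is 7 and countCodePoints(kLodz) is 4. Only the 'd' is a single byte; each of the other three letters takes two bytes, which a single-byte encoding would render as two unrelated glyphs.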

Markoj
0

Using PdfIdentityEncoding fixes the display of Polish characters, but with this encoding the character spacing is wrong.

SHR
0

In my case, Spanish accented characters work with:

PdfFont* pFont = document.CreateFont( "Arial" );
PdfString pString(reinterpret_cast<const pdf_utf8*>("máma mía"));
...

Otherwise, special characters may be printed incorrectly.

halfelf