
I'm running this code block to convert an HTML page to a PDF document, but the Turkish characters do not appear in 'result.pdf'. My code is:

        try {
            Rectangle pagesize = new Rectangle(800, 1200);
            final Document document = new Document(pagesize);
            OutputStream os = new FileOutputStream("deneme.pdf"); // ByteArrayOutputStream();
            PdfWriter writer = PdfWriter.getInstance(document, os);
            document.open();
            HtmlCleaner cleaner = new HtmlCleaner();
            CleanerProperties props = cleaner.getProperties();
            TagNode rootNode = cleaner.clean("Source Html");

            XmlSerializer serial = new PrettyXmlSerializer(props);
            String htmlClean = serial.getAsString(rootNode);
            System.out.println(htmlClean); // Tidy Html

            CSSResolver cssResolver = XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
            /*
            XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
            // fontProvider.setUseUnicode(true);
            fontProvider.isRegistered("Helvetica");
            fontProvider.addFontSubstitute("Helvetica", "Arial");

            CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
            */
            // HTML
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
            htmlContext.setImageProvider(new ImageProvider());

            PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
            HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
            CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
            /*
            BaseFont courier = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.EMBEDDED);
            Font font = new Font(courier, 12, Font.NORMAL);
            Chunk chunk = new Chunk("", font);
            document.add(chunk);
            */

            // XML Worker
            XMLWorker worker = new XMLWorker(css, true);
            XMLParser p = new XMLParser(worker);
            p.parse(new ByteArrayInputStream(htmlClean.getBytes("utf-8")));

            document.close();

        } catch (Exception e) {
            e.printStackTrace();
        }

I also tried the code in the commented-out blocks, but the result is the same: still wrong.

How can I get the Turkish characters to appear in the result?

When I tried this code block:

        BaseFont freeSans = BaseFont.createFont("FreeSans.ttf", "Cp1254", true);
        Font font = new Font(freeSans, 12, Font.NORMAL);
        Chunk chunk = new Chunk("ŞşĞğİıÖö", font);
        document.add(chunk);

I saw 'ŞşĞğİıÖö' in 'result.pdf'.

But how can I set this up on the XMLParser before parsing?
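One plausible reason the two attempts above behave differently is the encoding passed to BaseFont.createFont: the commented-out attempt uses CP1252, while the working one uses Cp1254. The sketch below uses only the Java standard library (no iText) to check which of the sample letters each code page can represent. The class name EncodingCheck and the method unencodable are hypothetical names made up for this illustration, and it is an assumption here that characters a single-byte encoding cannot represent are also the ones that go missing in the PDF.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class EncodingCheck {

    /** Returns the characters of text that the named charset cannot encode. */
    public static String unencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        StringBuilder missing = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!encoder.canEncode(c)) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        // The sample Turkish letters from the question, written as Unicode
        // escapes so that the source file encoding does not matter.
        String sample = "\u015E\u015F\u011E\u011F\u0130\u0131\u00D6\u00F6";
        for (String name : new String[] {"windows-1252", "windows-1254"}) {
            // Print code points instead of glyphs to avoid console encoding issues.
            StringBuilder codePoints = new StringBuilder();
            for (char c : unencodable(sample, name).toCharArray()) {
                codePoints.append(String.format(" U+%04X", (int) c));
            }
            System.out.println(name + " cannot encode:" + codePoints);
        }
    }
}
```

If windows-1252 reports missing letters while windows-1254 reports none, that would be consistent with the Cp1254 attempt showing the text and the CP1252 attempt not. The font itself still has to contain the glyphs, which is a separate requirement this check does not cover.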

Cankay87
  • Almost certainly your "Turkish Characters" are not available in the font you use. See http://stackoverflow.com/questions/26631815/cant-get-czech-characters-while-generating-a-pdf – Jongware Jan 27 '15 at 11:24
  • The answer at that link says to replace all special characters with escape codes such as "\u0106". – Cankay87 Jan 27 '15 at 11:47
  • Well, that's just the second mistake the OP made. You make the third: "you assume that Helvetica is a font that knows how to draw these glyphs". – Jongware Jan 27 '15 at 12:18
  • The third solution works, but how can I apply it to my own source code? I haven't succeeded yet. I also can't convert the HTML to PDF by adding a 'Paragraph', and I haven't changed the font. – Cankay87 Jan 27 '15 at 13:19
  • I also tried with TTF files and many combinations, but nothing changed. When I print my HTML with System.out, the characters print correctly; however, when XMLParser parses the HTML, the Turkish letters are missing. As extra information, I am using IBM WAS. Can anybody help, please? – Cankay87 Jan 27 '15 at 15:18
  • Hi, I solved this problem by using the 'CssResolver.addCss()' method. Now all the special characters display correctly. – Cankay87 Mar 31 '15 at 09:04
