4

This is a follow-up to the question "How to export fonts in Gujarati-Indian Language to pdf?". @amedee-van-gasse, QA Engineer at iText, asked me to post a question specific to iText with a relevant MCVE.

Why is the Unicode sequence \u0ab9\u0abf\u0aaa\u0acd\u0ab8 not rendered correctly?

It should be rendered like this:

હિપ્સ (also verified with unicode-converter)
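For reference, the five code points in that sequence can be listed with plain Java, independent of iText. This is only a minimal sketch; the class name `CodePointDump` and the method `describe` are made up for illustration. It shows that the sequence is stored in logical order: the letter HA, the vowel sign I, then PA, a virama and SA.

```java
public class CodePointDump {

    /** Returns one line per code point in the form "U+XXXX NAME". */
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp ->
            sb.append(String.format("U+%04X %s", cp, Character.getName(cp)))
              .append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same string as in the question, in logical (storage) order.
        System.out.print(describe("\u0ab9\u0abf\u0aaa\u0acd\u0ab8"));
    }
}
```

Note that the vowel sign (U+0ABF) is stored after the consonant it belongs to, although it is drawn to the left of that consonant in the expected rendering.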

However, this code (an example adapted from iText: Chapter 11: Choosing the right font)

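import java.io.FileOutputStream;
import java.io.IOException;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Font;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.BaseFont;
import com.itextpdf.text.pdf.ColumnText;
import com.itextpdf.text.pdf.PdfWriter;

public class FontTest {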

    /** The resulting PDF file. */
    public static final String RESULT = "fontTest.pdf";
    /** the text to render. */
    public static final String TEST = "\u0ab9\u0abf\u0aaa\u0acd\u0ab8";

    public void createPdf(String filename) throws IOException, DocumentException {
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(filename));
        document.open();
        BaseFont bf = BaseFont.createFont(
            "ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        Font font = new Font(bf, 20);
        ColumnText column = new ColumnText(writer.getDirectContent());
        column.setSimpleColumn(36, 730, 569, 36);
        column.addElement(new Paragraph(TEST, font));
        column.go();
        document.close();
        System.out.println("DONE");
    }

    public static void main(String[] args) throws IOException, DocumentException {
        new FontTest().createPdf(RESULT);
    }
}

Generates this result:

[image: pdf output]

That looks different from

હિપ્સ

I have tested with itextpdf-5.5.4.jar, itextpdf-5.5.9.jar and also itext-2.1.7.js3.jar (distributed with JasperReports).

The font used is ARIALUNI.TTF, the one distributed with MS Office; it can be downloaded from here: Arial Unicode MS. *There may be legal issues with downloading it; see Mike 'Pomax' Kamermans' comment.

Petter Friberg
  • Note that your download link is ... not 100% legal. Arial Unicode comes bundled with Microsoft Office for free, but that doesn't make the font itself free. If you look at http://www.fonts.com/font/monotype/arial-unicode, it's quite clear that this is a *very expensive font* (US$370 for the two families, if you didn't buy Office). – Mike 'Pomax' Kamermans Apr 15 '16 at 22:20
  • Another question is where the problem lies: what text shaper does iText rely on, and have you tried seeing what *it* does when presented with the Unicode sequence and font resource? This could be iText, but it could also be whatever Java shaper iText relies on. – Mike 'Pomax' Kamermans Apr 15 '16 at 22:25
  • FYI only: Gujarati requires reordering the glyphs *before* the font's own OpenType features are applied. There is no code inside that font file to reorder; it's left to the renderer software to preprocess the string first. See also [Microsoft's notes on Gujarati](https://www.microsoft.com/typography/OpenTypeDev/gujarati/intro.htm#reor). – Jongware Apr 16 '16 at 01:09
  • iText(Sharp) currently doesn't support ligatures. The next version of iTextSharp (to be presented at the Great Indian Developer Summit in Bangalore in about a week) will support ligatures, but the typography addon that will add this support will not be offered as open source. We decided to make it a closed source, commercial addon because too many people think that open source is a synonym of "free of charge" (which it isn't). – Bruno Lowagie Apr 16 '16 at 05:52
  • Thanks for your comment Bruno, message received, good luck at Summit. – Petter Friberg Apr 16 '16 at 20:30

2 Answers

12

Neither iText5 nor iText2 (which is a very outdated version, by the way) supports rendering of Indic scripts, no matter which font you select.

Rendering Indic scripts differs from rendering Latin scripts: a long series of additional steps is needed to get the correct result, e.g. some characters must first be reordered according to the rules of the language.
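To illustrate just that one reordering step, here is a rough sketch in plain Java. It is not iText's implementation and not a real shaper: the class `PreBaseMatraSketch` and its methods are hypothetical names, and it assumes only the simplest rule, namely that the Gujarati vowel sign I (U+0ABF) is drawn to the left of the consonant cluster it follows in logical order. Everything else a shaping engine does (conjunct formation, reph, other matras, glyph positioning) is left out.

```java
public class PreBaseMatraSketch {

    private static final char VOWEL_SIGN_I = '\u0ABF';
    private static final char VIRAMA = '\u0ACD';

    /** Gujarati consonants occupy U+0A95..U+0AB9 (a few slots in the range are unassigned). */
    static boolean isConsonant(char c) {
        return c >= '\u0A95' && c <= '\u0AB9';
    }

    /**
     * Moves each vowel sign I in front of the cluster
     * "consonant (virama consonant)*" that precedes it.
     */
    static String toVisualOrder(String logical) {
        StringBuilder out = new StringBuilder(logical.length());
        for (int i = 0; i < logical.length(); i++) {
            char c = logical.charAt(i);
            if (c != VOWEL_SIGN_I) {
                out.append(c);
                continue;
            }
            // Walk back over the cluster that has already been emitted.
            int start = out.length();
            while (start > 0 && isConsonant(out.charAt(start - 1))) {
                start--;
                boolean joined = start >= 2
                        && out.charAt(start - 1) == VIRAMA
                        && isConsonant(out.charAt(start - 2));
                if (!joined) {
                    break;
                }
                start--; // step over the virama; the loop then consumes the consonant
            }
            out.insert(start, c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String visual = toVisualOrder("\u0ab9\u0abf\u0aaa\u0acd\u0ab8");
        for (char c : visual.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

For the string from the question, the vowel sign moves in front of HA while PA, virama and SA stay in place. A renderer that skips this step draws the vowel sign after the consonant, which would explain output that does not match the expected text.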

This is a known issue at iText.

There is a stub implementation for Gujarati in iText5 called GujaratiLigaturizer, but it is very limited and you cannot expect correct results from it.

You can try to process your string with this ligaturizer and then output the resulting string, as follows:

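import com.itextpdf.text.pdf.languages.GujaratiLigaturizer;
import com.itextpdf.text.pdf.languages.IndicLigaturizer;

IndicLigaturizer g = new GujaratiLigaturizer();
String processed = g.process(inputString);
// proceed with the processed string, e.g. new Paragraph(processed, font)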
Alexey Subach
  • Thanks for your answer, I will test it. Do you know of any better implementations? FYI, I am using the older version (iText 2) because it is distributed with the latest JasperReports distribution (I think they had some problem with PDF/A or a legal issue). – Petter Friberg Apr 15 '16 at 22:54
  • thanx a ton.. wasted a lot of hours trying to find a fix... – Shoaeb Jul 01 '20 at 16:52
0

Build your application with the latest typography jar file; that will solve your problem of rendering Gujarati fonts in PDF with iText.

Md Ajmat