How can I read a Unicode PDF using iTextSharp in C#?

Question

I am working on a Unicode (Marathi) based project, and my task is to read Unicode text from a PDF in the following fonts.

When I read the PDF using iTextSharp, I get the text as:

ररजज - (एस 13) महरररषष
पररप मतदरर जरदद 2014

where the actual text should be

राज्य - (एस-१३) महाराष्ट्र  प्रारूप मतदार यादी २०१४

Please suggest a solution if anyone has an idea about this.

Your output shows you are already reading "Unicode" (if that failed you would not have seen Marathi). Can you provide a link to a sample PDF with this behavior? — Jongware, Jan 09 '14 at 12:49
Please check; here is the link to the sample PDF: [Click here to view sample pdf](https://www.dropbox.com/s/ezz015t3qdqo5hk/test.pdf) — Pandurang Pailvan, Jan 09 '14 at 13:17
This PDF has *exactly* the same problem as described in http://stackoverflow.com/a/15566820/2564301 -- up to and including the same duplicate Unicode code points. This can't be solved by iTextSharp, nor by Acrobat Pro. — Jongware, Jan 09 '14 at 13:43
Hi, I am also creating a PDF using iTextSharp, but when the PDF is printed with Marathi text, some joined words are not printed correctly. For example, मिरची is printed as मरीची and पत्ते is printed as पतेते. Please suggest a solution. — banny, Mar 19 '14 at 08:52
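
One way to check the point made in the comments above (that the extracted text is already valid Unicode, just with the wrong code points) is to dump the code points of the extracted and the expected strings side by side. The following is a minimal sketch in plain C# and makes no iTextSharp calls. The class name `CodePointDump` and the method `Describe` are hypothetical names chosen for illustration, and the two sample strings are simply the first word copied from the question.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointDump
{
    // Lists each UTF-16 code unit of the text as U+XXXX. Devanagari lies in the
    // Basic Multilingual Plane, so each code unit is one code point here.
    public static string Describe(string text)
    {
        var parts = new List<string>();
        foreach (char c in text)
        {
            parts.Add("U+" + ((int)c).ToString("X4"));
        }
        return string.Join(" ", parts);
    }

    public static void Main()
    {
        // First word from the question: what the extraction returned
        // versus what the page actually shows.
        string extracted = "ररजज";
        string expected = "राज्य";

        Console.WriteLine("extracted: " + Describe(extracted));
        Console.WriteLine("expected:  " + Describe(expected));
    }
}
```

If the extracted line contains only Devanagari code points (U+0900 to U+097F) but repeats a base consonant where the expected line has a vowel sign or virama, that would be consistent with the diagnosis in the comments: the PDF maps several different glyphs to the same Unicode value, so the extraction call itself is probably not at fault.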

0 Answers