iTextSharp for Arabic support

Asked Jul 30 '16 at 15:13

I need to read the content of a PDF file using iTextSharp (.NET), and I have three problems:

1- Arabic words are extracted in reverse order (e.g. احمد is extracted as دمحا; the English equivalent would be "Ahmad" extracted as "damha"). If my file contains both Arabic and English, how can I extract each language in its correct direction?

2- Sometimes the glyphs are not defined as characters, so they appear as symbols. How can I add my own definitions for those glyphs?

3- Can I extract the text together with its formatting, so that I can convert it to HTML and display the file in a web page as is?

أحمد صوالحة

If you can detect Arabic terms by iterating over the text you extract, you could use a string reversal algorithm on those sections. – Robert Columbia Jul 30 '16 at 15:16
Can you show a code sample so that we can understand the problem more clearly? – Ashraf Sada Jul 30 '16 at 15:25
The code is iTextSharp code, and it is very long. – أحمد صوالحة Jul 30 '16 at 17:39
Possible duplicate of [Reading pdf content using iTextSharp in C#](http://stackoverflow.com/questions/10185643/reading-pdf-content-using-itextsharp-in-c-sharp) – Cee McSharpface Nov 14 '16 at 19:33
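
The string reversal suggested in the first comment could look roughly like the sketch below. It is written in Python only to keep it short, and it does not use any iTextSharp API: it assumes the text has already been extracted and that the Arabic runs came out in visual (reversed) order, as in the دمحا example. The function name `fix_visual_order` and the choice of Unicode ranges are illustrative assumptions, not part of any library.

```python
import re

# Assumed character ranges: Arabic, Arabic Supplement, and the two
# Arabic Presentation Forms blocks (often seen in text extracted from PDFs).
_ARABIC = "\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFC"

# A run is one or more Arabic words separated only by whitespace.
_ARABIC_RUN = re.compile(rf"[{_ARABIC}]+(?:\s+[{_ARABIC}]+)*")


def fix_visual_order(text: str) -> str:
    """Reverse every Arabic run in `text`, leaving other characters untouched.

    Reversing the whole run (spaces included) restores both the letter
    order inside each word and the order of the words in the run.
    """
    return _ARABIC_RUN.sub(lambda m: m.group(0)[::-1], text)
```

Calling `fix_visual_order` on a line that mixes English and reversed Arabic changes only the Arabic runs, so the English text keeps its direction. This is a simplification of the Unicode bidirectional algorithm: Arabic-Indic digits and punctuation inside Arabic runs would need extra handling, and the same logic would have to be ported to C# for use alongside iTextSharp.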
