I want to extract Bangla text from a PDF file using the iTextSharp NuGet package in C#. The text in the PDF looks like this: মোঃ শুকুকুর আলী , মোঃ জালাল মিয়া. I want to read it exactly as it appears. But when I read it in C# with iTextSharp, it returns: �মাঃ জা লাল িম য়া, �মাঃ �ক ু র আলী. How can I solve this problem? My PDF file and code are attached below.
My controller code:
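// Required at the top of the file:
// using iTextSharp.text.pdf;
// using iTextSharp.text.pdf.parser;
using (PdfReader reader = new PdfReader(path))
{
    // Only the first page is read here.
    for (int pageNo = 1; pageNo <= 1; pageNo++)
    {
        ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
        // text holds the extracted (and currently broken) Bangla string.
        string text = PdfTextExtractor.GetTextFromPage(reader, pageNo, strategy);
    }
    reader.Close();
}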
The extracted text in the text variable comes out broken, as shown above.
My PDF file: https://drive.google.com/drive/folders/1L18hGoBaSQl8xCUIXVpWUbOnhWtsSPRi?usp=share_link
PDF font details: