I have a PDF in which some pages have one column and other pages have two or three columns.
How do I read the text of EVERY page correctly?
The code below does not extract the text properly:
PdfReader pdfreader = new PdfReader(nmfile);
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
for (int page = 1; page <= pdfreader.NumberOfPages; page++)
{
    extractText = PdfTextExtractor.GetTextFromPage(pdfreader, page, strategy);
    extractText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(extractText)));
    //...
}