
I am working on a project where I need to read multiple PDF files from a folder and show their content when a button is clicked. I am having trouble reading more than one file at a time. How can I read multiple PDF files? Can anyone help me?

protected void btnShowContent_Click(object sender, EventArgs e)
{
    //if (fileUpload.HasFile)
    //{

    foreach (string file in Directory.GetFiles(@"E:\\Rida\","*.pdf"))
    {
        string str = "";
        str = str + ", " + file.ToString();
        PdfReader reader = new PdfReader(file);
        string strPDFFile = file.ToString().Trim();
        StringBuilder strPdfContent = new StringBuilder();
        string pdfText = strPdfContent.ToString();
        string contents = File.ReadAllText(strPDFFile);

        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            ITextExtractionStrategy objExtractStrategy = new SimpleTextExtractionStrategy();
            string strLineText = PdfTextExtractor.GetTextFromPage(reader, i, objExtractStrategy);
            strLineText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(strLineText)));
            strPdfContent.Append(strLineText);
            strPdfContent.Append(contents);

            strPdfContent.Append("<br/>");
        }
        reader.Close();
        lblPdfContent.Text = strPdfContent.ToString();
    }
}

This line converts my PDF file's content into special characters. What should I do to avoid this conversion?

string contents = File.ReadAllText(strPDFFile);
  • Why do you need the line ``string contents = File.ReadAllText();`` at all? The ``strPdfContent`` already has the content?!? – Rand Random Jul 07 '17 at 06:28
  • What do you expect `File.ReadAllText` to do? It won't give you the text contents of the PDF file. PDF is a special format that needs to be interpreted. – Romano Zumbé Jul 07 '17 at 06:29
  • Almost seems like those two lines got entered without any reason, they are just wrong and do nothing... ``string pdfText = strPdfContent.ToString(); string contents = File.ReadAllText(strPDFFile);`` – Rand Random Jul 07 '17 at 06:30
  • Without the line string contents = File.ReadAllText(strPDFFile); it reads only the first file from the folder. – Rida Fatima Jul 07 '17 at 06:35
  • Do you want to display the content of multiple PDFs on the page at once? Could you add a new label for each PDF? – Andrew Jul 07 '17 at 07:01
  • Yes, I want the same. No, I am not going to use a new label for it. @user1653400 – Rida Fatima Jul 07 '17 at 07:16

1 Answer


As far as I know, for PDF parsing and manipulation in a .NET environment you can use iTextSharp. It is a PDF library that allows you to CREATE, ADAPT, INSPECT and MAINTAIN documents in PDF. Using a library like this can help you solve your problem!

https://sourceforge.net/projects/itextsharp/

http://jadn.co.uk/w/ReadPdfUsingCsharp.htm

Saad
  • I am using this library, but the issue I am facing is that when I run my code it reads only the first file from the folder and converts the next file into special characters. – Rida Fatima Jul 07 '17 at 07:02
  • Reading a single file with iTextSharp has been done; you will have to extend it to read multiple files at once. Go through this post, it may be helpful, since PDF is a special format and is a bit difficult to work with! https://stackoverflow.com/questions/83152/reading-pdf-documents-in-net – Saad Jul 07 '17 at 07:19
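
Following up on the comments, here is a minimal sketch of how the loop might be restructured so that every PDF in the folder ends up in the label. It assumes iTextSharp 5.x, the same library the question already uses. The class name PdfFolderReader and the method ReadAllPdfText are hypothetical names introduced only for illustration. Three things differ from the question's code: the StringBuilder is created once, outside the file loop, so text from earlier files is not thrown away; File.ReadAllText is dropped, because it returns the raw bytes of the PDF decoded as text rather than the readable content; and the text is HTML-encoded before display, which is an addition of this sketch and not part of the original code.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class; the name is not from the original post.
public static class PdfFolderReader
{
    // Returns the extracted text of every *.pdf in folderPath as one
    // HTML string, with a bold file name heading before each file.
    public static string ReadAllPdfText(string folderPath)
    {
        // Created once, outside the file loop, so text from all files accumulates.
        var content = new StringBuilder();

        foreach (string file in Directory.GetFiles(folderPath, "*.pdf"))
        {
            var reader = new PdfReader(file);
            try
            {
                // System.IO.Path is fully qualified because newer iTextSharp 5.x
                // releases also define a Path class in the parser namespace.
                string name = System.IO.Path.GetFileName(file);
                content.Append("<b>").Append(WebUtility.HtmlEncode(name)).Append("</b><br/>");

                for (int page = 1; page <= reader.NumberOfPages; page++)
                {
                    // A fresh strategy per page, as in the question's code.
                    ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                    string pageText = PdfTextExtractor.GetTextFromPage(reader, page, strategy);

                    // Encode so that characters such as '<' in the PDF text
                    // are not interpreted as markup by the page.
                    content.Append(WebUtility.HtmlEncode(pageText)).Append("<br/>");
                }
            }
            finally
            {
                // Release the file handle even if extraction throws.
                reader.Close();
            }
        }

        return content.ToString();
    }
}
```

In the button handler this could then be called once, after which the label is assigned a single time: lblPdfContent.Text = PdfFolderReader.ReadAllPdfText(@"E:\Rida"); Assigning the label inside the loop, as in the question, overwrites it on every iteration and leaves only the last file visible.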