I am working on PDF functionality: I want to search for text in a PDF and highlight the matches in that PDF. I am using iTextSharp for this.

I have not found a solution yet and would appreciate some help. This is the code I have written so far:

    public ActionResult Index1()
    {
        string outputFile = @"D:\Test.pdf";

        using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            using (Document doc = new Document(PageSize.LETTER))
            {
                using (PdfWriter w = PdfWriter.GetInstance(doc, fs))
                {
                    doc.Open();
                    doc.Add(new Paragraph("This is a test and sample pdf for test and wait for it"));  
                    doc.Close();
                }
            }
        }

        List<int> pages = new List<int>();
        PdfReader pdfReader = new PdfReader(outputFile);
        for (int page = 1; page <= pdfReader.NumberOfPages; page++)
        {
            ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();

            string currentPageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
            if (currentPageText.Contains("test"))
            {
                pages.Add(page);
            }
        }
        pdfReader.Close();

        //Create a new file from our test file with highlighting
        string highLightFile = @"D:\Test1.pdf"; 

        //Bind a reader and stamper to our test PDF
        PdfReader reader = new PdfReader(outputFile);

        using (FileStream fs = new FileStream(highLightFile, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            using (PdfStamper stamper = new PdfStamper(reader, fs))
            {

                iTextSharp.text.Rectangle rect = new iTextSharp.text.Rectangle(60.6755f, 749.172f, 94.0195f, 735.3f); 
                float[] quad = { rect.Left, rect.Bottom, rect.Right, rect.Bottom, rect.Left, rect.Top, rect.Right, rect.Top };

                PdfAnnotation highlight = PdfAnnotation.CreateMarkup(stamper.Writer, rect, null, PdfAnnotation.MARKUP_HIGHLIGHT, quad);

                //Set the color
                highlight.Color = BaseColor.YELLOW;

                //Add the annotation
                stamper.AddAnnotation(highlight, 1);
            }
        }
        return View();
    }

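The code above creates one PDF (Test.pdf) and then writes a second PDF (Test1.pdf) in which some text is highlighted at hard-coded coordinates. The search loop only tells me which pages contain the text, not where on the page it is.

What I need is a way to find the coordinates of the text I am searching for, and so far I have not been able to do that.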

Joris Schellekens
kushal

1 Answer

We have implemented this functionality in iText 7. It can be found in the class RegexBasedLocationExtractionStrategy.

I suggest you look at how it is done there, since this functionality was not backported to iText 5.
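For illustration, here is a minimal sketch of how the class might be used to find matches and highlight them. It assumes iText 7.1 or later for .NET; the class name `HighlightSketch`, the helper `ToQuadPoints`, and the parameters `src`, `dest` and `regex` are hypothetical names chosen for this example. The quad point order mirrors the one used in the question. Treat it as a starting point rather than a tested solution; rotated pages, for instance, may need extra handling.

```csharp
using iText.Kernel.Geom;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Annot;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

public static class HighlightSketch
{
    // Hypothetical helper: builds the quad point array in the same order
    // as the question (bottom-left, bottom-right, top-left, top-right).
    public static float[] ToQuadPoints(float left, float bottom, float right, float top)
    {
        return new[] { left, bottom, right, bottom, left, top, right, top };
    }

    // Copies src to dest and adds a yellow highlight annotation
    // over every match of the regular expression on every page.
    public static void HighlightAll(string src, string dest, string regex)
    {
        PdfDocument pdf = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
        try
        {
            for (int i = 1; i <= pdf.GetNumberOfPages(); i++)
            {
                PdfPage page = pdf.GetPage(i);

                // Use a fresh strategy per page so that the collected
                // locations belong to this page only.
                var strategy = new RegexBasedLocationExtractionStrategy(regex);
                new PdfCanvasProcessor(strategy).ProcessPageContent(page);

                foreach (IPdfTextLocation location in strategy.GetResultantLocations())
                {
                    Rectangle r = location.GetRectangle();
                    float[] quad = ToQuadPoints(r.GetLeft(), r.GetBottom(), r.GetRight(), r.GetTop());

                    PdfAnnotation highlight = PdfTextMarkupAnnotation.CreateHighLight(r, quad);
                    highlight.SetColor(new float[] { 1f, 1f, 0f }); // RGB yellow
                    page.AddAnnotation(highlight);
                }
            }
        }
        finally
        {
            // Closing the document writes the output file.
            pdf.Close();
        }
    }
}
```

With the files from the question, a call could look like `HighlightSketch.HighlightAll(@"D:\Test.pdf", @"D:\Test1.pdf", "test");`.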

Joris Schellekens
  • Hi Joris, could you please send me the code you implemented for this, for my reference? – kushal Sep 21 '17 at 14:30
  • You can download iText7 and inspect the source code yourself. Our code is open source after all. – Joris Schellekens Sep 21 '17 at 14:30
  • The [`SearchTextLocationExtractionStrategy`](https://github.com/mkl-public/testarea-itext5/blob/master/src/main/java/mkl/testarea/itext5/extract/SearchTextLocationExtractionStrategy.java) in [this answer](https://stackoverflow.com/a/45919560/1729265) can be used with iText 5. – mkl Sep 21 '17 at 14:31