Highlighted words are not displaying in browser using itextsharp.

Adobe

[screenshot]

Browser

[screenshot]

CODE

List<iTextSharp.text.Rectangle> MatchesFound = strategy.GetTextLocations(splitText[i].Trim(), StringComparison.CurrentCultureIgnoreCase);
foreach (Rectangle rect in MatchesFound)
{
    float[] quad = { rect.Left - 3.0f, rect.Bottom, rect.Right, rect.Bottom, rect.Left - 3.0f, rect.Top + 1.0f, rect.Right, rect.Top + 1.0f };
    //Create our highlight
    PdfAnnotation highlight = PdfAnnotation.CreateMarkup(stamper.Writer, rect, null, PdfAnnotation.MARKUP_HIGHLIGHT, quad);
    //Set the color
    highlight.Color = BaseColor.YELLOW;

    //Add the annotation
    stamper.AddAnnotation(highlight, pageno);
}

Kindly help me to solve this issue.

Updated Code

  private void highlightPDF()
{
    //Create a simple test file
    string outputFile = Server.MapPath("~/pdf/16193037V_Dhana-FI_NK-QA_Completed.pdf");
    string filename = "HL" + Convert.ToString(Session["Filename"]) + ".pdf";
    Session["Filename"] = "HL" + Convert.ToString(Session["Filename"]);
    //Create a new file from our test file with highlighting
    string highLightFile = Server.MapPath("~/pdf/" + filename);

    //Bind a reader and stamper to our test PDF

    PdfReader reader = new PdfReader(outputFile);
    iTextSharp.text.pdf.PdfContentByte canvas;
    int pageno = Convert.ToInt16(txtPageno.Text);
    using (FileStream fs = new FileStream(highLightFile, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        using (PdfStamper stamper = new PdfStamper(reader, fs))
        {
            canvas = stamper.GetUnderContent(pageno);
            myLocationTextExtractionStrategy strategy = new myLocationTextExtractionStrategy();
            strategy.UndercontentCharacterSpacing = canvas.CharacterSpacing;
            strategy.UndercontentHorizontalScaling = canvas.HorizontalScaling;

            string currentText = PdfTextExtractor.GetTextFromPage(reader, pageno, strategy);
            string text = txtHighlight.Text.Replace("\r\n", "").Replace("\\n", "\n").Replace("  ", " ");
            string[] splitText = text.Split(new string[] { "\n" }, StringSplitOptions.RemoveEmptyEntries);
            for (int i = 0; i < splitText.Length; i++)
            {
                List<iTextSharp.text.Rectangle> MatchesFound = strategy.GetTextLocations(splitText[i].Trim(), StringComparison.CurrentCultureIgnoreCase);
                foreach (Rectangle rect in MatchesFound)
                {
                    canvas.SaveState();
                    canvas.SetColorFill(BaseColor.YELLOW);
                    canvas.Rectangle(rect);
                    canvas.Fill();
                    canvas.RestoreState();                      
                }
            }

        }
    }
    reader.Close();
}

It's not highlighting the text. I passed the text and page no to highlight the text.

Karthik
  • That's not an iText problem. That's a problem of the PDF viewer you are using in the browser and you're not telling which PDF viewer that is. It could be Chrome's PDF viewer; in that case, make it a Chrome PDF viewer question. It could be pdf.js in Firefox; in that case, make it a pdf.js question. Don't blame iTextSharp for the flaws of a PDF viewer. – Bruno Lowagie Nov 27 '15 at 08:15
  • I tested in pdf.js and chrome browser also – Karthik Nov 27 '15 at 08:19
  • So what you've established is that both Chrome PDF viewer and pdf.js completely ignore Markup annotations. Have you asked the developers of pdf.js and Chrome if that diagnosis is correct and have you asked them when they plan to fix that problem? – Bruno Lowagie Nov 27 '15 at 08:25
  • I referred to this article: http://stackoverflow.com/questions/29032422/highlight-keywords-in-a-pdf-using-itextsharp-and-render-it-to-the-browser. It says: "you partially have your answer already and it is just that those PDF renderers don't fully support the entire PDF syntax. Specifically, (and this is just an educated guess) it appears that those renders require an Appearance entry to exist for those annotations." – Karthik Nov 27 '15 at 08:27
  • OK, so you have your answer. Now it's a matter of waiting until the Chrome and pdf.js developers meet your requirement by implementing ISO-32000-1 correctly. – Bruno Lowagie Nov 27 '15 at 08:29
  • Do you have any alternative solution for that? Please, it's very urgent. – Karthik Nov 27 '15 at 08:30
  • Client have given a pdf. In that, they highlighted text, the highlighted text is displayed in browser – Karthik Nov 27 '15 at 08:32
  • Maybe that PDF doesn't use Markup annotations. Maybe the highlighted text is created by adding an extra layer (a colored rectangle) under the existing text. That's different from using a Markup annotation. – Bruno Lowagie Nov 27 '15 at 08:33
  • Sorry, Any solution for this problem? – Karthik Nov 27 '15 at 08:35
  • The solution was implied in my previous comment: add a yellow rectangle under the content you want to highlight. – Bruno Lowagie Nov 27 '15 at 08:40
  • how to draw a rectangle under the content? do you have any code? – Karthik Nov 27 '15 at 08:43

2 Answers

7

First of all...

Why does the OP's (updated) code not work

There actually are two factors.

First, there is an issue in the OP's code: to add a rectangle to the path, he uses

canvas.Rectangle(rect);

Unfortunately this does not do what he expects: The Rectangle class has multiple properties beyond the mere coordinates of a rectangle, foremost information about selected borders, border colors, and an interior color, and PdfContentByte.Rectangle(Rectangle) draws a rectangle according to those properties.

In the case at hand, though, rect is used only to transport the coordinates of a rectangle, so those additional properties all are false or null. Thus, canvas.Rectangle(rect) does nothing!

Instead the OP should use

canvas.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height);

here.

Furthermore, @Bruno mentioned in his answer

Note that you won't see the yellow rectangle if you add it under an opaque shape (e.g. under an image).

Unfortunately exactly this is the case here: The document actually is a scanned document, each page being a page-filling image under which the equivalent text is drawn (probably after OCR'ing) to allow textual copy&paste.

Thus, whatever the OP's code may draw on the UnderContent, it will be hidden by that very image.

Thus, let's try something different...

How to make it work

@Bruno in his answer also indicated a solution for such a case:

In that case, you could add a transparent rectangle on top of the existing content.

Following this advice we replace

canvas = stamper.GetUnderContent(pageno);

by

canvas = stamper.GetOverContent(pageno);

PdfGState state = new PdfGState();
state.FillOpacity = .3f;
canvas.SetGState(state);

Selecting the word "support" on the third document page we get:

[screenshot: using an opacity of .3]

The yellow is quite pale here.

Using an Opacity value of .6 instead we get

[screenshot: using an opacity of .6]

Now the yellow is more intense but the text starts to pale out.

For tasks like this I actually prefer using the blend mode Darken. This can be done by using

state.BlendMode = new PdfName("Darken");

instead of state.FillOpacity = .3f. This results in

[screenshot: using the blend mode Darken]

This IMO looks better.

How the client did it

The OP commented

Client have given a pdf. In that, they highlighted text, the highlighted text is displayed in browser

The client's PDF actually uses annotations, just like the OP's original code. In contrast, though, each of the client's annotations contains an appearance stream, which the highlight annotations generated by iText lack.

Supplying an appearance is optional and PDF viewers indeed should generate an appearance if none is given. Obviously, though, there are numerous PDF viewers which rely on appearances the PDF brings along.

By the way, the appearances in the client's PDF actually use the blend mode Multiply. For underlying white and black colors, Darken and Multiply have the same result.
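
To make the Darken/Multiply remark concrete, here is a minimal sketch of the two blend functions as I read them in ISO 32000-1, section 11.3.5. It does not use iTextSharp at all; the class name BlendModes and its method names are made up for illustration, and each value is a single colour component in the range 0 to 1.

```csharp
using System;

// Hypothetical helper, not part of iTextSharp: per-component blend functions.
public static class BlendModes
{
    // Darken keeps the darker of backdrop and source.
    public static double Darken(double backdrop, double source)
    {
        return Math.Min(backdrop, source);
    }

    // Multiply multiplies backdrop and source.
    public static double Multiply(double backdrop, double source)
    {
        return backdrop * source;
    }
}
```

Over a white backdrop (1.0) both functions return the source value, and over a black backdrop (0.0) both return 0.0, which is why the two modes look the same on black text over a white scan. For in-between backdrops they can differ: with backdrop 0.5 and source 0.5, Darken gives 0.5 while Multiply gives 0.25.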

Making it work with annotations

In a comment the OP wondered

Please one more doubt, if the user wrongly highlighted then how to remove yellow color(or change yellow to white)? i changed yellow to white but it's not working. canvas.SetColorFill(BaseColor.WHITE);

Undoing a change to the page content generally is more difficult than undoing the addition of an annotation. Thus, let's make the OP's original code also work, i.e. adding an appearance stream to the highlight annotations.

As the OP reported in another comment, his first attempt to add an appearance stream failed:

PdfAppearance appearance = PdfAppearance.CreateAppearance(stamper.Writer, rect.Width, rect.Height);
appearance.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height);
appearance.SetColorFill(BaseColor.WHITE);
appearance.Fill();
highlight.SetAppearance( PdfAnnotation.APPEARANCE_NORMAL, appearance );
stamper.AddAnnotation(highlight, pageno);

but it's not working.

The problems in his attempt are:

  • The origin of the appearance template is in the lower left corner of the annotation area, not of the page. To color the area in question, therefore, the rectangle must have its lower left at (0, 0).
  • Strictly speaking the color must be set before starting the path building.
  • A different color than white should be used for highlighting.
  • Transparency or an appropriate blend mode should be used to allow the original, marked text to shine through.

Thus, the following code shows how to do it.

private void highlightPDFAnnotation(string outputFile, string highLightFile, int pageno, string[] splitText)
{
    PdfReader reader = new PdfReader(outputFile);
    iTextSharp.text.pdf.PdfContentByte canvas;
    using (FileStream fs = new FileStream(highLightFile, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        using (PdfStamper stamper = new PdfStamper(reader, fs))
        {
            myLocationTextExtractionStrategy strategy = new myLocationTextExtractionStrategy();
            strategy.UndercontentHorizontalScaling = 100;

            string currentText = PdfTextExtractor.GetTextFromPage(reader, pageno, strategy);
            for (int i = 0; i < splitText.Length; i++)
            {
                List<iTextSharp.text.Rectangle> MatchesFound = strategy.GetTextLocations(splitText[i].Trim(), StringComparison.CurrentCultureIgnoreCase);
                foreach (Rectangle rect in MatchesFound)
                {
                    float[] quad = { rect.Left - 3.0f, rect.Bottom, rect.Right, rect.Bottom, rect.Left - 3.0f, rect.Top + 1.0f, rect.Right, rect.Top + 1.0f };
                    //Create our highlight
                    PdfAnnotation highlight = PdfAnnotation.CreateMarkup(stamper.Writer, rect, null, PdfAnnotation.MARKUP_HIGHLIGHT, quad);
                    //Set the color
                    highlight.Color = BaseColor.YELLOW;

                    PdfAppearance appearance = PdfAppearance.CreateAppearance(stamper.Writer, rect.Width, rect.Height);
                    PdfGState state = new PdfGState();
                    state.BlendMode = new PdfName("Multiply");
                    appearance.SetGState(state);
                    appearance.Rectangle(0, 0, rect.Width, rect.Height);
                    appearance.SetColorFill(BaseColor.YELLOW);
                    appearance.Fill();

                    highlight.SetAppearance(PdfAnnotation.APPEARANCE_NORMAL, appearance);

                    //Add the annotation
                    stamper.AddAnnotation(highlight, pageno);
                }
            }
        }
    }
    reader.Close();
}

These annotations are displayed by Chrome, too, and as annotations they can easily be removed.
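
Removing such annotations again could look roughly like the following sketch. It assumes iTextSharp 5.x; the class name HighlightRemover and its method names are made up for illustration. Note that it drops every annotation of subtype Highlight on the page, including any that were in the file before, so a real solution may need a stricter filter.

```csharp
using System.IO;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class HighlightRemover
{
    // Removes all /Highlight annotations from one page of an opened reader
    // and returns how many were removed.
    public static int RemoveHighlights(PdfReader reader, int pageno)
    {
        PdfDictionary page = reader.GetPageN(pageno);
        PdfArray annots = page.GetAsArray(PdfName.ANNOTS);
        if (annots == null)
            return 0;

        int removed = 0;
        // Walk backwards so that removing an entry does not shift unvisited indices.
        for (int i = annots.Size - 1; i >= 0; i--)
        {
            PdfDictionary annot = annots.GetAsDict(i);
            if (annot != null && PdfName.HIGHLIGHT.Equals(annot.GetAsName(PdfName.SUBTYPE)))
            {
                annots.Remove(i);
                removed++;
            }
        }
        return removed;
    }

    // Reads inputFile, removes the highlights on pageno, and writes outputFile.
    public static void RemoveHighlights(string inputFile, string outputFile, int pageno)
    {
        PdfReader reader = new PdfReader(inputFile);
        RemoveHighlights(reader, pageno);
        using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
        using (PdfStamper stamper = new PdfStamper(reader, fs))
        {
            // Nothing else to do: disposing the stamper writes the modified page.
        }
        reader.Close();
    }
}
```

Because the page content itself is never touched here, the underlying text and scan stay exactly as they were.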

mkl
  • Great answer, thanks for filling in while I was at a meeting in Brussels, @mkl. – Bruno Lowagie Nov 27 '15 at 18:24
  • Thank you so much for your great help. It's working fine. Once again thank you for your time and effort. – Karthik Nov 28 '15 at 04:32
  • Please one more doubt, if the user wrongly highlighted then how to remove yellow color(or change yellow to white)? i changed yellow to white but it's not working. canvas.SetColorFill(BaseColor.WHITE); – Karthik Nov 28 '15 at 08:00
  • Removing definitely is easier when using annotations. Probably you should apply the marker drawing code to a template and attach that template to an annotation. – mkl Nov 28 '15 at 10:09
  • If you don't mind, please provide the code. I used your above code to draw a rectangle. – Karthik Nov 28 '15 at 10:26
  • float[] quad = { rect.Left , rect.Bottom, rect.Right, rect.Bottom, rect.Left , rect.Top , rect.Right, rect.Top }; PdfAnnotation highlight = PdfAnnotation.CreateMarkup(stamper.Writer, rect, null, PdfAnnotation.MARKUP_HIGHLIGHT, quad); highlight.Color = BaseColor.WHITE; stamper.AddAnnotation(highlight, pageno); did you say like this? it's not working. – Karthik Nov 28 '15 at 10:48
  • That plus creating an appearance stream in a template and adding it as the normal appearance of the annotation. – mkl Nov 28 '15 at 10:59
  • Please provide me the code. Kindly help me. I'm also trying but i did not get. – Karthik Nov 28 '15 at 11:01
  • ( I'm not at an IDE for the weekend, so no code before Monday.) – mkl Nov 28 '15 at 11:02
  • OK i'm waiting, please provide me the code on monday. Sorry for the disturbance. I don't know much in this area. – Karthik Nov 28 '15 at 11:03
  • PdfAppearance appearance = PdfAppearance.CreateAppearance(stamper.Writer, rect.Width, rect.Height); appearance.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height); appearance.SetColorFill(BaseColor.WHITE); appearance.Fill(); highlight.SetAppearance( PdfAnnotation.APPEARANCE_NORMAL, appearance ); stamper.AddAnnotation(highlight, pageno); but it's not working. – Karthik Nov 28 '15 at 11:44
  • Cf. my edit; it shows how to add highlight annotations which are displayed by Chrome. *if the user wrongly highlighted then* simply remove these annotations from the page's annotations. – mkl Nov 30 '15 at 10:01
  • I just added a missing initialization which could cause errors, cf. [this answer](http://stackoverflow.com/a/35177980/1729265). – mkl Feb 03 '16 at 13:11
3

You are using a Markup annotation to highlight text. That's great! There's nothing wrong with your code, nor with iText. However: not all PDF viewers support that functionality.

If you want to see highlighted text in every PDF viewer, a (sub-optimal) workaround could be to add a yellow rectangle to the content stream under the existing content (assuming that the existing content isn't opaque).

This is demonstrated in the HighLightByAddingContent example:

public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    PdfContentByte canvas = stamper.getUnderContent(1);
    canvas.saveState();
    canvas.setColorFill(BaseColor.YELLOW);
    canvas.rectangle(36, 786, 66, 16);
    canvas.fill();
    canvas.restoreState();
    stamper.close();
    reader.close();
}

In this example, we take a file named hello.pdf and we add a yellow rectangle, with the file hello_highlighted.pdf as result.

Note that you won't see the yellow rectangle if you add it under an opaque shape (e.g. under an image). In that case, you could add a transparent rectangle on top of the existing content.

Update: my example was written in Java. It shouldn't be a problem for a developer to port this to C#. It's only a matter of changing some lower-cases into upper-cases. E.g. stamper.GetUnderContent(1) instead of stamper.getUnderContent(1), canvas.SaveState() instead of canvas.saveState(), and so on.
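
For readers who want the ported version spelled out, a C# rendering of the Java example might look like the sketch below. It assumes iTextSharp 5.x and keeps the same hard-coded rectangle; the class and method names simply mirror the Java ones, and the using blocks replace the explicit stamper.close() call.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class HighLightByAddingContent
{
    // Adds a yellow rectangle under the existing content of page 1.
    public static void ManipulatePdf(string src, string dest)
    {
        PdfReader reader = new PdfReader(src);
        using (FileStream fs = new FileStream(dest, FileMode.Create, FileAccess.Write, FileShare.None))
        using (PdfStamper stamper = new PdfStamper(reader, fs))
        {
            PdfContentByte canvas = stamper.GetUnderContent(1);
            canvas.SaveState();
            canvas.SetColorFill(BaseColor.YELLOW);
            // x, y of the lower left corner, then width and height
            canvas.Rectangle(36, 786, 66, 16);
            canvas.Fill();
            canvas.RestoreState();
        }
        reader.Close();
    }
}
```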

Bruno Lowagie
  • Thank you for your help. I updated my code in the question. It's not highlighting. What is the mistake? Please help me. – Karthik Nov 27 '15 at 11:23
  • I took the myLocationTextExtractionStrategy class from answer 2 of http://stackoverflow.com/questions/6523243/how-to-highlight-a-text-or-word-in-a-pdf-file-using-itextsharp – Karthik Nov 27 '15 at 11:35
  • If you shared your sample PDF, your problem probably could be reproduced. Thus, please share it. Furthermore, *Client have given a pdf. In that, they highlighted text, the highlighted text is displayed in browser* - if you also shared their PDF, we could tell what technique they use and whether that technique is feasible for iText, too. – mkl Nov 27 '15 at 12:15
  • @Karthik *is it possible?* - Yes, see my answer. – mkl Nov 27 '15 at 16:20
  • Bruno Lowagie, thank you for your great help and support. – Karthik Nov 28 '15 at 04:55
  • Please one more help, if the user wrongly highlighted then how to remove yellow color(or change yellow to white)? – Karthik Nov 28 '15 at 08:22
  • Please don't use the comment section to post additional questions. You are asking a *different* question. Also: it is a question that is difficult to answer. It is very easy to remove a Markup annotation, but it is very hard to remove content from a content stream. Only if you control how the content is added (and it seems you do), then you can also remove it. However: you seem to assume that I don't have a job and that I have all the time of the world to answer your questions *for free*. That assumption is wrong. – Bruno Lowagie Nov 30 '15 at 09:20