Is there a way to remove some text from the header and footer of a PDF using iText 7 in C#?

I found this code snippet on the iText site, but apparently a license is needed:

public void manipulatePdf(String dest) throws IOException {
    //Load the license file to use cleanup features
    LicenseKey.loadLicenseFile(System.getenv("ITEXT7_LICENSEKEY") + "/itextkey-multiple-products.xml");
    PdfDocument pdfDoc = new PdfDocument(new PdfReader(SRC), new PdfWriter(dest));
    List<PdfCleanUpLocation> cleanUpLocations = new ArrayList<PdfCleanUpLocation>();
    cleanUpLocations.add(new PdfCleanUpLocation(1, new Rectangle(97, 405, 383, 40), Color.GRAY));
    PdfCleanUpTool cleaner = new PdfCleanUpTool(pdfDoc, cleanUpLocations);
    cleaner.cleanUp();
    pdfDoc.close();
}

Link: https://kb.itextpdf.com/home/it7kb/faq/how-to-remove-text-from-a-pdf

Does anyone have a code sample for removing text from the header and footer?

PS: if someone knows how to edit and save the same file using iText will be great help

Edit 1: I am adding the text to the PDF myself, using the example on this page: https://kb.itextpdf.com/home/it7kb/faq/how-to-add-text-as-a-header-or-footer.

The question now is: how do I remove the text I added to the PDF after it has been saved to a file? I want to open the PDF again and remove that text.

Edit 2: This post does not answer my question. The code inside its manipulatePdf method is exactly the same as the code I pasted above. Classes like PdfCleanUpTool are not found in the community edition.

Burre Ifort
  • There is a paid and a community edition of iText. It may well be that the knowledge base mainly focuses on the paid version (which would explain why it includes the license check). That does not mean the community edition does not support the feature. You just need to find out if it does. – Fildor Jul 14 '20 at 11:22
  • I couldn't find more info about the community edition. – Burre Ifort Jul 14 '20 at 11:23
  • [This example in java](https://github.com/itext/i7js-examples/blob/develop/src/main/java/com/itextpdf/samples/sandbox/parse/RemoveContentInRectangle.java) doesn't have the license check ... so, I'd simply try it out and see if it works. – Fildor Jul 14 '20 at 11:39
  • Thanks, Fildor. I actually saw it before and was trying to find the C# version. I will give it a try. – Burre Ifort Jul 14 '20 at 11:43
  • Take a look here: [iTextSharp 5.5.13.1](https://www.nuget.org/packages/iTextSharp/) – Maciej Los Jul 14 '20 at 12:07
  • Does this answer your question? [How to remove headers and footers from PDF file using iText in Java](https://stackoverflow.com/questions/27884283/how-to-remove-headers-and-footers-from-pdf-file-using-itext-in-java) – Maciej Los Jul 14 '20 at 12:09
  • The Knowledge Base is written with the paying customers in mind, hence the check for the license key. But the code examples work exactly the same for the Community Edition: you just leave out the license key, and all of your PDF files will have "AGPL version" in the Producer line. Other than that, there are no functional differences. – Amedee Van Gasse Jul 14 '20 at 13:51
  • In other words, just drop that one line about the license key. The code will still work, in AGPL mode. – Amedee Van Gasse Jul 14 '20 at 13:52
  • _"PS: if someone knows how to edit and save the same file using iText will be great help"_ -> this is a NEW question, please ask in a new question. That being said: you have to write to a temporary document first (which does not have to be a file, it can also be in memory); then you close the original file; and finally you write your resulting document to the file (either rename the temporary file or write the document in memory to file). See https://stackoverflow.com/a/47435793/766786 – Amedee Van Gasse Jul 14 '20 at 13:56
  • Or this: https://stackoverflow.com/questions/8141448/modifying-an-existing-pdf-without-creating-an-new-pdf-file – Amedee Van Gasse Jul 14 '20 at 13:58
  • @MaciejLos The question is about iText 7. iText 5 for .NET (formerly known as iTextSharp) does not have the PDF redaction features of pdfSweep. – Amedee Van Gasse Jul 14 '20 at 14:00
  • @MaciejLos your linked answer does not answer the question. – Amedee Van Gasse Jul 14 '20 at 14:01
  • @Maciej Los I can't downgrade to 5.5.... as another guy has used iText to convert from HTML To PDF. And just installing it does not answer my question. – Burre Ifort Jul 14 '20 at 14:29

4 Answers

1

In my case I used this:

PdfDocument pdf = new PdfDocument(new PdfReader(SRC), new PdfWriter(dest));
ICleanupStrategy cleanupStrategy = new RegexBasedCleanupStrategy(new Regex(textToRemove)).SetRedactionColor(ColorConstants.PINK);
PdfAutoSweep autoSweep = new PdfAutoSweep(cleanupStrategy);
autoSweep.CleanUp(pdf);
pdf.Close();

The result is that the matching text is removed and a filled rectangle is drawn in its place. You can choose the fill color; I'm not sure whether it can be set to transparent.

You need to install this package: https://www.nuget.org/packages/itext7.pdfsweep/. The code above removes the text from the whole document, but in your case you can be more specific and remove it from the header and footer only.

PS: taken from here: https://itextpdf.com/en/products/itext-7/pdf-redaction-pdfsweep
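
One thing to watch with the regex-based approach: because the text to remove is passed in as a regular expression, header or footer text containing characters such as `(` or `.` is treated as a pattern, not as literal text. A minimal sketch of escaping it first, written in plain Java with no iText dependency (the footer string is a made-up example; in C# the counterpart is `Regex.Escape`):

```java
import java.util.regex.Pattern;

public class LiteralPattern {
    // Build a pattern that matches the given text literally,
    // so characters like '(' or '.' lose their regex meaning.
    public static Pattern literal(String textToRemove) {
        return Pattern.compile(Pattern.quote(textToRemove));
    }

    public static void main(String[] args) {
        String footer = "Page 1 (draft v1.0)"; // hypothetical footer text

        // Unescaped: the parentheses form a group, so the literal "(" is never matched.
        Pattern unescaped = Pattern.compile(footer);
        System.out.println(unescaped.matcher(footer).find()); // prints false

        // Escaped: the whole string is matched character for character.
        System.out.println(literal(footer).matcher(footer).find()); // prints true
    }
}
```

The escaped pattern (or its C# equivalent) is what you would hand to the cleanup strategy when the text to remove is a fixed string.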

Imir Hoxha
0

Your code sample uses the pdfSweep add-on, which is available under both the AGPL license and the commercial license. To use the add-on in AGPL mode, if that is applicable to your project, just install pdfSweep from NuGet (https://www.nuget.org/packages/itext7.pdfsweep/) and use the code without loading the license:

PdfDocument pdfDoc = new PdfDocument(new PdfReader(SRC), new PdfWriter(dest));
List<PdfCleanUpLocation> cleanUpLocations = new ArrayList<PdfCleanUpLocation>();
cleanUpLocations.add(new PdfCleanUpLocation(1, new Rectangle(97, 405, 383, 40), Color.GRAY));
PdfCleanUpTool cleaner = new PdfCleanUpTool(pdfDoc, cleanUpLocations);
cleaner.cleanUp();
pdfDoc.close();
Alexey Subach
0

In general there are two ways to "remove text" from a PDF.

First of all, not all human-readable characters in a PDF are produced by text-drawing commands using a particular font. Some of them are pixel images that happen to have pixels that look remarkably like words. Others are line art (particularly when the text has gone through a skew transformation in some art program). It could even be a clipping path applied to something.

  1. Cover it up. Drop a rectangle that matches the background color. This way has flaws, but is easy. You can still select/copy/paste the text. You can still search for it. It's still there, just covered. This works on any kind of "text".

  2. Remove the text. If you know where on the page it will be, you could eliminate everything that exists solely in that area (probably a rectangle). IIRC iText has a way of enumerating all the drawing commands in an existing PdfContentByte, though I couldn't tell you the name off the top of my head. You basically just remove everything in that rectangle. Depending on how iText is set up these days, you may instead have to copy everything that isn't in that rectangle. Awkward but effective.
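
For headers and footers, "where on the page" usually means a band along the top or bottom edge. A rough sketch of computing those bands in PDF coordinates, where the origin is the bottom-left corner and a rectangle is given as x, y, width, height (the same order as the `Rectangle(97, 405, 383, 40)` calls elsewhere in this thread). This is plain Java with no iText dependency; the `Band` class, the 40-point band height, and the roughly A4 page size are assumptions for illustration, and it assumes an unrotated page whose lower-left corner is at (0, 0):

```java
public class HeaderFooterBands {
    // A rectangle in PDF user space: (x, y) is the lower-left corner, units are points.
    public static final class Band {
        public final float x, y, width, height;

        public Band(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    // Full-width strip along the bottom edge of the page.
    public static Band footer(float pageWidth, float bandHeight) {
        return new Band(0f, 0f, pageWidth, bandHeight);
    }

    // Full-width strip along the top edge: it starts bandHeight below the top.
    public static Band header(float pageWidth, float pageHeight, float bandHeight) {
        return new Band(0f, pageHeight - bandHeight, pageWidth, bandHeight);
    }

    public static void main(String[] args) {
        float pageWidth = 595f;   // roughly A4, in points (assumed)
        float pageHeight = 842f;
        float bandHeight = 40f;   // assumed header/footer height

        Band h = header(pageWidth, pageHeight, bandHeight);
        Band f = footer(pageWidth, bandHeight);
        System.out.printf("header: x=%.0f y=%.0f w=%.0f h=%.0f%n", h.x, h.y, h.width, h.height);
        System.out.printf("footer: x=%.0f y=%.0f w=%.0f h=%.0f%n", f.x, f.y, f.width, f.height);
    }
}
```

In practice you would read the real page size from the document and narrow the bands to the area your own header or footer code writes to, so that body content near the page edge is not removed.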

Mark Storer
  • I just want to remove a text which has been added using this code https://kb.itextpdf.com/home/it7kb/faq/how-to-add-text-as-a-header-or-footer. That is it. – Burre Ifort Jul 17 '20 at 07:51
  • @BurreIfort you might want to add that information to your question. There is a big difference between removing arbitrary texts and texts you know exactly how they were added. – mkl Jul 17 '20 at 08:38
  • yes, you are correct. I will add it to my original post. – Burre Ifort Jul 17 '20 at 08:42
  • Thanks. Is there a way to remove the text without using some kind of cover? The rectangle approach places a color on top of another background, and that does not look good. I merely want to remove the text. – Burre Ifort Jul 17 '20 at 16:52
0

The code snippet works correctly even without the license key line. Here is my code, which works perfectly (I also downloaded itext7 and itext7.pdfhtml).

Note: SRC and DEST must be different; if the same path is given for both, you will get an error.

using System.Collections.Generic;
using iText;

// SRC is the input path and dest is the output path; they must differ.
iText.Kernel.Pdf.PdfDocument pdfDoc = new iText.Kernel.Pdf.PdfDocument(new iText.Kernel.Pdf.PdfReader(SRC), new iText.Kernel.Pdf.PdfWriter(dest));

int pages = pdfDoc.GetNumberOfPages();

List<iText.PdfCleanup.PdfCleanUpLocation> cleanUpLocations = new List<iText.PdfCleanup.PdfCleanUpLocation>();

// Fill color for the cleaned area: white here.
// For a visible red fill while testing, use DeviceRgb(245, 66, 66) instead.
iText.Kernel.Colors.DeviceRgb d = new iText.Kernel.Colors.DeviceRgb(255, 255, 255);

// Register the same rectangle (x, y, width, height) on every page.
for (int i = 1; i <= pages; i++)
{
    iText.PdfCleanup.PdfCleanUpLocation pclean = new iText.PdfCleanup.PdfCleanUpLocation(i, new iText.Kernel.Geom.Rectangle(233, 5, 129, 5), d);
    cleanUpLocations.Add(pclean);
}

iText.PdfCleanup.PdfCleanUpTool cleaner = new iText.PdfCleanup.PdfCleanUpTool(pdfDoc, cleanUpLocations);

// Remove the content inside the registered rectangles and write the result.
cleaner.CleanUp();
pdfDoc.Close();
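
Since the source and destination paths have to differ, ending up with the original file modified means writing to a temporary file first and then moving it over the original, as suggested in the comments on the question. A minimal sketch of that pattern in plain Java (no iText dependency): the `Transform` interface and the `rewrite` helper are hypothetical names, and the text replacement in `main` only stands in for the real PDF cleanup step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ReplaceInPlace {
    // Hypothetical callback standing in for the processing step:
    // it reads from src and writes the result to dest (a different path).
    public interface Transform {
        void apply(Path src, Path dest) throws IOException;
    }

    // Run the transform into a temporary file next to the original,
    // then replace the original with the result.
    public static void rewrite(Path file, Transform transform) throws IOException {
        Path dir = file.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "rewrite-", ".tmp");
        try {
            transform.apply(file, tmp);
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up if the transform failed.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo-", ".txt");
        Files.writeString(file, "body\nFOOTER\n");

        // Stand-in transform: drop the footer line while copying src to dest.
        rewrite(file, (src, dest) ->
                Files.writeString(dest, Files.readString(src).replace("FOOTER\n", "")));

        System.out.print(Files.readString(file));
        Files.delete(file);
    }
}
```

With iText, the transform would open the reader on `src` and the writer on `dest`, and both must be closed before the move happens.
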
Ramil Aliyev 007