
We have a large number of PDF documents, each containing links that open other PDF documents in the same file system. We need to rename some of the directories, which means changing every link that points to a PDF document in those directories. We would do this manually, but there are literally thousands of links that need to be altered.

We have tried to use iTextSharp and PdfSharp to alter the links but are having a difficult time getting to the right object. Below are the contents of a sample PDF file with two links. Object 12 is a link annotation that references object 21, and object 21 opens a new window using the reference in object 20. Object 20, of type Filespec, contains the path to the linked PDF, Folder-Name/A.pdf. The second link follows the same pattern but uses objects 16, 23, and 22.

12 0 obj<</Type/Annot/P 5 0 R/F 4/C[1 0 0]/Subtype/Link/A 21 0 R/M(D:20130710103035-07'00')/Border[0 0 0]/Rect[144 612 216 630]/NM(QVDTKWKAZGVAAGHJ)/BS 13 0 R>>
endobj
13 0 obj<</W 0/S/S/Type/Border>>
endobj
16 0 obj<</Type/Annot/P 5 0 R/F 4/C[1 0 0]/Subtype/Link/A 23 0 R/M(D:20130710103040-07'00')/Border[0 0 0]/Rect[126 594 216 612]/NM(WFAYQFGTTIESQOKW)/BS 17 0 R>>
endobj
17 0 obj<</W 0/S/S/Type/Border>>
endobj
20 0 obj<</Type/Filespec/F(Folder-Name/A.pdf)/UF(Folder-Name/A.pdf)/Desc()>>
endobj
21 0 obj<</S/GoToR/D[0/Fit]/NewWindow true/F 20 0 R>>
endobj
22 0 obj<</Type/Filespec/F(Folder-Name-2/B.pdf)/UF(Folder-Name-2/B.pdf)/Desc()>>
endobj
23 0 obj<</S/GoToR/D[0/Fit]/NewWindow true/F 22 0 R>>
endobj

How can we use iTextSharp or PdfSharp to change "Folder-Name" and "Folder-Name-2" to some other arbitrary folder path?

Alexis Pigeon
Nick Olsen
  • Is this any help? http://stackoverflow.com/a/8141831/231316 – Chris Haas Jul 12 '13 at 13:47
  • That is actually what I started with. I tried to tweak the code you provided to make it work but couldn't. As you can see in the PDF document provided, there is no URI reference. The annotation references a GoToR object, which then references a Filespec object. It is a different link structure from the one your code handles. – Nick Olsen Jul 12 '13 at 13:54
  • Unfortunately I'm having some problems producing a PDF that looks like that. I can either create one that links to a file using an `A` action or I can just create a normal `FileSpec` but for the life of me I can't create an action that points to a `FileSpec`. Are you able to provide any sample files? Or could you email them to me, my address is in my profile. – Chris Haas Jul 12 '13 at 15:31
  • @ChrisHaas Just sent you an email with some of the sample documents. As I mentioned the files were created with an application called Blue Beam and not Adobe. That may be why it is a bit different. I'm thinking of just opening the files with a normal StreamReader and using RegEx to find the Filespec objects and replacing the names using that. If we know every link looks like the above example this will work but obviously if there is any variation, we may run into issues. – Nick Olsen Jul 12 '13 at 16:22
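
The regex idea in the last comment could look roughly like the sketch below. It is only an illustration, not something taken from the thread: the class and method names are hypothetical, and it assumes every path is stored as a literal `/F(...)` or `/UF(...)` string exactly as in the sample objects above. It would miss hex-encoded strings, escaped characters, and Filespec dictionaries stored inside compressed object streams. Also note that a PDF's cross-reference table records byte offsets, so a replacement that changes the length of the text leaves those offsets stale unless the file is rewritten by a PDF library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FilespecRegexSketch
{
    // Hypothetical helper: renames a leading folder in literal /F(...) and
    // /UF(...) entries. Assumes paths look like the sample, e.g. /F(Folder-Name/A.pdf).
    public static string RenameFolder(string pdfText, string oldFolder, string newFolder)
    {
        // Group 1 captures "/F(" or "/UF("; the folder must be followed by "/"
        // so that "Folder-Name" does not also match "Folder-Name-2".
        string pattern = @"(/U?F\()" + Regex.Escape(oldFolder) + "/";

        // A MatchEvaluator avoids "$" in newFolder being treated as a substitution.
        return Regex.Replace(pdfText, pattern, m => m.Groups[1].Value + newFolder + "/");
    }
}
```

For example, `FilespecRegexSketch.RenameFolder(text, "Folder-Name", "New-Folder")` would rewrite object 20 from the sample but leave object 22 (`Folder-Name-2/B.pdf`) untouched.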

1 Answer


In case anyone cares, I was able to use the code linked by Chris Haas in the first comment above but modified it as follows:

foreach (FileInfo file in files)
{                    

    PdfReader reader = default(PdfReader);

    bool linkReplaced = false;

    //Setup some variables to be used later
    reader = new PdfReader(file.FullName);

    int pageCount = reader.NumberOfPages;
    PdfDictionary pageDictionary = default(PdfDictionary);
    PdfArray annots = default(PdfArray);

    //Loop through each page
    for (int i = 1; i <= pageCount; i++)
    {
        //Get the current page
        pageDictionary = reader.GetPageN(i);

        //Get all of the annotations for the current page
        annots = pageDictionary.GetAsArray(PdfName.ANNOTS);

        //Make sure we have something
        if ((annots == null) || (annots.Length == 0))
            continue;

        foreach (PdfObject A in annots.ArrayList)
        {
            //Convert the itext-specific object as a generic PDF object
            PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(A);

            //Make sure this annotation has a link
            if (!AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK))
                continue;

            //Make sure this annotation has an ACTION
            if (AnnotationDictionary.Get(PdfName.A) == null)
                continue;

            string fValue = string.Empty;
            string ufValue = string.Empty;
            string uriValue = string.Empty;

            PdfObject a = AnnotationDictionary.Get(PdfName.A);
            if (a.IsDictionary())
            {
                //Get the ACTION for the current annotation
                PdfDictionary AnnotationAction = (PdfDictionary)a;

                //Test if it is a URI action
                if (AnnotationAction.Get(PdfName.S).Equals(PdfName.URI))
                {
                    uriValue = AnnotationAction.Get(PdfName.URI).ToString();

                    if ((uriValue.IndexOf(findValue, StringComparison.OrdinalIgnoreCase) > -1))
                    {
                        string uriValueReplace = Replace(uriValue, findValue, replaceValue, StringComparison.OrdinalIgnoreCase);

                        //Change the URI to something else
                        AnnotationAction.Put(PdfName.URI, new PdfString(uriValueReplace));                                          
                        linkReplaced = true;
                    }
                }                                               
            }
            else if (a.IsIndirect())
            {
                // Get the indirect reference
                PdfIndirectReference indirectRef = (PdfIndirectReference)a;

                // Get the GoToR type object which is at the document level
                PdfDictionary goToR = (PdfDictionary)reader.GetPdfObject(indirectRef.Number);

                // Get the FileSpec object, which is at the document level
                PdfObject f = goToR.Get(PdfName.F);

                if (f == null || !f.IsIndirect())
                    continue;

                PdfObject fileSpecObject = reader.GetPdfObject(((PdfIndirectReference)goToR.Get(PdfName.F)).Number);

                if (!fileSpecObject.IsDictionary())
                    continue;

                PdfDictionary fileSpec = (PdfDictionary)fileSpecObject;

                fValue = fileSpec.Get(PdfName.F).ToString();
                ufValue = fileSpec.Get(PdfName.UF).ToString();

                if ((fValue.IndexOf(findValue, StringComparison.OrdinalIgnoreCase) > -1) || (ufValue.IndexOf(findValue, StringComparison.OrdinalIgnoreCase) > -1))
                {
                    string fValueReplace = Replace(fValue, findValue, replaceValue, StringComparison.OrdinalIgnoreCase);
                    // Build the UF replacement from ufValue (the original passed fValue here by mistake)
                    string ufValueReplace = Replace(ufValue, findValue, replaceValue, StringComparison.OrdinalIgnoreCase);

                    // Update the references to the file
                    fileSpec.Put(PdfName.F, new PdfString(fValueReplace));
                    fileSpec.Put(PdfName.UF, new PdfString(ufValueReplace));                                    

                    linkReplaced = true;
                }
            }                                
        }
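    }

    // Assumption: the snippet as posted never writes the result to disk, so this
    // step saves a modified copy with PdfStamper. The ".updated.pdf" suffix is a
    // placeholder; choose whatever output path suits your workflow.
    if (linkReplaced)
    {
        using (FileStream output = new FileStream(file.FullName + ".updated.pdf", FileMode.Create))
        {
            PdfStamper stamper = new PdfStamper(reader, output);
            stamper.Close();
        }
    }

    reader.Close();
}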
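
The code above calls a `Replace(string, string, string, StringComparison)` helper that is not shown, and it also assumes `files`, `findValue`, and `replaceValue` are defined elsewhere. The original helper is not part of this post, so the following is only a minimal sketch of what such a case-insensitive replace might look like; the class name is hypothetical.

```csharp
using System;
using System.Text;

public static class StringReplaceSketch
{
    // Replaces every occurrence of oldValue in source with newValue, matching
    // according to the supplied StringComparison (e.g. OrdinalIgnoreCase).
    public static string Replace(string source, string oldValue, string newValue, StringComparison comparison)
    {
        // Nothing to search in, or nothing to search for: return the input unchanged.
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(oldValue))
            return source;

        StringBuilder result = new StringBuilder();
        int position = 0;
        int matchIndex;

        while ((matchIndex = source.IndexOf(oldValue, position, comparison)) >= 0)
        {
            // Copy the text before the match, then the replacement.
            result.Append(source, position, matchIndex - position);
            result.Append(newValue);
            position = matchIndex + oldValue.Length;
        }

        // Copy whatever remains after the last match.
        result.Append(source, position, source.Length - position);
        return result.ToString();
    }
}
```

On newer runtimes (.NET Core 2.0 and later) the built-in `string.Replace(string, string, StringComparison)` overload should make a helper like this unnecessary; on .NET Framework that overload is not available.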
Nick Olsen