Convert a PdfReader to a PdfDocument

Question

Can anyone tell me how to convert a PdfReader object into a PdfDocument?

I have read a disk file and converted it to a MemoryStream, but I need it as a PdfDocument for other methods in my C# program.

I'm converting an application to use iTextSharp instead of PdfSharp.

MemoryStream pdfstream = new MemoryStream();

/* Convert the attachment to a byte array */
byte[] pdfarray = (byte[])dr["Data"];
/* Write the attachment into the memory */
pdfstream.Write(pdfarray, 0, pdfarray.Length);
/* Set the memorystream to the beginning */
pdfstream.Seek(0, System.IO.SeekOrigin.Begin);

/* Open the pdf document */
PdfSharp.Pdf.PdfDocument document = PdfSharp.Pdf.IO.PdfReader.Open(pdfstream, PdfDocumentOpenMode.Modify);
//iTextSharp.text.Document doc1 = iTextSharp.text.pdf.PdfReader.GetStreamBytes(
//ITS.pdf.PdfReader rdr = ITS.pdf.PdfReader(

string filename = DateTime.Now.Ticks.ToString() + "_" + dr["AttachmentName"].ToString();
string path = Path.Combine(FolderName, filename);

document.Save(path);

I'm not sure if it's still the case, but a comment over here - http://stackoverflow.com/a/2554230/855363 - suggests that it's not possible. – Snixtor, Feb 12 '13 at 10:27
@Snixtor the comment is only partially correct nowadays. iText now **does** contain a framework for extracting text and images from existing PDFs, but the result is **not** a ready `PdfDocument`; it is a sequence of letter groups and bitmaps with positioning data, with no information about paragraphs etc. anymore. For user1423958, therefore, the consequence is the same: it is not possible (unless he invests quite some time in developing heuristics to rebuild those missing structures from the text and image bits). – mkl, Feb 12 '13 at 11:00
@user1423958 You probably should explain what requirements you need to be fulfilled. While you won't be able to create a `PdfDocument` from some `PdfReader`, you might actually only require a `PdfStamper` or `PdfCopy` instance. – mkl, Feb 12 '13 at 11:04
Hi, yes, sorry, I should have been more specific. My method loads an existing PDF doc from a varbinary(max) column into a MemoryStream via a byte[] array. It then creates a PdfSharp.Pdf.PdfDocument from this, which is modified later on in the program. So perhaps I don't need to go near a PdfReader...? – Dave, Feb 13 '13 at 09:06

Paddy · Answer 1 · 2013-02-12T14:55:44.433

2

I think you can do something like this (note code not run or tested, might need a tweak):

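using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

using (MemoryStream ms = new MemoryStream())
{
    Document doc = new Document(PageSize.A4, 50, 50, 15, 15);
    PdfWriter writer = PdfWriter.GetInstance(doc, ms);

    // The document must be opened before any content is written.
    doc.Open();

    PdfReader rdr = new PdfReader(filePath);
    try
    {
        for (int i = 1; i <= rdr.NumberOfPages; i++)
        {
            // Import page i from the reader and draw it at the origin.
            // Only the page content is copied, not its annotations.
            PdfImportedPage page = writer.GetImportedPage(rdr, i);
            writer.DirectContent.AddTemplate(page, 0, 0);

            doc.NewPage();
        }

        // Closing the document finishes writing the PDF into ms.
        doc.Close();
    }
    finally
    {
        rdr.Close();
    }
}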

This will read in the PDF page by page and output it to your document.

edited Feb 12 '13 at 14:55

answered Feb 12 '13 at 12:23

Paddy


Downvoted because examples like this cause an enormous amount of support questions such as: "the page size of copied content is different from the original document", "all the annotations are gone after copying", etc... People should read the documentation: http://www.manning.com/lowagie2/samplechapter6.pdf PdfStamper and PdfCopy are the classes that should be used in cases like this. See also http://stackoverflow.com/questions/14770942/itext-pdf-merge-document-overflow-outside-pdf-text-truncated-page-and-not-di/14771651 "I wonder why so many people find the wrong examples first..." – Bruno Lowagie Feb 16 '13 at 15:06
1

@Bruno - then maybe you should provide an answer, rather than just downvoting. This answers the question - pdf document from a pdf reader. I don't know the context of his request, or what he is doing with it, this might be valid. – Paddy Feb 18 '13 at 11:42
4

P.S. When you start charging a license for your product, providing better documentation for the money might be helpful, rather than just selling books. – Paddy Feb 18 '13 at 11:43
1

We just spent 2k on the product, so I think it's fair enough to ask a few "dumb" questions. If you re-worked your book with C# examples I (& others, I suspect) would definitely buy it. – Dave Feb 18 '13 at 15:58
@user1423958 if you have a license, you are allowed to post questions to our ticketing system. There's really no need for you to post questions in public. Why would you want an answer from amateurs if you have access to professional iText support engineers? As for iTextSharp examples, we've invested in a C# port of the book samples as well as the tutorial examples: http://sourceforge.net/p/itextsharp/code/HEAD/tree/trunk/book/iTextExamplesWeb/iTextExamplesWeb/iTextInAction2Ed/ and http://sourceforge.net/p/itextsharp/code/HEAD/tree/tutorial/signatures/ – Bruno Lowagie Feb 18 '13 at 21:22
@Paddy The answer is provided in the free sample chapter: use PdfStamper or PdfCopy. Your answer wasn't correct: http://article.gmane.org/gmane.comp.java.lib.itext.general/64259 See also http://article.gmane.org/gmane.comp.java.lib.itext.general/64260 The documentation was supervised by Manning. Are you questioning their quality control? Forgive me my frustration, but what about this point of view: http://news.gmane.org/gmane.comp.java.lib.itext.general – Bruno Lowagie Feb 18 '13 at 21:26
1

@Bruno - all those sites are unfortunately blocked within my corporate network... – Paddy Feb 19 '13 at 08:17
1

@BrunoLowagie - to clarify - I like the product, it works well, but as I use Google to find solutions to most of my programming queries, as do a lot of people, your documentation can be a bit hard to get through to find the answer you need. If you want to stop people asking this question, then answer it here - the top Google result for "convert a PdfReader object into a PdfDocument itext" is this question. If I can't solve a C# question, I don't ask MS, I ask it here... – Paddy Feb 19 '13 at 08:25
1

@BrunoLowagie - Which ticketing system? Do you mean on here or another site? Please can you provide a url? – Dave Feb 19 '13 at 14:03
If you bought a license, you were asked for 3 addresses of people who can post support questions. Upon registration in our ticketing system, these 3 people receive a mail with credentials to access this support system. We could disclose that URL, but you can't use it without an account. You have me worried now: if you don't have such an account, are you sure you're a customer? Who did you buy the license from? – Bruno Lowagie Feb 19 '13 at 15:02
@Paddy So I should vote to close the question because it's duplicate. A search for PdfStamper and PdfCopy results in many good answers: PdfStamper: http://stackoverflow.com/questions/8318281/using-pdfstamper-to-add-a-rectangle PdfCopy: http://stackoverflow.com/questions/6360722/itextsharp-problem-concatenating-pdf-documents/6369827#6369827 – Bruno Lowagie Feb 19 '13 at 15:06
@BrunoLowagie - Yes we did have to nominate three emails once we had paid, the contact is Hilde Goosens at itexfpdf.com. Please tell me this is legit? – Dave Feb 20 '13 at 09:21
Yes, Hilde Goossens works for us. If you've sent those mails to Hilde, you should have received your account info. Did you register mail addresses with her? – Bruno Lowagie Feb 20 '13 at 12:19
1

@BrunoLowagie - Poor form old mate! "Several iText Engineers are actively supporting the project on StackOverflow... /iText" Are you not one of these engineers? – Rusty Nail Jan 13 '17 at 23:33
