I'm trying to test extraction of a single page from a PDF document, but I'm getting a NullReferenceException whenever I try.

var document = new Document();
var stream = new MemoryStream();
var writer = PdfWriter.GetInstance(document, stream);

document.Open();
document.Add(new Paragraph("This is page 1."));
document.NewPage();
document.Add(new Paragraph("This is page 2."));
document.Close();

var copystream = new MemoryStream();
var copy = new PdfCopy(document, copystream);
copy.Open();
var reader = new PdfReader(stream.ToArray());
var page = copy.GetImportedPage(reader, 2);
copy.AddPage(page);
copy.Close(); // code throws exception here

I've tried adding writer.CloseStream = false, but I still end up with the same NullReferenceException:

Object reference not set to an instance of an object.
   at iTextSharp.text.Document.get_Left()
   at iTextSharp.text.pdf.PdfDocument.SetNewPageSizeAndMargins()
   at iTextSharp.text.pdf.PdfDocument.NewPage()
   at iTextSharp.text.pdf.PdfDocument.Close()
   at iTextSharp.text.pdf.PdfCopy.Close()
   at iTextTest.Controllers.HomeController.Index() in line 41
Jason L
  • possible duplicate of [What is a NullReferenceException and how do I fix it?](http://stackoverflow.com/questions/4660142/what-is-a-nullreferenceexception-and-how-do-i-fix-it) – Servy Jan 05 '15 at 15:46
  • Well that seems to be a bug in iTextSharp. They may want to add null handling there and throw the proper exception, like "No margin set" or whatever the root cause is. [Browse the source: the `Left` property does `return pageSize.GetLeft(marginLeft);`](http://sourceforge.net/p/itextsharp/code/HEAD/tree/trunk/src/core/iTextSharp/text/Document.cs), where `pageSize` presumably is `null`. – CodeCaster Jan 05 '15 at 15:47
  • Are you sure you're using the PDF library correctly? According to your code, you are creating a `PdfCopy` from a closed document. Try opening the document for reading after closing it (keeping it open might leave the position in the wrong place) and using that for the copy. – Steve Lillis Jan 05 '15 at 15:52
  • @SteveLillis The issue with not calling `document.Close` is that `PdfReader` won't work correctly because, and I'm assuming here, `document.Close()` adds the xref table. – Jason L Jan 05 '15 at 15:58
  • @CodeCaster So, should I set the page margins myself or something? – Jason L Jan 05 '15 at 15:59
  • Well it seems to be doing that in the parameterless `Document()` constructor, so I wouldn't know... Maybe you can download and compile the source to debug from there, or read it until you see where it goes wrong. Make sure you're looking at the right version of the code. – CodeCaster Jan 05 '15 at 16:00
  • You use the `Document document` initially to generate the first document, close it, and then use it again for the second one. @SteveLillis already mentioned this. Yes, the close is necessary to finish the first document, but you need an instance that has not been closed to create the second document. Thus, add `document = new Document();` before `var copystream = new MemoryStream();` – mkl Jan 05 '15 at 20:06
  • I agree with @mkl. The `Document` object is **only** for working with new documents. It is an abstraction that allows you to more easily add things to a PDF but once you close it, the abstractions are converted into the more obscure PDF syntax. On your second pass, your `PdfCopy` is in fact creating a brand new document, the fact that you are importing pages is incidental. – Chris Haas Jan 06 '15 at 01:56
  • @mkl @ChrisHaas I see. I think I interpreted the documentation incorrectly. I had assumed that the `Document` used in the `PdfCopy` constructor was the copy source rather than an abstraction for working with new documents. – Jason L Jan 06 '15 at 14:08

2 Answers

Please change your code like this:

var document = new Document();
var stream = new MemoryStream();
var writer = PdfWriter.GetInstance(document, stream);

document.Open();
document.Add(new Paragraph("This is page 1."));
document.NewPage();
document.Add(new Paragraph("This is page 2."));
document.Close();

document = new Document(); // this is the line you need to add
var copystream = new MemoryStream();
var copy = new PdfCopy(document, copystream);
copy.Open();
var reader = new PdfReader(stream.ToArray());
var page = copy.GetImportedPage(reader, 2);
copy.AddPage(page);
copy.Close(); // no longer throws once a fresh Document is used

You are reusing the document object you used to create a new document from scratch. That document instance is already closed. When you use the document in the context of PdfCopy, you need a new Document instance.
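The same idea can be wrapped in a small helper. This is only a minimal sketch, assuming iTextSharp 5.x: `PdfPageExtractor` and `ExtractPage` are hypothetical names that are not part of the library, and the sketch opens and closes the copy through its own `Document` rather than through `copy.Open()` and `copy.Close()` as in the code above.

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper for illustration; not part of iTextSharp.
public static class PdfPageExtractor
{
    // Returns a new PDF, as bytes, that contains only the requested page.
    public static byte[] ExtractPage(byte[] sourcePdf, int pageNumber)
    {
        var reader = new PdfReader(sourcePdf);
        try
        {
            if (pageNumber < 1 || pageNumber > reader.NumberOfPages)
                throw new ArgumentOutOfRangeException("pageNumber");

            using (var output = new MemoryStream())
            {
                // A fresh Document for the copy; a closed instance must not be reused.
                var copyDocument = new Document();
                var copy = new PdfCopy(copyDocument, output);

                copyDocument.Open();
                copy.AddPage(copy.GetImportedPage(reader, pageNumber));
                copyDocument.Close(); // finishes the copy and closes the writer

                // MemoryStream.ToArray() still works after the stream is closed.
                return output.ToArray();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

With the code from the question, this would be called as `PdfPageExtractor.ExtractPage(stream.ToArray(), 2)` once the first `document.Close()` has run.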

Bruno Lowagie

I have reviewed the source for PdfDocument as can be found here: http://sourceforge.net/p/itextsharp/code/HEAD/tree/trunk/src/core/iTextSharp/text/pdf/PdfDocument.cs#l2334

PdfDocument assigns the value of the private field nextPageSize to the field pageSize at the start of the method SetNewPageSizeAndMargins. If nextPageSize is null, pageSize is set to null as well, and the next access to it triggers a NullReferenceException. To stop nextPageSize being null, call SetPageSize on the document before closing the copy.

To keep the default page size, call SetPageSize as follows:

document.SetPageSize(document.PageSize);

This is most likely an oversight by the developers of the PdfDocument class, which I suspect is meant to be setting a default value for nextPageSize and isn't.

Steve Lillis
  • I put in `copy.SetPageSize(document.PageSize)` after `copy.AddPage(page)` but I'm still getting the same exception on `copy.Close()`. – Jason L Jan 05 '15 at 16:27
  • My mistake. Please put it on the document. It's when the copy closes the document that the document throws the exception. Updated my answer. – Steve Lillis Jan 05 '15 at 16:30
  • I added `document.SetPageSize(document.PageSize)` before `document.Close()` but I'm still getting an exception thrown. I stepped through the source code, and it seems the issue is indeed with `nextPageSize` as it's still `null`. – Jason L Jan 05 '15 at 17:12
  • *This is most likely an oversight by the developers of the PdfDocument class, which I suspect is meant to be setting a default value for nextPageSize and isn't.* - `document.PageSize` is initialized in non-closed instances, so no oversight here. Probably critical code could check whether or not the document is closed, though. – mkl Jan 06 '15 at 08:27
  • But closing a `PdfDocument` without calling `SetPageSize` causes the default `pageSize` to be overwritten with null. That seems like unexpected behaviour to me, especially as not all implementations of a document will do the same? – Steve Lillis Jan 06 '15 at 09:44
  • A closed document is closed and, therefore, shall not be used anymore. – mkl Jan 06 '15 at 10:13
  • Ah, I see. Thank you for clarifying. Perhaps an InvalidOperationException or similar would be useful for developers trying to use a closed document, similar to when a developer attempts to use an already disposed object? It would help prevent confusion such as this. – Steve Lillis Jan 06 '15 at 10:19
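
The guard suggested in the last comment could look roughly like the sketch below. `GuardedDocument` is a hypothetical class written only for illustration; it is not iTextSharp's `Document` and does not reflect how the library is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration only; not iTextSharp's Document.
public class GuardedDocument
{
    private readonly List<string> paragraphs = new List<string>();
    private bool closed;

    public int ParagraphCount
    {
        get { return paragraphs.Count; }
    }

    public void Add(string paragraph)
    {
        ThrowIfClosed();
        paragraphs.Add(paragraph);
    }

    public void Close()
    {
        closed = true;
    }

    // Fails fast with a descriptive exception instead of letting a
    // NullReferenceException surface later from unrelated code.
    private void ThrowIfClosed()
    {
        if (closed)
            throw new InvalidOperationException(
                "The document has been closed and cannot be reused.");
    }
}
```

Calling `Add` after `Close` on such an object throws an `InvalidOperationException`, which points at the misuse directly.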