I'm trying to search an existing PDF for regular expression matches and then extract the matching pages to a new PDF. I've run into several problems with this approach as a whole.
I've decided to clear my head and start again from scratch. I'm able to take a 3-page PDF and copy its pages one at a time into a new file using this code:
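    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.pdf;

    static void Main(string[] args)
    {
        string srcFile = @"C:\Users\steve\Desktop\original.pdf";
        string dstFile = @"C:\Users\steve\Desktop\result.pdf";

        PdfReader reader = new PdfReader(srcFile);
        Document document = new Document();
        // One PdfCopy for the whole output file, created before the loop.
        PdfCopy copy = new PdfCopy(document, new FileStream(dstFile, FileMode.Create));
        document.Open();

        // iTextSharp page numbers are 1-based.
        for (int page = 1; page <= reader.NumberOfPages; page++)
        {
            PdfImportedPage importedPage = copy.GetImportedPage(reader, page);
            copy.AddPage(importedPage);
        }

        // Closing the document finishes and closes the output file.
        document.Close();
        reader.Close();
    }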
This code works because the PdfCopy instance is created OUTSIDE the for loop. The problem is that the only way I can seem to get the rest of the code (converting each page to text and finding regex matches) to work is to place that functionality, including the PdfCopy instance, inside the for loop.
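For illustration, a minimal sketch of the structure being aimed for: the per-page text conversion and regex test sit inside the loop, while the single PdfCopy stays outside it. This assumes iTextSharp 5 and its PdfTextExtractor.GetTextFromPage method for the text step. The class name and the regex pattern are hypothetical placeholders, not taken from the original code.

```csharp
using System.IO;
using System.Text.RegularExpressions;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

class ExtractMatchingPages // hypothetical class name
{
    static void Main(string[] args)
    {
        string srcFile = @"C:\Users\steve\Desktop\original.pdf";
        string dstFile = @"C:\Users\steve\Desktop\result.pdf";

        // Hypothetical placeholder pattern; substitute the real expression.
        Regex pattern = new Regex(@"\bInvoice\b");

        PdfReader reader = new PdfReader(srcFile);
        Document document = new Document();
        // Created once, so every matching page lands in the same output file.
        PdfCopy copy = new PdfCopy(document, new FileStream(dstFile, FileMode.Create));
        document.Open();

        for (int page = 1; page <= reader.NumberOfPages; page++)
        {
            // Convert only the current page to text, then test it.
            string text = PdfTextExtractor.GetTextFromPage(reader, page);
            if (pattern.IsMatch(text))
            {
                copy.AddPage(copy.GetImportedPage(reader, page));
            }
        }

        // Caveat: iTextSharp 5 throws on Close if no page was ever added,
        // so the no-match case would need separate handling.
        document.Close();
        reader.Close();
    }
}
```

Only the text extraction and the match test depend on the page number, so nothing in this sketch requires a new PdfCopy per page. Scanned or image-only pages would yield no text here, so they would never match.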
Here's the code from my initial question: C# iTextSharp - Code overwriting instead of appending pages