2

I know that such a solution already exists for Python (pypdf), but I hope someone can suggest a C# library for this.

Martin Thoma
apros

5 Answers

3

A commonly used library for manipulating PDF files in .NET is iTextSharp, a port of the iText library. Here's an example:

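using System;
using iTextSharp.text.pdf;

class Program
{
    static void Main()
    {
        // Info exposes the document information dictionary;
        // the lookup can come back empty or fail if the PDF has no Title entry.
        PdfReader reader = new PdfReader("test.pdf");
        var title = reader.Info["Title"];
        Console.WriteLine(title);
        reader.Close();
    }
}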
Darin Dimitrov
2

The Docotic.Pdf library (disclaimer: I work for the company) can be used to accomplish the task.

Please take a look at my answer to a similar question.

Beyond that, the library can of course do many other things.

Bobrovsky
1

How about this:

http://glenswords.wordpress.com/2007/07/16/extract-the-title-of-a-pdf-using-c/
Randy Minder
  • +1. You might want to add something between the `<<` and the `/Title` since stuff like `/CreationDate` might show up first. This is definitely cheating and is a dirty rotten hack (and using the solution as written is probably a bad idea), but it has the advantage over the other solutions of not requiring a giant library for a rather tiny feature. – Brian Nov 15 '10 at 16:48
  • I completely agree with Brian: this is a light solution for a tiny feature – apros Nov 15 '10 at 17:14
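
For reference, the lightweight approach discussed in these comments could look roughly like the sketch below. It is not the code from the linked post; `PdfTitleSniffer` and `FindTitle` are names made up for illustration. It scans the raw bytes for the first `/Title (...)` literal string, so it does not care whether `/CreationDate` or anything else comes first in the dictionary. The assumptions are significant: the Info dictionary must be stored uncompressed and unencrypted, the title must be a plain literal string (not a hex string or UTF-16 text), nested unescaped parentheses are not handled, and a bookmark `/Title` that appears earlier in the file would be picked up instead.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PdfTitleSniffer
{
    // Matches "/Title (" followed by a literal string; backslash escapes
    // (including "\)") are kept inside the capture group.
    private static readonly Regex TitlePattern =
        new Regex(@"/Title\s*\(((?:\\.|[^\\)])*)\)", RegexOptions.Singleline);

    // Returns the first literal-string /Title found in the raw bytes, or null.
    public static string FindTitle(byte[] pdfBytes)
    {
        // Latin-1 maps every byte to exactly one char, so no bytes are lost.
        string raw = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);
        Match m = TitlePattern.Match(raw);
        if (!m.Success) return null;

        // Undo only the simplest escapes: "\(", "\)" and "\\".
        return Regex.Replace(m.Groups[1].Value, @"\\([()\\])", "$1");
    }
}
```

Typical use would be `PdfTitleSniffer.FindTitle(File.ReadAllBytes("test.pdf"))`. For anything beyond a quick check, one of the libraries in the other answers is the safer choice.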
0

One alternative to iTextSharp is PDFBox. See the CodeProject tutorial for instructions on using it. This is slightly ugly since you're basically running a Java VM from C#, but it's actually really easy to use.

Brian
0

If by "Title" you mean the Title entry in the document information dictionary (the metadata referenced from the trailer of the PDF), then you can use a number of different tools. iTextSharp will do it, although I don't know the API well enough to give you code.

If you use DotImage from Atalasoft (where I work; incidentally, I wrote this code), you can do this:

PdfDocumentMetadata metadata = PdfDocumentMetadata.FromStream(sourceStream);
Console.WriteLine("Title is \"{0}\"", metadata.Title);

This class also gives you Author, Subject, Keywords, Creator, Producer, CreationDate, ModificationDate, Trapped, and custom fields.

If you're talking about finding the title in XMP embedded in the PDF - well, that's a different beast entirely and I don't yet have support for pulling that out.
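
For the XMP case, a rough sketch that uses only the .NET base class library might look like the following. This is not part of DotImage or any other library mentioned here; `XmpTitleSniffer` and `FindXmpTitle` are hypothetical names. It assumes the XMP packet is stored uncompressed and unencrypted, is wrapped in an `x:xmpmeta` element, is encoded as UTF-8, and carries the title in the usual `dc:title` / `rdf:Alt` / `rdf:li` form. If the PDF has several packets, only the first is read, and malformed XML will make `XDocument.Parse` throw.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class XmpTitleSniffer
{
    private static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";
    private static readonly XNamespace Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Returns the first dc:title value from an uncompressed XMP packet, or null.
    public static string FindXmpTitle(byte[] pdfBytes)
    {
        // Latin-1 keeps a 1:1 byte-to-char mapping, so string indices are byte offsets.
        string raw = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);
        const string endTag = "</x:xmpmeta>";

        int start = raw.IndexOf("<x:xmpmeta", StringComparison.Ordinal);
        if (start < 0) return null;
        int end = raw.IndexOf(endTag, start, StringComparison.Ordinal);
        if (end < 0) return null;

        // Decode only the packet itself, assuming it is UTF-8.
        string xml = Encoding.UTF8.GetString(pdfBytes, start, end + endTag.Length - start);
        XDocument doc = XDocument.Parse(xml);

        // dc:title is normally an rdf:Alt list; take its first rdf:li entry.
        XElement li = doc.Descendants(Dc + "title").Descendants(Rdf + "li").FirstOrDefault();
        return li == null ? null : li.Value;
    }
}
```

Calling `XmpTitleSniffer.FindXmpTitle(File.ReadAllBytes("test.pdf"))` returns null when no packet or no `dc:title` is found, which makes it easy to fall back to the Info dictionary title.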

plinth
  • Thank you very much for your post. Of the commercial libraries, your solution seems the most attractive for my issue, in my opinion. – apros Nov 15 '10 at 17:18