8

Does anyone know of a .NET component to convert PDF to Word or RTF programmatically? I don't want to use OCR or Adobe-dependent solutions.

Deduplicator
John
  • Welcome to StackOverflow! Your question has already been asked here, see this link: http://stackoverflow.com/questions/2192400/does-anyone-know-of-a-way-to-easily-convert-a-pdf-to-a-docx-format-programmatical – CharlesB Apr 20 '11 at 12:04
  • 2
    And about 4,000 other places. This question gets asked about three times a week. Try doing a search for "PDF", "Word" and "conversion". – Cody Gray - on strike Apr 20 '11 at 12:05

4 Answers

5

I tried several libraries:

Of all of them, I liked PDF Focus .NET best, and here is why:

  1. They try to keep the structure of the document EDITABLE, so that when I continue editing the text, the paragraph extends smoothly. Other libraries take a "minimalistic" approach and insert absolutely positioned shapes, so if you continue editing the text, it overlaps the next piece of text.
  2. They do their best to recognize tables, so that tables in the output document are REAL TABLES, not a collection of shapes and text with absolute positioning (as produced by other libraries).
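
To make the difference in point 1 concrete, here is a minimal sketch in Python of the general idea: a PDF page holds text runs at absolute coordinates, and a converter that wants editable output has to merge them back into flowing paragraphs. This is only an illustration, not how PDF Focus .NET or any other library mentioned here actually works. The `Fragment` type, the `group_into_paragraphs` function, and the `line_tol` and `para_gap` thresholds are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A piece of text with an absolute position (hypothetical model)."""
    x: float
    y: float  # distance from the top of the page, in points
    text: str


def group_into_paragraphs(fragments, line_tol=2.0, para_gap=18.0):
    """Merge positioned fragments into flowing paragraphs.

    Assumed heuristic: fragments whose y values differ by at most
    line_tol share a line; a line starts a new paragraph when it sits
    more than para_gap below the previous line.
    """
    lines = []  # each entry: [y of the line, fragments on that line]
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(frag.y - lines[-1][0]) <= line_tol:
            lines[-1][1].append(frag)
        else:
            lines.append([frag.y, [frag]])

    paragraphs = []
    prev_y = None
    for y, frags in lines:
        # Read each line left to right.
        line_text = " ".join(f.text for f in sorted(frags, key=lambda f: f.x))
        if prev_y is not None and y - prev_y <= para_gap:
            paragraphs[-1] += " " + line_text  # same paragraph continues
        else:
            paragraphs.append(line_text)  # large gap: new paragraph
        prev_y = y
    return paragraphs


fragments = [
    Fragment(72, 100, "PDF stores text"),
    Fragment(160, 100, "as positioned runs,"),
    Fragment(72, 114, "not as paragraphs."),
    Fragment(72, 150, "A second paragraph."),
]
paragraphs = group_into_paragraphs(fragments)
```

A converter that skips this step and emits each fragment as its own positioned shape reproduces the page visually, but the text no longer reflows when edited, which is the overlap problem described above. Real documents need far more than two thresholds (columns, fonts, indentation, tables), so treat this as a toy.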

A customer of ours is now evaluating different libraries, and I will recommend PDF Focus .NET first.

P.S. I HAVE NO RELATIONSHIP OF ANY KIND WITH THIS SOFTWARE PRODUCER. As a former .NET developer, I simply see a high-quality component that really works well.

Ihor B.
4

Use PDF Focus.

Nice and easy.

EDIT: And also

How to convert DOC into other formats using C#

http://dotnetf1.blogspot.com/2008/07/convert-word-doc-into-pdf-using-c-code.html

Soner Gönül
2

You need something like GemBox.Document. It's a simple .NET component that enables you to manipulate and convert all kinds of document files.

Evale
-1

You should have read this: C# and PDF. There are ways to convert, such as the aforementioned PDF Focus, but be warned: it is a buggy, crash-prone process. PDF is not intended to be machine-readable.

Community
  • We are preparing a new engine for PDF Focus .Net right now; this will be version 2.0, thoroughly improved. It will be released soon. – Maxim Sautin Apr 20 '11 at 12:12
  • @Maximus That is great work (I have looked into PDF, I know). The flaw is in the PDF standard itself. E-mail me, and I will find you several files that would break your component. –  Apr 22 '11 at 05:29
  • 1
    Nowadays (2016) [**PDF Focus .Net**](http://sautinsoft.com/products/pdf-focus/index.php) has become a strong component in the prime of its life :). We would be glad to look at any complex PDF files sent to support@sautinsoft.com. – Maxim Sautin Feb 09 '16 at 13:09