8

I have seen a couple of threads about this, but my searching has not turned up very straight answers. I have a web application that needs to take in doc, docx, xls, and xlsx files and convert them to PDF. Right now we have a process that uses the Microsoft.Office.Interop.Word library to open the document and print it to a PS file, and then GPL Ghostscript converts the PS file into a PDF.
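
For readers unfamiliar with the second half of that pipeline, here is a minimal sketch of the PS-to-PDF step, assuming Ghostscript's console executable is installed and its path is passed in. The class name `PsToPdf` and its method names are made up for illustration; the switches are standard Ghostscript options for the `pdfwrite` device. This is not the asker's actual code.

```csharp
using System;
using System.Diagnostics;

static class PsToPdf
{
    // Builds the Ghostscript command line for the pdfwrite device.
    // Paths are quoted so that spaces in them survive argument parsing.
    public static string BuildArguments(string psPath, string pdfPath)
    {
        return string.Format(
            "-dBATCH -dNOPAUSE -dSAFER -sDEVICE=pdfwrite \"-sOutputFile={0}\" \"{1}\"",
            pdfPath, psPath);
    }

    // ghostscriptExe is an assumption: the full path to the Ghostscript
    // console executable on the machine doing the conversion.
    public static void Convert(string ghostscriptExe, string psPath, string pdfPath)
    {
        var startInfo = new ProcessStartInfo(ghostscriptExe, BuildArguments(psPath, pdfPath))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "Ghostscript exited with code " + process.ExitCode);
        }
    }
}
```

Running Ghostscript as a separate process keeps a crash or hang in the converter from taking down the calling application, at the cost of one process launch per file.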

This process works OK-ish, but it involves several steps, and it was originally developed years ago when it was even harder to find a PDF print driver and interface with it. In the spirit of updating, I am looking for a better way to handle this. The main reason is that our application uses a web service call to perform the elevated operation of the conversion process. With newer Windows Server versions, and in particular with Windows 7 for development, opening the file, even with impersonation, is causing some issues with the Interop library.

All of this I'm sure can be figured out and ironed out, but I was wondering if there is a newer and better way to go about this. I have looked into PDF995, but am not finding a great way to programmatically go in and print a file directly to a PDF. The code they provide is in C++ and I am not finding how to mimic the calls in C#.

John Saunders
Justin Rassier
  • 3
    You should not ever call Office Interop from an ASP.NET application or any other server application. See [Considerations for server-side Automation of Office](http://support.microsoft.com/kb/257757) – John Saunders Oct 05 '11 at 17:19
  • Interesting article. Thank you for pointing it out. Is it really still such a difficult task to get this type of functionality working? Something like Aspose isn't out of the question, but I would think there could be some method existing out there not requiring such a hefty purchase. – Justin Rassier Oct 05 '11 at 17:44
  • The Office Interop code, and the Office applications it talks to, were designed for a single-user, interactive environment. That just plain doesn't work properly in a multi-threaded, multi-user environment. Believe me, I've found out the hard way, you don't want to go there. It will cost you more to use Interop and pretend it works, then "fix" it, and "fix" it, and "fix" it... – John Saunders Oct 05 '11 at 17:58
  • Understood. It is good information to have now, so thank you. That still leaves me with the question of options to complete this task. Without the interop I see that we have a recommendation of Aspose. Are there other options out there? Some other means to still use PDF drivers without first using the interop library? Any help will be appreciated. – Justin Rassier Oct 05 '11 at 19:18
  • It's fine to run on a server; you just want a background process that watches your data for documents that need to be converted. The problem with running directly from ASP pages is that multiple requests can try to invoke the COM object at the same time. – FlavorScape Mar 13 '13 at 19:50
  • I've had no issues running an automation console app that polls the web API for documents with status "unconverted" and then flags each one as converted so the clients know to load the thumbnails, etc. This ensures that it only ever processes one doc/PowerPoint at a time. – FlavorScape Mar 13 '13 at 19:52
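
The one-document-at-a-time idea from the last two comments can be sketched as a single-consumer queue. This is an in-process variation for illustration, not the commenter's console app: `ConversionQueue` and the `convert` delegate are hypothetical names, and the delegate stands in for whatever conversion routine is actually used. Serializing the work only addresses the concurrency problem; it does not make Office automation a supported server-side scenario.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Runs conversions strictly one at a time on a single background task.
sealed class ConversionQueue : IDisposable
{
    private readonly BlockingCollection<string> _pending =
        new BlockingCollection<string>();
    private readonly Task _worker;

    // 'convert' is a placeholder for the real conversion routine.
    public ConversionQueue(Action<string> convert)
    {
        _worker = Task.Factory.StartNew(() =>
        {
            // Blocks until an item arrives; the loop ends once
            // CompleteAdding has been called and the queue is drained.
            foreach (var path in _pending.GetConsumingEnumerable())
            {
                try
                {
                    convert(path);
                }
                catch (Exception ex)
                {
                    // One bad document should not stop the worker.
                    Console.Error.WriteLine("Failed: " + path + ": " + ex.Message);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Safe to call from many request threads at once.
    public void Enqueue(string path)
    {
        _pending.Add(path);
    }

    // Stops accepting work and waits for queued documents to finish.
    public void Dispose()
    {
        _pending.CompleteAdding();
        _worker.Wait();
        _pending.Dispose();
    }
}
```

Request handlers would call `Enqueue` and return immediately, while the single worker guarantees that two requests never drive the converter at the same moment.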

3 Answers

7

If you're looking for a "free" solution, I think you might already have the only viable option out there, but like John said, server-side interop is typically not a good idea. We've used the .NET Aspose components with a great deal of success. This is a pure managed solution with no interop or Office installation required.

Steve Danner
4

EDIT: In light of the article provided by John Saunders, [Considerations for server-side Automation of Office](http://support.microsoft.com/kb/257757), the code below should not be used for server-side development purposes.

Here's the code for converting a .docx file to PDF using Interop. Hopefully you can figure out how to do the other document types using this as a starting point.

    // Requires: using System; using Microsoft.Office.Interop.Word;
    private void DocxToPdf(string sourcePath, string destPath) {

        // Path of the source .docx file, boxed because Documents.Open takes ref object.

        object paramSourceDocPath = sourcePath;
        object paramMissing = Type.Missing;
        var wordApplication = new Microsoft.Office.Interop.Word.Application();
        Document wordDocument = null;

        // Path of the output .pdf file and the export options.

        string paramExportFilePath = destPath;
        WdExportFormat paramExportFormat = WdExportFormat.wdExportFormatPDF;
        bool paramOpenAfterExport = false;
        WdExportOptimizeFor paramExportOptimizeFor =
            WdExportOptimizeFor.wdExportOptimizeForPrint;
        WdExportRange paramExportRange = WdExportRange.wdExportAllDocument;
        int paramStartPage = 0;
        int paramEndPage = 0;
        WdExportItem paramExportItem = WdExportItem.wdExportDocumentContent;
        bool paramIncludeDocProps = true;
        bool paramKeepIRM = true;
        WdExportCreateBookmarks paramCreateBookmarks =
            WdExportCreateBookmarks.wdExportCreateWordBookmarks;
        bool paramDocStructureTags = true;
        bool paramBitmapMissingFonts = true;
        bool paramUseISO19005_1 = false;

        try {
            // Open the source document.
            wordDocument = wordApplication.Documents.Open(
                ref paramSourceDocPath, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing, ref paramMissing, ref paramMissing,
                ref paramMissing);

            // Export it in the specified format.
            if (wordDocument != null)
                wordDocument.ExportAsFixedFormat(paramExportFilePath,
                    paramExportFormat, paramOpenAfterExport,
                    paramExportOptimizeFor, paramExportRange, paramStartPage,
                    paramEndPage, paramExportItem, paramIncludeDocProps,
                    paramKeepIRM, paramCreateBookmarks, paramDocStructureTags,
                    paramBitmapMissingFonts, paramUseISO19005_1,
                    ref paramMissing);
        }
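        catch (Exception ex) {
            // Respond to the error. MessageBox assumes an interactive desktop
            // app (System.Windows.Forms); log or rethrow in any other host.
            System.Windows.Forms.MessageBox.Show(ex.Message);
        }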
        finally {
            // Close and release the Document object.
            if (wordDocument != null) {
                wordDocument.Close(ref paramMissing, ref paramMissing,
                    ref paramMissing);
                wordDocument = null;
            }

            // Quit Word and release the ApplicationClass object.
            if (wordApplication != null) {
                wordApplication.Quit(ref paramMissing, ref paramMissing,
                    ref paramMissing);
                wordApplication = null;
            }

            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }
DJ Quimby
  • 1
    You forgot to mention that Office Interop should be avoided in a server application. – Darin Dimitrov Oct 05 '11 at 17:24
  • 2
    -1 for suggesting he implement many difficult-to-find bugs. Don't Ever Use Office Interop in a Server Application. – John Saunders Oct 05 '11 at 17:25
  • 1
    @JohnSaunders I didn't suggest he do or not do anything. I happened to have some code kicking around that did some of what he wanted to do so I posted it. You guys can tell him what not to do in your own comments. – DJ Quimby Oct 05 '11 at 17:27
  • Did he want to have many difficult-to-find bugs? I don't see where he asked for that. – John Saunders Oct 05 '11 at 17:28
  • @DJQuimby, posting code without taking into consideration the context could be dangerous. When I do code reviews I see many people employing code they found on the internet that is totally inappropriate to their specific scenario. – Darin Dimitrov Oct 05 '11 at 17:30
  • 2
    @DarinDimitrov Fair enough, I'm sure that's how I came across this bit a few years ago. Still, a negative comment like that could have easily been replaced with a better explanation that we all could learn from. We aren't all at the same experience level; it's important to remember that. – DJ Quimby Oct 05 '11 at 17:35
  • We would learn more from you deleting your answer, which would, by the way, remove the downvote. This code is very dangerous, because the next person to find this answer may _actually use it_. – John Saunders Oct 05 '11 at 17:59
  • 2
    I have to say I appreciate @DJ Quimby and the fact he posted something. I understand it may not be a very safe solution, but I myself just learned all the potential pitfalls. I am glad he offered up something that technically does what I asked. I am also glad that there is the rest of the community to educate and explain why this may not be a good solution. – Justin Rassier Oct 05 '11 at 19:23
  • 1
    I'd say it is more useful to leave this answer and the warnings. Otherwise someone might find this code elsewhere and not be warned about the dangers. Keep the answer as well as the arguments against using it and let people use their own judgement. – Ivy Oct 05 '11 at 20:07
  • In light of the other comments, I'll remove my downvote if the OP will post a prominent "**don't do this in a server**" comment at the start of the answer. – John Saunders Oct 05 '11 at 20:27
  • @JohnSaunders I read the article you posted at the top, you downvoted it for good reasons, reasons that until today I was unaware of. I will gladly put that comment on this post to better help future users, you don't need to upvote it. – DJ Quimby Oct 05 '11 at 20:33
  • It's fine to run on a server; you just want a background process that watches your data for documents that need to be converted. The problem with running directly from ASP pages is that two requests can try to invoke the COM object at the same time. – FlavorScape Mar 13 '13 at 19:50
2

Syncfusion Essential PDF can be used to convert Office documents to PDF. The library can be used from Windows Forms, WPF, ASP.NET Web Forms, and ASP.NET MVC applications.

The whole suite of controls is available for free (including for commercial applications) through the community license program if you qualify. The community license is the full product with no limitations or watermarks.

Note: I work for Syncfusion.

Davis Jebaraj