
I know there are a lot of other questions on SO about this topic, but I need some more information. My requirement is to dynamically generate an MS Word document from HTML and prompt the user to download it, and I have a two-part question about it.

Q1) From what I'm reading it seems that Microsoft.Office.Interop is not designed to be used for server automation since this is just a wrapper around the application and would require Office to be installed on the web server. Is this correct?

I have gotten some of this to work: I get prompted to download and the Word doc saves properly, but the doc shows my markup as its content rather than the rendered HTML. From what I've read, it's supposedly possible to export HTML to MS Word like this without third-party tools or components. I'd also like to avoid the Open XML format, as I can't guarantee which version of Word my users have.

Q2) What am I missing here to get my HTML to appear rendered in the MS Word output file? doc.DocumentBody is a string type that contains the entire HTML document.

    // Requires: using System.IO; using System.Text; using System.Web.Mvc;
    public FileStreamResult DownloadDocument(string id)
    {
        /* pseudo-code here to fetch my custom "Document" object from DB */
        Document doc = DocumentService.FindById(id);

        // Serve the stored HTML under a .doc name with Word's MIME type.
        var fileName = string.Format("{0}.doc", doc.Title);
        Response.AddHeader("Content-Disposition", "inline;filename=" + fileName);
        return new FileStreamResult(WordStream(doc.DocumentBody), "application/msword");
    }

    // Copies the HTML string into an in-memory stream, rewound for reading.
    private static Stream WordStream(string body)
    {
        var ms = new MemoryStream();

        // ASCII encoding: any non-ASCII character in the body is written as '?'.
        byte[] byteInfo = Encoding.ASCII.GetBytes(body);
        ms.Write(byteInfo, 0, byteInfo.Length);
        ms.Position = 0;

        return ms;
    }
Alexei Levenkov
MaseBase

3 Answers


I have used essentially the same code as you to download HTML as Word documents, and it works fine. I modified my code to match yours as a test, and it still worked, so I wonder if the issue is actually with your HTML.

Have a look at doc.DocumentBody in your debugger and see if it is valid HTML.

Is it wrapped in <html><body></body></html>?

I ran a test: if you leave out the body tags, you'll likely end up seeing raw HTML.
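One way to guard against a missing wrapper is to check the string before streaming it. The following is a minimal sketch, not the code from my project: `WordHtml` and `EnsureHtmlDocument` are hypothetical names, and it assumes that a body with neither an `<html>` nor a `<body>` tag is a bare fragment that is safe to wrap.

```csharp
using System;

public static class WordHtml
{
    // Wraps a bare HTML fragment in <html><body>...</body></html>.
    // Input that already contains an <html> or <body> tag is returned
    // unchanged; this sketch does not try to repair partial documents.
    public static string EnsureHtmlDocument(string body)
    {
        if (body == null) throw new ArgumentNullException(nameof(body));

        bool hasHtml = body.IndexOf("<html", StringComparison.OrdinalIgnoreCase) >= 0;
        bool hasBody = body.IndexOf("<body", StringComparison.OrdinalIgnoreCase) >= 0;
        if (hasHtml || hasBody)
        {
            return body;
        }

        return "<html><body>" + body + "</body></html>";
    }
}
```

In the question's action this would be called as `WordStream(WordHtml.EnsureHtmlDocument(doc.DocumentBody))`.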

StanK
  1. Yes, and running Office applications on a server without a UI is not supported. (Note: "not supported" does not mean it will not work, only that no guarantees of any kind are made.)

  2. Use the File method to return the file: http://msdn.microsoft.com/en-us/library/dd505200.aspx. Also check out this popular answer: How can I present a file for download from an MVC controller?
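The header in the question asks for `inline` display, while a download prompt is normally requested with `attachment`. As a rough sketch, the header value can be built with `System.Net.Mime.ContentDisposition` instead of string concatenation; `DownloadHeader` and `ForAttachment` are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Net.Mime;

public static class DownloadHeader
{
    // Builds a Content-Disposition header value that asks the browser to
    // save the response under the given file name rather than show it inline.
    public static string ForAttachment(string fileName)
    {
        var disposition = new ContentDisposition
        {
            DispositionType = DispositionTypeNames.Attachment,
            FileName = fileName
        };
        return disposition.ToString();
    }
}
```

It could then be used as `Response.AddHeader("Content-Disposition", DownloadHeader.ForAttachment(fileName));`. Alternatively, the `File(stream, "application/msword", fileName)` overload on the controller sets an attachment header itself when a download name is supplied, so the manual header is not needed in that case.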

Community
Alexei Levenkov
  • It shows that returning `File` from the controller returns the FileStreamResult that I'm already using. The problem is NOT that the file doesn't download; it does. The problem is that the content is the actual HTML markup rather than a rendered version of the HTML. – MaseBase Feb 28 '12 at 01:47
  • Your code for sending the file just feels wrong: there should be no conversion of a string to a byte array when you send down the whole document as-is (though it could also be correct code...). I assumed that "doc" is a Word document that you are trying to send down, but after re-reading it looks like you don't have a Word document yet, just HTML that you pretend to stream down as a Word document. Try specifying the "nosniff" header ( http://stackoverflow.com/search?q=nosniff ), which may force your HTML to be opened by Word. – Alexei Levenkov Feb 28 '12 at 02:26

Microsoft.Office.Interop is not designed to be used for server automation since this is just a wrapper around the application and would require Office to be installed on the web server. Is this correct?

Yes.

What am I missing here to get my HTML to appear rendered in the MS Word output file?

Well, you need to create a Word document, of course! Word's file format and the HTML file format are different.

There are some very good commercial libraries out there that provide a nice API for generating Office documents programmatically. With Office XML, this is not quite as necessary - it's now much more feasible to generate the XML that Word knows how to read.
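To give a feel for how little XML a basic document needs, here is a minimal sketch that writes plain paragraphs into a .docx package using only `System.IO.Compression` and `System.Xml.Linq`. It assumes .NET 4.5 or later, the names `MinimalDocx` and `Create` are hypothetical, and it does not convert HTML: each input string simply becomes one unstyled paragraph.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class MinimalDocx
{
    const string XmlDecl = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>";

    // Declares the content types of the parts in the package.
    const string ContentTypes = XmlDecl +
        "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">" +
        "<Default Extension=\"rels\" ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>" +
        "<Default Extension=\"xml\" ContentType=\"application/xml\"/>" +
        "<Override PartName=\"/word/document.xml\" ContentType=\"application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml\"/>" +
        "</Types>";

    // Points the package at its main document part.
    const string RootRels = XmlDecl +
        "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">" +
        "<Relationship Id=\"rId1\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument\" Target=\"word/document.xml\"/>" +
        "</Relationships>";

    // Returns the bytes of a .docx file with one paragraph per input string.
    public static byte[] Create(params string[] paragraphs)
    {
        XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

        // XElement escapes the text, so characters like '<' and '&' are safe.
        var document = new XElement(w + "document",
            new XAttribute(XNamespace.Xmlns + "w", w),
            new XElement(w + "body",
                paragraphs.Select(text =>
                    new XElement(w + "p",
                        new XElement(w + "r",
                            new XElement(w + "t",
                                new XAttribute(XNamespace.Xml + "space", "preserve"),
                                text))))));

        using (var ms = new MemoryStream())
        {
            using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
            {
                AddEntry(zip, "[Content_Types].xml", ContentTypes);
                AddEntry(zip, "_rels/.rels", RootRels);
                AddEntry(zip, "word/document.xml",
                    XmlDecl + document.ToString(SaveOptions.DisableFormatting));
            }
            return ms.ToArray();
        }
    }

    // Writes one UTF-8 text entry (without a byte-order mark) into the archive.
    static void AddEntry(ZipArchive zip, string name, string content)
    {
        var entry = zip.CreateEntry(name);
        using (var writer = new StreamWriter(entry.Open(), new UTF8Encoding(false)))
        {
            writer.Write(content);
        }
    }
}
```

The result would be served with a .docx file name rather than .doc. Anything beyond plain paragraphs (styles, tables, images) needs more parts and markup, which is where a template saved from Word or a library becomes attractive.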

Rex M
  • Yes, as I mentioned, I was hoping to avoid going the XML route. I also hoped to avoid paying through the nose for commercial licenses; Aspose.Words looks quite incredible, just very expensive. – MaseBase Feb 28 '12 at 01:43
  • I've also seen several examples where users are saying that it IS possible to get HTML into Word like this, but no luck for me yet. – MaseBase Feb 28 '12 at 01:50
  • @MaseBase It might not be that bad. Do you know for sure it will be painful? Try pasting one of these HTML strings into Word, saving it, unzipping the docx and examining the source. You may find for 80% of cases there's a simple template you can follow. And yes, Aspose is very powerful and pleasant to use. I have used it on past projects to generate Excel spreadsheets complete with graphs and charts. – Rex M Feb 28 '12 at 02:34