How do I read a .doc or .docx file in .NET with C#?

http://stackoverflow.com/questions/215620/how-to-load-ms-word-document-in-c-net – Jørn Schou-Rode Jan 31 '10 at 17:25
5 Answers
I see you used the asp.net tag. You should not use the automation API (COM Interop) to run Microsoft Office products from ASP.NET or any other server application. The Office products are made to be run from the desktop - with a user interface. They don't work properly in a server scenario, and additionally, there are licensing issues.
Use Aspose.Words for .NET or some other such technology instead. They are designed to be used in a server environment.
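As one illustration of "some other such technology" for the .docx case only: a .docx file is a ZIP package whose main text normally lives in word/document.xml, so it can be read without Office installed. The following is a minimal sketch, not a full parser. The class name DocxText and the method ReadText are made up for this example, it assumes .NET 4.5 or later (for System.IO.Compression.ZipArchive), and it ignores tables, tabs, line breaks, headers and footers. It does not handle the older binary .doc format at all.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of any Microsoft or Aspose API.
public static class DocxText
{
    // Main WordprocessingML namespace used by word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the plain text of a .docx stream, one line per paragraph.
    public static string ReadText(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the main part is at the conventional location.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found; not a .docx package?");

            using (Stream xml = entry.Open())
            {
                XDocument doc = XDocument.Load(xml);
                // Each <w:p> is a paragraph; its visible text sits in <w:t> elements.
                var paragraphs = doc.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));
                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }
}
```

Usage would look something like `using (var fs = File.OpenRead(path)) { string text = DocxText.ReadText(fs); }`, where path points at a .docx file. For anything beyond plain text extraction, a library built for the purpose is the safer choice.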

So... there is no library from Microsoft designed for server environment? – judehall Apr 07 '12 at 20:11
What do you mean by "libraries"? Most of the .NET Framework is designed for use in either a client or server environment. But the Microsoft Office Automation APIs are designed for **automating Microsoft Office**, which is a set of ***desktop*** applications. – John Saunders Apr 08 '12 at 00:21
Aspose.Words for .NET is a commercial library that allows you to do exactly this. From the website:
Using Aspose.Words for .NET, developers can easily open and save DOC, OOXML, RTF, WordprocessingML, HTML, MHTML, TXT and OpenDocument documents.

Generally, COM interop is used to interface with Office documents.
Here's an MSDN example that creates an Excel file; it should give you the idea.
http://msdn.microsoft.com/en-us/library/ms173186(VS.80).aspx
Also, Visual Studio 2010 with .NET 4.0 will include more dynamic language features that lend themselves to Office COM interop. Read more here:
http://blogs.msdn.com/samng/archive/2009/06/16/com-interop-in-c-4-0.aspx
And here's a video

-1: you cannot safely use the automation API from an ASP.NET or other server-based program. – John Saunders Oct 23 '11 at 20:20
ah, hadn't noticed the asp.net tag, good catch. I suppose I had also confused the office API with the automation aspect, but now it makes more sense. – TJB Oct 24 '11 at 20:43
You can use the RichTextBox control to read .rtf files with the RichTextBox.LoadFile method. LoadFile handles RTF and plain text, so a .doc file will only load if its content is actually RTF.

Microsoft provides a free set of interop assemblies for working with the Office applications from .NET. The download location depends on which version of Office you are using, but a search for "Microsoft Office Primary Interop Assemblies" will turn up the MSDN links for the various versions, such as this one for Office 2007.
The following snippet shows how to open a Word document (doc or docx) using these assemblies:
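    using Microsoft.Office.Interop.Word;

    // Starts a Word instance; Word must be installed on this machine.
    _Application WordApp = new Microsoft.Office.Interop.Word.Application();
    object WordFile = "C:\\SomeDoc.doc";
    object RdOnly = false;   // false opens the file for editing; use true to only read it
    object Visible = true;   // show the document window
    object Missing = System.Reflection.Missing.Value;   // stands in for optional arguments
    Document Doc = WordApp.Documents.Open(ref WordFile, ref Missing, ref RdOnly, ref Missing, ref Missing,
        ref Missing, ref Missing, ref Missing, ref Missing, ref Missing,
        ref Missing, ref Visible, ref Missing, ref Missing, ref Missing,
        ref Missing);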
From there you can use Doc to access various parts of the document.
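For example, pulling out the document's text and then shutting Word down again might look roughly like the sketch below. The class name WordInteropReader and the method ReadText are invented for this example. It assumes a desktop machine with Word installed and a project reference to Microsoft.Office.Interop.Word, and it opens the file read-only since the goal is just to read it.

```csharp
using System;
using Microsoft.Office.Interop.Word;

// Hypothetical helper for illustration; desktop use only, requires Word to be installed.
public static class WordInteropReader
{
    // Opens the file read-only in a new Word instance and returns its text.
    public static string ReadText(string path)
    {
        object missing = System.Reflection.Missing.Value;
        object file = path;
        object readOnly = true;
        object dontSave = WdSaveOptions.wdDoNotSaveChanges;

        _Application word = new Microsoft.Office.Interop.Word.Application();
        try
        {
            _Document doc = word.Documents.Open(ref file, ref missing, ref readOnly, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing);

            // Content is the main document story; Text returns it as one string.
            string text = doc.Content.Text;
            doc.Close(ref dontSave, ref missing, ref missing);
            return text;
        }
        finally
        {
            // Without Quit, the WINWORD.EXE process keeps running in the background.
            word.Quit(ref dontSave, ref missing, ref missing);
        }
    }
}
```

Declaring the variables as _Application and _Document (rather than Application and Document) avoids the compiler's ambiguity warning between the Quit/Close methods and the events of the same name.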

Oh god! There is nothing I hate more than "System.Reflection.Missing.Value"!! – Hannoun Yassir Jan 30 '10 at 10:51
-1: you cannot safely use the automation API from an ASP.NET or other server-based program. – John Saunders Oct 23 '11 at 20:20