0

How can I read a .doc or .docx file in .NET using C#?

John Saunders
haresh chande

5 Answers

3

I see you used the asp.net tag. You should not use the automation API (COM Interop) to run Microsoft Office products from ASP.NET or any other server application. The Office products are made to be run from the desktop - with a user interface. They don't work properly in a server scenario, and additionally, there are licensing issues.

Use Aspose.Words for .NET or some other such technology instead. They are designed to be used in a server environment.
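As one illustration of reading a document on a server without automating Word, here is a minimal sketch that pulls the plain text out of a .docx file. It is not Aspose.Words and not taken from the answer above: it relies only on the fact that a .docx file is a ZIP package, and it assumes the main document part sits at the conventional path word/document.xml. The class name DocxTextReader is hypothetical. It needs .NET 4.5 or later for ZipArchive, and it does not handle the binary .doc format.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: extracts plain text from a .docx package without Office installed.
public static class DocxTextReader
{
    // WordprocessingML namespace used by the elements in word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of every paragraph (w:p) in the main document part, one per line.
    public static string ReadText(Stream docxStream)
    {
        using (var archive = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the main part is stored at the conventional location.
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found; not a .docx package?");

            using (Stream part = entry.Open())
            {
                XDocument xml = XDocument.Load(part, LoadOptions.PreserveWhitespace);

                // Concatenate the text runs (w:t) inside each paragraph.
                var paragraphs = xml.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));

                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }

    // Convenience overload for a file on disk.
    public static string ReadText(string path)
    {
        using (FileStream file = File.OpenRead(path))
        {
            return ReadText(file);
        }
    }
}
```

Calling DocxTextReader.ReadText with the path of a .docx file returns its body text. Headers, footers, footnotes, tabs and line breaks are ignored, and text inside text boxes may appear twice, so a full library is still the better choice for anything beyond simple text extraction.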

John Saunders
  • So... there is no library from Microsoft designed for a server environment? – judehall Apr 07 '12 at 20:11
  • 1
    What do you mean "libraries"? Most of the .NET Framework is designed for use in either a client or server environment. But the Microsoft Office Automation APIs are designed for **automating Microsoft Office**, which is a set of ***desktop*** applications. – John Saunders Apr 08 '12 at 00:21
  • Yes ok I understand better now :) – judehall Apr 14 '12 at 12:44
1

Aspose.Words for .NET is a commercial library that allows you to do exactly this. From the website:

Using Aspose.Words for .NET, developers can easily open and save DOC, OOXML, RTF, WordprocessingML, HTML, MHTML, TXT and OpenDocument documents.

Jørn Schou-Rode
1

Generally, COM interop is used to work with Office documents.

Here's an MSDN example that creates an Excel file. It should give you the general idea.

http://msdn.microsoft.com/en-us/library/ms173186(VS.80).aspx

Also, Visual Studio 2010 and .NET 4.0 will include more dynamic language features that lend themselves to Office COM interop. Read more here:

http://blogs.msdn.com/samng/archive/2009/06/16/com-interop-in-c-4-0.aspx

And here's a video:

http://msdn.microsoft.com/en-us/vcsharp/ee460939.aspx

TJB
  • -1: you cannot safely use the automation API from an ASP.NET or other server-based program. – John Saunders Oct 23 '11 at 20:20
  • ah, hadn't noticed the asp.net tag, good catch. I suppose I had also confused the office API with the automation aspect, but now it makes more sense. – TJB Oct 24 '11 at 20:43
1

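You can use the RichTextBox control to read .rtf files with the RichTextBox.LoadFile method. Note that LoadFile handles only RTF and plain text, so it will read a .doc file only if that file actually contains RTF rather than the binary Word format.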

Ali Tarhini
0

Microsoft provides a free set of interop assemblies for automating the Office applications from .NET. The download location depends on which version of Office you are using, but a search for "Microsoft Office Primary Interop Assemblies" will turn up the MSDN links for the various versions, such as the one for Office 2007.

The following snippet shows how to open a Word document (.doc or .docx) using these interop assemblies:

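using Microsoft.Office.Interop.Word;

// Starts an instance of Word; Word must be installed on the machine.
_Application WordApp = new Microsoft.Office.Interop.Word.Application();

object WordFile = "C:\\SomeDoc.doc";
object RdOnly = false;   // pass true to open the document read-only
object Visible = true;   // show the document window
object Missing = System.Reflection.Missing.Value;   // placeholder for optional arguments

// Open takes 16 by-ref parameters; ReadOnly is the 3rd and Visible is the 12th.
Document Doc = WordApp.Documents.Open(ref WordFile, ref Missing, ref RdOnly, ref Missing, ref Missing,
                                      ref Missing, ref Missing, ref Missing, ref Missing, ref Missing,
                                      ref Missing, ref Visible, ref Missing, ref Missing, ref Missing,
                                      ref Missing);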

From there you can use Doc to access various parts of the document.
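As a rough illustration of that last step, the sketch below opens a document, reads its full text through the Content range, and then shuts Word down again. It assumes C# 4 or later (so the ref arguments can be replaced by named arguments), a reference to the Word interop assembly, and Word installed on a desktop machine. The class name WordInteropReader is hypothetical. As noted elsewhere on this page, this approach is not suitable for ASP.NET or other server code.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper: reads the body text of a Word document via Office automation.
public static class WordInteropReader
{
    public static string ReadAllText(string path)
    {
        // Typed as _Application/_Document so that Quit and Close resolve to the
        // methods rather than the identically named events.
        Word._Application app = new Word.Application();
        Word._Document doc = null;
        try
        {
            // Open read-only and without showing a window.
            doc = app.Documents.Open(FileName: path, ReadOnly: true, Visible: false);

            // Content is a Range covering the main document story;
            // paragraphs in the returned text are separated by '\r'.
            return doc.Content.Text;
        }
        finally
        {
            // Always close the document and quit Word, otherwise a WINWORD.EXE
            // process is left running in the background.
            if (doc != null)
            {
                doc.Close(SaveChanges: false);
            }
            app.Quit();
            Marshal.ReleaseComObject(app);
        }
    }
}
```

A call such as WordInteropReader.ReadAllText("C:\\SomeDoc.doc") would return the document text as a single string. Other members of the document object, such as Paragraphs and Tables, give finer-grained access.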

lee-m