8

Is there an open source .NET library that converts a Word document to HTML so it can be displayed inside a web page?

I know of several tools that convert Word documents to HTML files, but my requirement is to convert the document (either from the file or from just the extracted text) to HTML on the fly in an ASP.NET application.

I found a PHP library that does the same thing (converting-a-word-document-into-usable-html-in-php). Is there a similar tool for .NET?

RameshVel
  • 1
    Why don't you convert to a file and then read the HTML file? – Eduardo Molteni Oct 19 '10 at 18:34
  • 1
    Yes, that's the last resort if there is no other way. Currently we store the doc as a blob in the DB, so it would be more convenient to convert it to an HTML string than to write the blob to the file system as a .doc, start Word interop to save it as HTML, and then read that back from the app... – RameshVel Oct 20 '10 at 04:38

2 Answers

2

Do you just want to convert a *.doc file to HTML? Is saving it as an HTML file an option?

There is the standard .SaveAs method which has the option to save as HTML:

wdFormatHTML Saves all text and formatting with HTML tags so that the resulting document can be viewed in a Web browser.

from: MSDN SaveAs Method

An example tutorial on using this method to convert a .doc to a different format can be found here: How to convert DOC into other formats using C#.
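A minimal sketch of the SaveAs approach follows. It assumes C# 4 or later (so the optional COM arguments can be omitted), a project reference to Microsoft.Office.Interop.Word, and Word installed on the machine that runs the code. The class name, method name, and both path parameters are hypothetical. Note that Microsoft does not recommend or support automating Office from server-side code such as ASP.NET, so treat this as an illustration of the API rather than a production design.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

public static class WordInteropHtmlSketch
{
    // docPath and htmlPath are hypothetical absolute paths chosen by the caller.
    public static void SaveAsHtml(string docPath, string htmlPath)
    {
        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            // Open the source document read-only; other arguments keep their defaults.
            doc = app.Documents.Open(FileName: docPath, ReadOnly: true);

            // wdFormatHTML writes the text and formatting as HTML.
            doc.SaveAs(FileName: htmlPath, FileFormat: Word.WdSaveFormat.wdFormatHTML);
        }
        finally
        {
            // The casts avoid the method/event name ambiguity on Close and Quit.
            if (doc != null)
            {
                ((Word._Document)doc).Close(SaveChanges: false);
            }
            ((Word._Application)app).Quit();
        }
    }
}
```

For a document stored as a blob, the bytes would first have to be written to a temporary file, because Word opens documents by path.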

If you have *.docx files instead of *.doc files, it is even easier because you can use the Open XML API, as explained on MSDN here: Manipulating Word 2007 Files with the Open XML Format API (Part 1 of 3). Once you have the XML of the Word file, you can transform it into whatever output format (such as HTML) you want.
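To illustrate the "get the XML and output HTML" idea without Word installed, here is a rough sketch that does not use the Open XML SDK at all. It relies only on the fact that a .docx is a ZIP package whose main body lives in word/document.xml, and reads it with System.IO.Compression (.NET 4.5 or later) and LINQ to XML. The class and method names are hypothetical, and the output is deliberately crude: one <p> per Word paragraph, with all formatting, tables, lists, and images ignored.

```csharp
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Net;
using System.Text;
using System.Xml.Linq;

public static class DocxToHtmlSketch
{
    // WordprocessingML main namespace used by word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Takes the raw .docx bytes (for example, a blob read from the database)
    // and returns one HTML <p> element per Word paragraph.
    public static string Convert(byte[] docxBytes)
    {
        using (var stream = new MemoryStream(docxBytes))
        using (var zip = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; not a .docx package.");
            }

            XDocument doc;
            using (Stream entryStream = entry.Open())
            {
                doc = XDocument.Load(entryStream);
            }

            var html = new StringBuilder();
            foreach (XElement paragraph in doc.Descendants(W + "p"))
            {
                // Join the text runs (w:t) of this paragraph and HTML-encode them.
                string text = string.Concat(paragraph.Descendants(W + "t").Select(t => t.Value));
                html.Append("<p>").Append(WebUtility.HtmlEncode(text)).Append("</p>\n");
            }
            return html.ToString();
        }
    }
}
```

Because it works on a byte array, this fits the case where the document is stored as a blob and no temporary file is wanted. It only handles .docx; the binary .doc format needs a different route.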

Dennis G
0

Convert your .doc files to PDF with the help of JODConverter and OpenOffice (see How to convert ppt to images in Ruby? for reference), and then use pdftohtml (http://pdftohtml.sourceforge.net), a utility that converts PDF files into HTML.

You will get amazing results.
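As a rough sketch, the second step could be driven from .NET by launching pdftohtml as an external process. This assumes the pdftohtml executable is installed and on the PATH, and that the PDF has already been produced by the JODConverter/OpenOffice step, which is not shown here. The class name, method name, and both path parameters are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class PdfToHtmlSketch
{
    // pdfPath: an existing PDF file; htmlBasePath: output name passed to pdftohtml.
    public static void Convert(string pdfPath, string htmlBasePath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "pdftohtml", // assumed to be resolvable via PATH
            Arguments = "\"" + pdfPath + "\" \"" + htmlBasePath + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "pdftohtml exited with code " + process.ExitCode);
            }
        }
    }
}
```

Starting an external process for every request is expensive, so in a web application the generated HTML would usually be cached.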

Satish