I have been searching for the last two days but have not found anything.

My requirement is to create a document viewer in my web application (C#/.NET), and I don't want to use any third-party tool for this. Can I convert the files to images, PDF, or any other common format that can easily be rendered on a web page? I also cannot use Interop objects.

Any help will be highly appreciated.

Mikhail
Anubrij Chandra
  • Refer to these links: http://stackoverflow.com/questions/6866110/how-to-convert-doc-to-jpg-in-net and http://stackoverflow.com/questions/680948/converting-a-multiple-page-pdf-to-a-single-image – Vinoth Raj Mar 06 '17 at 06:19
  • There is no support for this in the .NET Framework, which is why the simplest approach would be to use a third-party tool, for example [GemBox.Document](https://www.gemboxsoftware.com/document/examples/c-sharp-vb-net-word-pdf-library/801). If you don't want to do that, you'll have to build this yourself, and I must say it is a complex and time-consuming task. In short, you can use "System.IO.Packaging" to read your document (in DOCX format) and "System.Drawing" to create an image, but you'll need to implement a complete pagination and rendering engine yourself. – Mario Z Mar 07 '17 at 09:05
  • @MarioZ: Thanks for your explanation. I understand that it is quite complex to do. You gave me a short reference to start from. – Anubrij Chandra Mar 08 '17 at 09:58
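
As a rough illustration of the "System.IO.Packaging" suggestion in the comment above, here is a minimal sketch that opens a DOCX package and returns the raw XML of its main document part. The class and method names (`DocxPackageReader`, `ReadMainDocumentXml`) are hypothetical. It assumes a document saved in the common "transitional" Office Open XML flavour; strict-conformance files use a different relationship type URI, which this sketch does not handle. It only reads the XML and does no layout or rendering.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on .NET Framework; the System.IO.Packaging NuGet package on .NET Core/5+
using System.Linq;

public static class DocxPackageReader
{
    // Relationship type that links the package root to the main document part
    // (transitional Office Open XML; strict files use a different URI).
    private const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Returns the raw XML of the main document part (usually /word/document.xml).
    public static string ReadMainDocumentXml(Stream docxStream)
    {
        using (Package package = Package.Open(docxStream, FileMode.Open, FileAccess.Read))
        {
            PackageRelationship rel = package
                .GetRelationshipsByType(OfficeDocumentRelType)
                .FirstOrDefault();
            if (rel == null)
                throw new InvalidDataException("No main document relationship found.");

            // Resolve the relationship target against the package root.
            Uri partUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), rel.TargetUri);
            PackagePart part = package.GetPart(partUri);

            using (Stream partStream = part.GetStream(FileMode.Open, FileAccess.Read))
            using (var reader = new StreamReader(partStream))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Following the relationship rather than hard-coding `/word/document.xml` is a design choice: the part name is a convention, while the relationship is what the package format actually defines.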

4 Answers

You mention in one of your comments that you'd like to write all the code yourself but don't know where to start. Here's how I would go about it...

First, you'll need to familiarize yourself with the Microsoft Office format specification (Office Open XML). You can find that here (there's a link to the technical specification). Office documents are actually .zip archives containing XML files along with any binary data representing attachments. Just rename a .docx file to .zip and you'll be able to open it and see the XML and any other supporting files inside (the same is true for .xlsx, etc.).
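
To see the archive structure from code rather than by renaming the file, a minimal sketch along these lines lists the entries inside an Office file. The names `DocxInspector`, `ListParts`, and the file `report.docx` are hypothetical and only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression; // reference System.IO.Compression.dll on .NET Framework 4.5+
using System.Linq;

public static class DocxInspector
{
    // Lists the full names of all entries inside an Office Open XML file
    // (.docx, .xlsx, .pptx), which is an ordinary ZIP archive.
    public static List<string> ListParts(Stream officeFile)
    {
        using (var archive = new ZipArchive(officeFile, ZipArchiveMode.Read, leaveOpen: true))
        {
            return archive.Entries.Select(entry => entry.FullName).ToList();
        }
    }
}
```

Calling it as `using (var fs = File.OpenRead("report.docx")) { foreach (var name in DocxInspector.ListParts(fs)) Console.WriteLine(name); }` would typically show entries such as `[Content_Types].xml`, `_rels/.rels`, and `word/document.xml` for a Word document, although the exact set depends on the file.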

Then you'll need to become intimately familiar with either PDF or HTML, as your job now will be to convert the various Office document structures into PDF or HTML structures, being sure to respect page layout, margins, ordering, etc.
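
To give a feel for the very first step of such a conversion, here is a minimal sketch that takes the XML of a Word document's main part (typically `word/document.xml`) and emits one HTML `<p>` element per paragraph. The names `DocxToHtmlSketch` and `ParagraphsToHtml` are hypothetical. It deliberately ignores fonts, styles, tables, images, margins, and pagination, which is where most of the real work lies, and paragraphs nested inside text boxes would be emitted more than once.

```csharp
using System.Linq;
using System.Net;
using System.Text;
using System.Xml.Linq;

public static class DocxToHtmlSketch
{
    // WordprocessingML namespace used by elements such as w:p (paragraph) and w:t (text run content).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Converts each w:p element into a plain <p> element containing its concatenated text.
    public static string ParagraphsToHtml(string documentXml)
    {
        XDocument doc = XDocument.Parse(documentXml);
        var html = new StringBuilder();

        foreach (XElement paragraph in doc.Descendants(W + "p"))
        {
            // A paragraph's text is split across one or more w:t elements.
            string text = string.Concat(paragraph.Descendants(W + "t").Select(t => t.Value));
            html.Append("<p>").Append(WebUtility.HtmlEncode(text)).Append("</p>\n");
        }

        return html.ToString();
    }
}
```

Everything beyond this (mapping run properties to CSS, resolving styles, laying out tables and images) has to be built on top, which is why the task grows so quickly.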

As others have said, this is a large task, which is why third-party tools exist today. Also, each third-party toolset has its limitations, as this is really hard to "get right" in all situations, and there will be edge cases that work for one document and not another (because maybe the author didn't use Microsoft Word to save the .docx; maybe they used OpenOffice, and OpenOffice interpreted the standard slightly differently).

Corith Malin

If you cannot use COM/Interop technologies in your solution, you can take a look at the specialized 3rd party options. I see that you prefer not to use them, however, there are no existing built-in solutions in the .NET Framework. Check out my answer in a similar thread that describes how to accomplish exactly the same task using 3rd party libraries (for example, DevExpress, since I have experience with it). In addition, take a look at the Documents demo, where you can see how to create images/thumbnails from different types of MS Office documents.

Mikhail

I believe what you need is an intermediate representation of the documents which can be converted into an image for the viewer to display.

Let me try to explain with the diagram below:

[Diagram: image not available]

Akhilesh
  • Yes, you understand my need very well, and you did a good job preparing the flow diagram. But the main problem is that I don't know how to convert the document to an image without any third-party tool. I want to write all the code that a third-party tool has myself, but I don't know how to start. – Anubrij Chandra Apr 01 '17 at 19:14
  • @AnubrijChandra, have you considered finding an open-source library and figuring out how it does the job by reading the code? Just a suggestion, hope it helps. – Muhammad Reza Irvanda Apr 02 '17 at 18:06
  • Well, doing everything yourself might take a lot of time, and you may even end up reinventing the wheel. But that's for you to decide. – Akhilesh Apr 04 '17 at 08:40
  • Sorry, I hit the return key before completing my comment. In order to create images, you will have to rely on the Graphics object available in the System.Drawing namespace. For example: `Image img = new Bitmap(100, 200); Graphics g = Graphics.FromImage(img); g.DrawString("Sample Text", new Font("Consolas", 10), Brushes.Black, 10, 10); img.Save("out", ImageFormat.Jpeg); // Instead of specifying the file name, you can use the response stream and write the image` – Akhilesh Apr 04 '17 at 08:54
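
The snippet in the last comment can be expanded into a small self-contained sketch that writes the image to a stream instead of a file, as that comment suggests. The names `TextImageRenderer` and `RenderTextAsJpeg` are hypothetical. It assumes System.Drawing is available, which is the case on the .NET Framework; on newer .NET versions it needs the System.Drawing.Common package and is supported on Windows only. It draws a single line of text and does no wrapping, pagination, or document layout.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

public static class TextImageRenderer
{
    // Draws one line of text onto a white bitmap and writes it to the given stream as a JPEG.
    public static void RenderTextAsJpeg(string text, Stream output, int width = 400, int height = 100)
    {
        using (var bitmap = new Bitmap(width, height))
        using (Graphics g = Graphics.FromImage(bitmap))
        using (var font = new Font("Consolas", 10))
        {
            g.Clear(Color.White); // a new Bitmap has no background, so fill it before drawing
            g.DrawString(text, font, Brushes.Black, 10f, 10f);
            bitmap.Save(output, ImageFormat.Jpeg);
        }
    }
}
```

In a web application the `output` argument could be the response's output stream, with the content type set to `image/jpeg`; rendering into a `MemoryStream` first and then copying it to the response is a safer default, since some image encoders require a seekable stream.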

You can use tools like smallpdf or OfficeToPDF to do that. Just integrate them into your application.

Smallpdf (https://smallpdf.com/library-detail)

OfficeToPDF (https://officetopdf.codeplex.com/)

Shivaji Varma