
I have a project that requires creating a Word (.doc) file with certain formatting and data fetched from my database. I want to send that file to the user, who will edit it and then upload it back to the server. After that, I want to perform the following conversions on the uploaded file:

  • .doc to .pdf (Intended to be downloaded and viewed on web)
  • .doc to .html (Intended for free text search on web)

I want to achieve this without opening the OpenOffice port. The earlier version did this, but the open port tended to crash as the number of users grew, so I want to avoid it. OpenOffice and the OS were both reinstalled on other machines and tried in different ways, but the OO port crashed every time the number of users increased.

Is there any other way to achieve this conversion? Continuing with the current approach is not possible because of the crashes.

This is the host machine:

  • Tomcat server on Linux (RedHat 64 bit)
  • The application is developed in Java (JSP and Servlets)
  • The backend is Oracle

All users have:

  • A Linux machine, mostly 32bit Fedora or Suse

Any help is appreciated.

BalusC
Sangeet Menon
  • When all users have a Linux machine, why would you create a Windows file??? – eckes Feb 11 '11 at 12:37
  • The previous versions of the program used to output .sxw files, but users might carry those files to some other machine to edit them, which might be a Windows box, and the .sxw file won't open there. Where the file gets used is not in our hands: users are provided a Linux machine, but it's up to them whether to use Windows. That is the reason I turned to a universal file which opens in both .sxw and .doc – Sangeet Menon Feb 11 '11 at 14:23
  • I meant one could open the file on both Linux and Windows. – Sangeet Menon Feb 11 '11 at 14:31

1 Answer


You could use http://poi.apache.org/ for handling the actual .doc files. For PDF, there are a number of PDF libraries available as well. The catch is that many are not free, but here is a list of open source PDF libraries: http://java-source.net/open-source/pdf-libraries

Here is a discussion on Word to HTML: "Convert Word doc to HTML programmatically in Java".

dmcnelis
  • Hi dmcnelis, the problem with poi.apache is that it cannot create a file from scratch. That is, we need an empty doc file on the server which is used every time to create a doc file. That brings in the problem of a single file being accessed by multiple requests to generate doc files, which is again a problem. – Sangeet Menon Feb 11 '11 at 14:32
  • Why not make a copy of the file when you need to create a new doc, and then remove it? – dmcnelis Feb 11 '11 at 15:48
  • To do that I also first need to access the file, which would create an issue when many users access it concurrently. – Sangeet Menon Feb 11 '11 at 18:06
  • What about also using http://code.google.com/p/java2word/ to create the document on the fly, then save it with a UUID, then open that file with POI? – dmcnelis Feb 11 '11 at 20:54
  • java2word generates XML content, which can only be viewed when opened in MS Word. But the server I am using is Linux Fedora, so that API is of no use: when the file is opened in OpenOffice it shows the XML contents with their tags. It works fine when the user and the host servers are on Microsoft. – Sangeet Menon Feb 15 '11 at 05:24
  • Sorry, I didn't realize that about Java2Word. – dmcnelis Feb 16 '11 at 14:54
  • Does no one have an idea HOW TO ACHIEVE this conversion without opening the port? I think for the time being I need to move ahead with the normal OO port method. – Sangeet Menon Mar 04 '11 at 05:24
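
The copy-per-request idea raised in the comments (copy the empty template, give the copy a UUID-based name, then work only on the copy) could look roughly like the sketch below. It uses only the standard `java.nio.file` API; the class name `TemplateCopier`, the method `copyForRequest`, and the directory layout are hypothetical, and the sketch assumes the shared template is only ever read, never written, while the server is running. Under that assumption each request touches its own file and the template itself is not modified.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper: gives every request its own private copy of the
// shared empty .doc template, so no two requests ever write the same file.
final class TemplateCopier {

    private TemplateCopier() {
    }

    /**
     * Copies the read-only template into workDir under a UUID-based name
     * and returns the path of the new copy.
     */
    static Path copyForRequest(Path template, Path workDir) throws IOException {
        // Create the working directory if it does not exist yet.
        Files.createDirectories(workDir);
        // A random UUID makes a name clash between requests practically impossible.
        Path target = workDir.resolve(UUID.randomUUID() + ".doc");
        // Without REPLACE_EXISTING, Files.copy fails instead of overwriting.
        return Files.copy(template, target);
    }
}
```

A servlet would call `TemplateCopier.copyForRequest(...)` once per request, open the returned path with whatever library fills in the document, and delete the copy after the response has been sent.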