I have a project which requires conversion of MHT documents to PDF format. The documents are large-size drawings (C, D, E). The documents are manually loaded into my web application (Apache/Tomcat on Unix AIX), and the requirement is to convert the MHT file on the fly to produce a more portable file.
I broke the project down into two steps: 1) MHT to HTML extraction (with images) 2) HTML to PDF conversion.
For step 1, thanks to this link, How to read or parse MHTML (.mht) files in java , I was able to come up with a Java solution to extract and create an HTML file, and it is working well. I had to enhance the code a little to work with my environment.
For step 2, things have been a little more difficult. I started by looking into the HTMLDOC software http://www.msweet.org/projects.php?Z1 ; after spending a few days building the code, I found out it only handles letter and legal size documents. I then started looking at wkhtmltopdf http://wkhtmltopdf.org/ , but building it is becoming a task of its own. Overall, AIX Unix is not the friendliest environment to build applications in, and most options run on other OSs. I'm using the xlc compiler whenever possible. I'd like to have a Java solution, but any solution I can just execute would be fine.
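To make the "just execute" option concrete, here is a rough sketch of how step 2 might be driven from Java if a working wkhtmltopdf binary were available on the server. It is only an illustration under several assumptions: that C, D and E mean the ANSI sheet sizes (17x22, 22x34 and 34x44 inches; ARCH sizes would need different numbers), that the binary is on the PATH under the name `wkhtmltopdf`, and that Java 7 or later is in use. The class and method names (`PdfCommandBuilder`, `buildCommand`, `convert`) are made up for this example; `--page-width` and `--page-height` are the wkhtmltopdf options for custom page sizes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PdfCommandBuilder {

    // Sheet sizes in millimetres as {width, height}.
    // ASSUMPTION: C/D/E are ANSI sizes (17x22, 22x34, 34x44 inches).
    private static final Map<String, double[]> SHEETS =
            new LinkedHashMap<String, double[]>();
    static {
        SHEETS.put("C", new double[] {431.8, 558.8});
        SHEETS.put("D", new double[] {558.8, 863.6});
        SHEETS.put("E", new double[] {863.6, 1117.6});
    }

    // Builds the command line only; nothing is executed here.
    public static List<String> buildCommand(String sheet, String htmlPath, String pdfPath) {
        double[] size = SHEETS.get(sheet);
        if (size == null) {
            throw new IllegalArgumentException("Unknown sheet size: " + sheet);
        }
        List<String> cmd = new ArrayList<String>();
        cmd.add("wkhtmltopdf");            // ASSUMPTION: binary is on the PATH
        cmd.add("--page-width");
        cmd.add(size[0] + "mm");
        cmd.add("--page-height");
        cmd.add(size[1] + "mm");
        cmd.add(htmlPath);
        cmd.add(pdfPath);
        return cmd;
    }

    // Runs the converter and returns its exit code (0 normally means success).
    public static int convert(String sheet, String htmlPath, String pdfPath)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(sheet, htmlPath, pdfPath))
                .inheritIO()               // forward the tool's output to the JVM console
                .start();
        return p.waitFor();
    }
}
```

Something like `PdfCommandBuilder.convert("D", "drawing.html", "drawing.pdf")` would then be called once step 1 has written the HTML and its images to disk. Keeping the command construction separate from the process launch means the page-size logic can be checked without the binary installed.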