Questions tagged [hwpf]

Apache POI - HWPF - Java API to Handle Microsoft Word Files

HWPF is the name of a port of the Microsoft Word 97(-2007) file format to pure Java, developed as part of the Apache POI project. It also provides limited read-only support for the older Word 6 and Word 95 file formats.

See the Apache POI homepage for details.

46 questions
8 votes, 2 answers

Apache POI HWPF - problem converting doc file to PDF

I am currently working on a Java project that uses Apache POI. In this project I want to convert a doc file to a PDF file. The conversion completes successfully, but the PDF contains only the text, without any text style or text colour. My PDF file looks like a black &…
user370305
7 votes, 1 answer

Inserting a base64-encoded image into an XSL-FO file while using Apache POI

I am using Apache POI to convert .doc to .fo using the WordToFoConverter class. I have converted the images in the Word file to base64, but how do I append them to the XSL-FO code generated by Apache POI? Consider the sample FO file generated by…
user3683297
6 votes, 2 answers

How to read a group of shapes as an image from a Word document (.doc or .docx) using Apache POI?

I have a simple requirement: to extract all the images and diagrams drawn in an MS Word file. I am able to extract images, but not groups of shapes (like a Use Case Diagram or Activity Diagram). I want to save all the diagrams as images. I have used…
Karsan
5 votes, 2 answers

How do I read a Word document with bold and italic formatting using POI

I am using Apache POI. I am able to read text from a doc file using "org.apache.poi.hwpf.extractor.WordExtractor", and I have even fetched the tables using "org.apache.poi.hwpf.usermodel.Table". But how can I fetch bold/italic…
Sudeep nayak
3 votes, 4 answers

How to use Apache HWPF to extract text and images out of a DOC file

I downloaded Apache HWPF. I want to use it to read a doc file and write its text into a plain text file. I don't know HWPF very well. My very simple program is here: I have 3 problems now: Some of the packages have errors (they can't find apache…
Hamed
3 votes, 2 answers

How can I use a custom Java library (from GitHub)

I want to use a custom library from GitHub (https://github.com/ddoleye/java-hwp). How can I import it and use it? I want to import and use the library in the file_read.java file.
user3704652
3 votes, 1 answer

Replacing HWPFDocument paragraph text using Java results in strange output

I need to replace the paragraph text of an HWPFDocument (.doc file) if it contains a particular text, using Java. The text is replaced, but the process writes the output text in a strange way. Please help me fix this issue. Code snippet…
Sherin
3 votes, 1 answer

Word to FO conversion using HWPF (Apache POI)

How do I convert a .doc file to FO using the hwpf.converter.WordToFo class? I have tried searching, but I could only find a Word to HTML conversion. I have also read the WordToFO manual on the Apache POI site, but could not follow it. Convert Word to HTML…
user3683297
3 votes, 1 answer

Formatting text using Apache POI 3.8 (HWPF)

I am trying to insert the following text in the document using Apache POI 3.8: [Bold][Normal], but the output document has this: [Bold][Normal] The code: import org.apache.poi.hwpf.HWPFDocument; import org.apache.poi.hwpf.usermodel.*; import…
Frolovskij
3 votes, 1 answer

Java: parsing MS Word document using POI/HWPF

I have an MS Word document (MS Office 2003; non-XML). Within this document there is a string associated with a bookmark. Furthermore, the Word document contains Word macros. My goal is to read the document with Java, replace the string associated…
user136200
2 votes, 0 answers

Convert DOC [HWPFDocument] to PDF [with font, table and images] using Java

To convert a doc file to PDF, I am using the following code: POIFSFileSystem fs = null; Document Pdfdocument = new Document(); fs = new POIFSFileSystem(new FileInputStream(srcFile)); HWPFDocument doc = new…
KishanCS
2 votes, 1 answer

Change font type of CharacterRun

I have a document (.doc) that I've generated using Apache POI with HWPF and I want to change the font type. I'm guessing that the place to change it would be on the character runs inside each paragraph. CharacterRun has methods such as .setBold()…
spencer.sm
2 votes, 3 answers

Edit Microsoft Office .doc file in Java using Apache POI

I'm writing Java code to achieve the following: 1. Read a given Microsoft Office document (.doc) file. 2. Search for a given string in the file. 3. Delete the given string wherever it is located. 4. Insert or replace any given string at specified…
nagesh
2 votes, 1 answer

Apache POI jar has no hwpf package

I have downloaded the POI API jar files from this link's first mirror link, which is the suggested one. After downloading, I saw that the org.apache.poi.hwpf package is not present there. My work depends completely on that API. So can anybody please…
Chandra Sekhar
1 vote, 1 answer

Apache POI - how to remove all the links from Word documents

I want to remove all the hyperlinks of a Word document and keep the text. I have these two methods to read word documents with doc and docx extensions. private void readDocXExtensionDocument(){ File inputFile = new File(inputFolderDir,…
zekifh