How can I get a byte[] value from a .doc (Word) file?
I've tried using an input stream and converting it to a byte[], but when I write the bytes back to a .doc file, the file is corrupt.
Is there a better way?
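File file = new File("filename"); // "filename" should be the complete path
byte[] b = new byte[(int) file.length()];
// A single read(b) may return before the array is full; readFully keeps reading until it is.
try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
    in.readFully(b);
}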
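One likely cause of the corruption described in the question is a partial read, or passing the bytes through a String, Reader, or Writer on the way back out. The following is a minimal sketch of a byte-for-byte round trip, assuming Java 7 or later. The class name DocBytesRoundTrip, the helpers readAll and writeAll, and the temporary files are hypothetical stand-ins for illustration, not part of the original answers.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class DocBytesRoundTrip {

    // Reads the whole file into memory as raw bytes (no charset decoding).
    static byte[] readAll(Path source) throws IOException {
        return Files.readAllBytes(source);
    }

    // Writes the bytes unchanged; creates the file or overwrites an existing one.
    static void writeAll(Path target, byte[] data) throws IOException {
        Files.write(target, data);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a real .doc file: a temp file with arbitrary binary content.
        Path source = Files.createTempFile("sample", ".doc");
        Files.write(source, new byte[] {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, 0, -1, 127});

        Path copy = Files.createTempFile("copy", ".doc");
        byte[] data = readAll(source);
        writeAll(copy, data);

        // The copy should match the original byte for byte.
        System.out.println(Arrays.equals(data, Files.readAllBytes(copy))); // prints true
    }
}
```

With a real document, replace the temporary paths with the actual source and target paths. The important design choice is that the data stays a byte[] from read to write and never goes through a text encoding.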
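Here is the code of ReadDocFile.java. It reads a .doc file and prints its content to the console; you can customize it as you need. Note that the HWPF classes used here handle the older .doc format only; for .docx, Apache POI's XWPF classes are needed instead. To run this program you need Apache's POI jar...
The program gives you the document text as an array of strings, one per paragraph...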
import java.io.File;
import java.io.FileInputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class ReadDocFile {
    public static void main(String[] args) {
        File file = new File("c:\\New.doc");
        // try-with-resources closes the stream even if parsing fails
        try (FileInputStream fis = new FileInputStream(file)) {
            HWPFDocument document = new HWPFDocument(fis);
            WordExtractor extractor = new WordExtractor(document);
            // One array entry per paragraph of the document
            String[] fileData = extractor.getParagraphText();
            for (String paragraph : fileData) {
                if (paragraph != null) {
                    System.out.println(paragraph);
                }
            }
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
    }
}