
I am trying to read a .docx file using the Apache POI Java API. I use:

public static String view(String nameDoc){
    String text = null;
    try{
        XWPFDocument docx = new XWPFDocument(
                new FileInputStream(nameDoc));
        XWPFWordExtractor we = new XWPFWordExtractor(docx);
        text = we.getText();
        we.close();
        docx.close();
    }catch (Exception e){
        e.printStackTrace();
    }
    return text;
}

This gives me only the text of the file, but my file contains text, tables, pictures... How can I get the full content of the file?

Deduplicator
Oleg1n

1 Answer

import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

String contents = "";

     try {
         System.out.println("Starting the test");
         // HWPF reads the legacy binary .doc format; for .docx use XWPFDocument and XWPFWordExtractor
         HWPFDocument doc = new HWPFDocument(new FileInputStream("D:/Resume.doc"));
         WordExtractor we = new WordExtractor(doc);
         // getText() returns the text of the whole document as a single string
         contents = we.getText();
         we.close();

     } catch (Exception e) {
        logger.error(e.getMessage());
     }

After this runs, contents holds the text of the file.
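
For the tables and pictures the question asks about, it may help to remember that a .docx file is a ZIP package: the body (paragraphs and tables) normally lives in word/document.xml, and Word conventionally stores embedded images under word/media/. The sketch below is not POI; it uses only java.util.zip to list what the package holds. The class name DocxParts and its method names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Apache POI.
public class DocxParts {

    /** Returns the names of all parts (files) stored in the .docx package. */
    public static List<String> listParts(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            // Entries are visited in the order they were written to the archive
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    /** Keeps only the parts under word/media/, where embedded pictures are conventionally stored. */
    public static List<String> listPictures(InputStream in) throws IOException {
        List<String> pictures = new ArrayList<>();
        for (String name : listParts(in)) {
            if (name.startsWith("word/media/")) {
                pictures.add(name);
            }
        }
        return pictures;
    }
}
```

Calling DocxParts.listPictures(new FileInputStream(nameDoc)) would then show which images the file contains. Within POI itself, XWPFDocument also exposes getBodyElements(), getTables() and getAllPictures() for typed access to the same content.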

Parth Solanki