I get the following error when I try to read the contents of a .doc file using Apache POI 3.17:

java.lang.IllegalArgumentException: The document is really a OOXML file
    at org.apache.poi.hwpf.HWPFDocumentCore.verifyAndBuildPOIFS(HWPFDocumentCore.java:123)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:169)
    at project12.Home12.button1ActionPerformed(Home12.java:312)
    at project12.Home12.access$300(Home12.java:24)
    at project12.Home12$3.actionPerformed(Home12.java:113)
    at java.awt.Button.processActionEvent(Button.java:409)
    at java.awt.Button.processEvent(Button.java:377)
    at java.awt.Component.dispatchEventImpl(Component.java:4889)
    at java.awt.Component.dispatchEvent(Component.java:4711)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
    at java.awt.EventQueue.access$500(EventQueue.java:97)
    at java.awt.EventQueue$3.run(EventQueue.java:709)
    at java.awt.EventQueue$3.run(EventQueue.java:703)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:90)
    at java.awt.EventQueue$4.run(EventQueue.java:731)
    at java.awt.EventQueue$4.run(EventQueue.java:729)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:201)
    at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)

I have included xmlbeans-2.6.0.jar and dom4j-1.6.1.jar but the problem persists.

sam alex
  • Please show the code that generated this exception, along with the filename that you're trying to open. Is it a .doc or a .docx file? – rgettman Apr 18 '18 at 16:39
  • It seems that you are trying to parse a .docx as a .doc. Is your document really a .doc? POI has different classes for parsing .doc and .docx. – davidbuzatto Apr 18 '18 at 16:39
  • @rgettman I'll post the code now. – sam alex Apr 18 '18 at 16:44
  • @davidbuzatto Can't we parse a .docx file using Apache POI? – sam alex Apr 18 '18 at 16:45
  • Yes you can, but you need to use the appropriate classes to do this. HWPF is used for .doc and XWPF is used for .docx. Take a look here: https://poi.apache.org/document/index.html. Your example is using the HWPF infrastructure, but you are trying to open a .docx file. – davidbuzatto Apr 18 '18 at 16:51
  • @davidbuzatto I can successfully read .doc files using the same program now. How can I handle either a .docx or a .doc file, whichever one is selected? – sam alex Apr 18 '18 at 16:57
  • This answer does a bit more than look at file extensions to decide doc or docx: https://stackoverflow.com/questions/47483011/how-to-judge-if-the-file-is-doc-or-docx-in-poi?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa –  Apr 18 '18 at 18:02

1 Answer

This happens when you try to read a file as a .doc (with HWPFDocument) when it is in fact in .docx (OOXML) format, whatever its extension says.

In such cases use the XWPFDocument class instead of HWPFDocument.
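
If you don't know in advance which format a file is in, one option is to look at its first bytes rather than its extension. Below is a rough sketch using only the JDK. `WordFormatSniffer` and `Kind` are names I made up for illustration and are not part of POI. It assumes a binary .doc starts with the OLE2 signature and a .docx starts with the ZIP signature, so treat the result as a hint rather than a guarantee.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper (not a POI class): guesses a Word file's container format from its first bytes. */
final class WordFormatSniffer {

    /** OLE2_DOC suggests HWPFDocument, OOXML_DOCX suggests XWPFDocument. */
    enum Kind { OLE2_DOC, OOXML_DOCX, UNKNOWN }

    // OLE2 compound-document signature, used by binary .doc files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    // ZIP local-file-header signature "PK\003\004". A .docx is a ZIP package,
    // but so are .xlsx, .pptx and ordinary archives, so this is only a hint.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies an already-read header; extra trailing bytes are ignored. */
    static Kind detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Kind.OLE2_DOC;
        if (startsWith(header, ZIP_MAGIC)) return Kind.OOXML_DOCX;
        return Kind.UNKNOWN;
    }

    /** Reads up to 8 bytes from the file and classifies them. */
    static Kind detect(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (total < header.length
                    && (n = in.read(header, total, header.length - total)) > 0) {
                total += n;
            }
        }
        return detect(Arrays.copyOf(header, total));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java WordFormatSniffer <file>");
            return;
        }
        System.out.println(detect(Paths.get(args[0])));
    }
}
```

With the result in hand you would open the file with HWPFDocument for `OLE2_DOC` and with XWPFDocument for `OOXML_DOCX`, and report an error for `UNKNOWN`.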

Cheers! :)