
I'm trying to use Axiom 1.2.22 with Woodstox 6.2.6 to parse an XML document with a doctype. (I'm using OpenJDK 11, but that shouldn't make any difference.) I'm getting the same error that was mentioned in "How to ignore DTD parsing in Apache's AXIOM":

Cannot create OMDocType because the XMLStreamReader doesn't support the DTDReader extension

According to https://issues.apache.org/jira/browse/AXIOM-475 this was supposed to be fixed in Axiom 1.2.16, but it seems the bug is back.

Example snippet:

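    // Load the test document (it declares a DOCTYPE) from the classpath
    InputStream is = Test.class.getResourceAsStream("xml-with-dtd.xml");
    XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(is);
    // Build the Axiom object model on top of the StAX reader
    OMXMLParserWrapper builder = OMXMLBuilderFactory.createStAXOMBuilder(reader);
    OMElement result = builder.getDocumentElement();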

Am I using incompatible versions? I also tried Woodstox 5.0.0, which throws the same error, and I verified that XMLInputFactory.newFactory() actually returns the Woodstox XMLInputFactory. These are the Maven dependencies that I use (I've omitted some exclusions related to logging and duplicated classes):

  <dependency>
    <groupId>com.fasterxml.woodstox</groupId>
    <artifactId>woodstox-core</artifactId>
    <version>6.2.6</version>
  </dependency>
  <dependency>
    <groupId>org.codehaus.woodstox</groupId>
    <artifactId>stax2-api</artifactId>
    <version>4.2.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.ws.commons.axiom</groupId>
    <artifactId>axiom-impl</artifactId>
    <version>1.2.22</version>
  </dependency>
  <dependency>
    <groupId>org.apache.ws.commons.axiom</groupId>
    <artifactId>axiom-api</artifactId>
    <version>1.2.22</version>
  </dependency>
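
For reference, the check of which StAX implementation is actually picked up can be done roughly like this. It is only a sketch using the standard javax.xml.stream API; the class name StaxProviderCheck and the helper describeProvider are made up for illustration, and the inline `<root/>` document just stands in for the real file.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StaxProviderCheck {
    // Hypothetical helper: returns the class names of the factory that
    // XMLInputFactory.newFactory() resolves to and of the reader it creates.
    static String[] describeProvider(InputStream in) {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                return new String[] {
                    factory.getClass().getName(),
                    reader.getClass().getName()
                };
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not create the stream reader", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<root/>".getBytes(StandardCharsets.UTF_8);
        String[] names = describeProvider(new ByteArrayInputStream(xml));
        System.out.println("factory: " + names[0]);
        System.out.println("reader:  " + names[1]);
    }
}
```

With Woodstox on the classpath, both names should start with com.ctc.wstx; anything else means another StAX implementation is being selected.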

Update: It looks like the Axiom code tries to determine a DTDReader class to use from a configuration property. Unfortunately, setting the property DTDReader.PROPERTY on the XMLInputFactory to any value results in the following stack trace:

Exception in thread "main" java.lang.IllegalArgumentException: Unrecognized property 'org.apache.axiom.ext.stax.DTDReader'
    at com.ctc.wstx.api.CommonConfig.reportUnknownProperty(CommonConfig.java:167)
    at com.ctc.wstx.api.CommonConfig.setProperty(CommonConfig.java:158)
    at com.ctc.wstx.api.ReaderConfig.setProperty(ReaderConfig.java:35)
    at com.ctc.wstx.stax.WstxInputFactory.setProperty(WstxInputFactory.java:400)
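
The exception above is what the factory raises for a property name it does not recognize. A minimal sketch of guarding against that with the standard XMLInputFactory.isPropertySupported call is shown here; the class SafePropertySetter and the helper trySetProperty are hypothetical names, and this only avoids the exception, it does not make the factory support the property.

```java
import javax.xml.stream.XMLInputFactory;

public class SafePropertySetter {
    // Hypothetical helper: sets the property only if the factory reports
    // support for it, and says whether the property was applied.
    static boolean trySetProperty(XMLInputFactory factory, String name, Object value) {
        if (!factory.isPropertySupported(name)) {
            return false;
        }
        factory.setProperty(name, value);
        return true;
    }

    public static void main(String[] args) {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Property name taken from the stack trace above.
        String name = "org.apache.axiom.ext.stax.DTDReader";
        boolean applied = trySetProperty(factory, name, Boolean.TRUE);
        System.out.println(name + " applied: " + applied);
    }
}
```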
hwbllmnn
  • One quick note: if I read it correctly, the issue reference is about SJSXP, the JDK-bundled StAX implementation, not Woodstox, so fixing it would not change things for Woodstox use. You may still want to verify that no other StAX implementations are on the classpath. It is unfortunate that the exception does not show the implementation/provider class to rule out that possibility. – StaxMan Jul 02 '21 at 22:18
  • That's true, but I verified manually that the XMLStreamReader in question is a com.ctc.wstx.sr.ValidatingStreamReader (IntelliJ tells me it's from the woodstox-core 6.2.6 jar, as expected). That's why I'm so baffled: according to the links above, I should not run into the bug even with the JDK-bundled StAX implementation, since I'm using Axiom 1.2.22, and even the suggested workaround of using a recent Woodstox version is not working. – hwbllmnn Jul 04 '21 at 14:48

1 Answer


I'm not sure why it didn't work when I tried it with woodstox 5, but this little patch against axiom 1.2.22 solves the problem at least for woodstox 6.2.6:

Index: axiom-api/src/main/java/org/apache/axiom/util/stax/dialect/StAXDialectDetector.java
===================================================================
--- axiom-api/src/main/java/org/apache/axiom/util/stax/dialect/StAXDialectDetector.java (revision 1891409)
+++ axiom-api/src/main/java/org/apache/axiom/util/stax/dialect/StAXDialectDetector.java (working copy)
@@ -274,6 +274,7 @@
                     return new Woodstox4Dialect(version.getComponent(1) == 0 && version.getComponent(2) < 11
                             || version.getComponent(1) == 1 && version.getComponent(2) < 3);
                 case 5:
+                case 6:
                     return new Woodstox4Dialect(false);
                 default:
                     return null;

Update:

Version 1.3.0 of axiom also fixes the problem.

hwbllmnn