How can I validate an XML file against a DTD that is stored locally as a file? The XML file does not have a DOCTYPE declaration (or it may have one, which should then be overridden). I looked at this thread, but apart from the fact that it uses .NET, I doubt that it is a good solution.

Any input appreciated!

Simon

3 Answers

In an ideal world, you'd be able to validate using a Validator. Something like this:

SchemaFactory schemaFactory = SchemaFactory
    .newInstance(XMLConstants.XML_DTD_NS_URI);
Schema schema = schemaFactory.newSchema(new File(
    "xmlValidate.dtd"));
Validator validator = schema.newValidator();
validator.validate(new StreamSource("xmlValidate.xml"));

Unfortunately, the Sun implementation (at least, as of Java 6) does not include support for creating a Schema instance from a DTD. You might be able to track down a third-party implementation that does.

Your best bet may be to use some other mechanism to alter the document so that it includes the DTD declaration before it is parsed.


You can use a transformer to insert a DTD declaration:

TransformerFactory tf = TransformerFactory
    .newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(
    OutputKeys.DOCTYPE_SYSTEM, "xmlValidate.dtd");
transformer.transform(new StreamSource(
    "xmlValidate.xml"), new StreamResult(System.out));

...but this does not seem to replace an existing DTD declaration.
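
For a document that has no DOCTYPE at all, the transformer output can be fed straight into a validating parser. The following is a minimal sketch of that idea rather than a definitive implementation. The class name `DoctypeInsertValidator` and its `validate` method are made up for illustration, the input is assumed to carry no DOCTYPE of its own, and the DTD is assumed to be reachable through an absolute system identifier such as a `file:` URI.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper class for illustration; not part of any library.
public class DoctypeInsertValidator {

  /** Returns the validation errors found; an empty list means the document is valid. */
  public static List<String> validate(String xml, String dtdSystemId) {
    try {
      // Step 1: identity transform that writes a DOCTYPE pointing at our DTD.
      Transformer transformer = TransformerFactory.newInstance().newTransformer();
      transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, dtdSystemId);
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      transformer.transform(
          new StreamSource(new StringReader(xml)), new StreamResult(buffer));

      // Step 2: re-parse the result with DTD validation switched on.
      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
      dbf.setValidating(true);
      DocumentBuilder builder = dbf.newDocumentBuilder();
      final List<String> errors = new ArrayList<>();
      builder.setErrorHandler(new ErrorHandler() {
        @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
        @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
        @Override public void fatalError(SAXParseException e) throws SAXParseException {
          throw e; // not well-formed: give up
        }
      });
      builder.parse(new ByteArrayInputStream(buffer.toByteArray()));
      return errors;
    } catch (Exception e) {
      throw new IllegalStateException("Could not validate document", e);
    }
  }
}
```

Calling `DoctypeInsertValidator.validate(xml, new File("xmlValidate.dtd").toURI().toString())` should return an empty list when the document conforms to the DTD, and one message per violation otherwise.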


This StAX event reader can replace an existing DOCTYPE declaration:

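  // Imports needed at the top of the enclosing file:
  // import javax.xml.stream.XMLEventReader;
  // import javax.xml.stream.XMLStreamException;
  // import javax.xml.stream.events.XMLEvent;
  // import javax.xml.stream.util.EventReaderDelegate;

  public static class DTDReplacer extends
      EventReaderDelegate {

    // The DOCTYPE event to inject into the stream
    private final XMLEvent dtd;
    // True when the injected DOCTYPE is the next event to return
    private boolean sendDtd = false;

    public DTDReplacer(XMLEventReader reader, XMLEvent dtd) {
      super(reader);
      if (dtd.getEventType() != XMLEvent.DTD) {
        throw new IllegalArgumentException(
            "Expected a DTD event but got: " + dtd);
      }
      this.dtd = dtd;
    }

    // Note: peek() is not overridden, so consume this reader
    // through hasNext()/nextEvent() only.
    @Override
    public XMLEvent nextEvent() throws XMLStreamException {
      if (sendDtd) {
        sendDtd = false;
        return dtd;
      }
      XMLEvent evt = super.nextEvent();
      if (evt.getEventType() == XMLEvent.START_DOCUMENT) {
        // emit the new DOCTYPE on the next call
        sendDtd = true;
      } else if (evt.getEventType() == XMLEvent.DTD) {
        // discard old DTD and return whatever follows it
        return super.nextEvent();
      }
      return evt;
    }

  }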

It will send a given DTD declaration right after the document start and discard any from the old document.

Demo usage:

// The DOCTYPE to inject; its name must match the document's root element.
XMLEventFactory eventFactory = XMLEventFactory.newInstance();
XMLEvent dtd = eventFactory
    .createDTD("<!DOCTYPE Employee SYSTEM \"xmlValidate.dtd\">");

XMLInputFactory inFactory = XMLInputFactory.newInstance();
XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
XMLEventReader reader = inFactory
    .createXMLEventReader(new StreamSource(
        "xmlValidate.xml"));
// Wrap the reader so the old DOCTYPE is dropped and the new one is injected.
reader = new DTDReplacer(reader, dtd);
XMLEventWriter writer = outFactory.createXMLEventWriter(System.out);
// Copy every event from the wrapped reader to the output.
writer.add(reader);
writer.flush();

// TODO error and proper stream handling

Note that the XMLEventReader could form the source for some other transformation mechanism that performed validation.
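
As a rough illustration of chaining the rewritten stream into validation, the sketch below buffers the rewritten document in memory and hands it to a validating SAX parser. It is one possible arrangement, not the only one. The class `DoctypeReplacingValidator` and its methods are hypothetical names, a plain loop stands in for `DTDReplacer` to keep the example self-contained, and the DTD system identifier is assumed to be absolute (for example a `file:` URI).

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration; not part of any library.
public class DoctypeReplacingValidator {

  /** Copies the document, dropping any DOCTYPE and writing a new one after the start. */
  public static String replaceDoctype(String xml, String rootName, String dtdSystemId) {
    try {
      XMLEvent dtd = XMLEventFactory.newInstance()
          .createDTD("<!DOCTYPE " + rootName + " SYSTEM \"" + dtdSystemId + "\">");
      XMLEventReader reader = XMLInputFactory.newInstance()
          .createXMLEventReader(new StringReader(xml));
      StringWriter out = new StringWriter();
      XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
      while (reader.hasNext()) {
        XMLEvent event = reader.nextEvent();
        if (event.getEventType() == XMLEvent.DTD) {
          continue; // discard the sender's DOCTYPE
        }
        writer.add(event);
        if (event.isStartDocument()) {
          writer.add(dtd); // inject our DOCTYPE right after the document start
        }
      }
      writer.close();
      reader.close();
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException("Could not rewrite document", e);
    }
  }

  /** Parses with DTD validation on; an empty list means the document is valid. */
  public static List<String> validate(String xmlWithDoctype) {
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setValidating(true);
      SAXParser parser = factory.newSAXParser();
      final List<String> errors = new ArrayList<>();
      parser.parse(new InputSource(new StringReader(xmlWithDoctype)), new DefaultHandler() {
        @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
      });
      return errors;
    } catch (Exception e) {
      throw new IllegalStateException("Could not validate document", e);
    }
  }
}
```

Buffering the whole document is a simplification. For large inputs a streaming hand-off would be preferable, but that depends on what the chosen parser accepts as a source.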


If you have the option, it would be much easier to validate against a W3C XML Schema (XSD) instead.
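
For comparison, here is a minimal sketch of schema-based validation with the standard `javax.xml.validation` API. The class name `XsdValidation` and the method `isValid` are hypothetical, and the schema is passed as a string only to keep the example self-contained. For a schema stored on disk, `SchemaFactory.newSchema(File)` can be used in the same way as in the first snippet.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration; not part of any library.
public class XsdValidation {

  /** Returns true if the XML is well-formed and valid against the given XSD. */
  public static boolean isValid(String xml, String xsd) {
    Schema schema;
    try {
      // Compile the schema once; a Schema object can be reused across documents.
      schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
          .newSchema(new StreamSource(new StringReader(xsd)));
    } catch (SAXException e) {
      throw new IllegalArgumentException("Schema could not be compiled", e);
    }
    try {
      Validator validator = schema.newValidator();
      // Throws SAXException on the first validation or well-formedness error.
      validator.validate(new StreamSource(new StringReader(xml)));
      return true;
    } catch (SAXException e) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the schema is supplied by the caller rather than named in the document, this validates against your own rules regardless of what the sender declares.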

McDowell
  • Thank you very much for your extensive answer, that really helps me a lot. I will have a look into converting the DTD to a W3 schema, as I can use the Validator of Sun then. – Simon Jul 08 '09 at 23:20
  • My XML file does not have any DOCTYPE declaration, and I'm parsing the file using SAXParser on Android. I generated the DTD myself. How can I validate an XML file against my DTD when parsing with SAX? – Khushbu Shah Nov 12 '11 at 11:52
  • @Khushbu - it would be better if you asked a new question. – McDowell Nov 12 '11 at 17:11
  • I have already asked a [question](http://stackoverflow.com/questions/8091513/how-to-apply-validation-of-local-dtd-file-to-xml-file-in-java) but I have not received a proper answer. – Khushbu Shah Nov 15 '11 at 03:54
  • A WstxParsingException / FileNotFoundException is thrown if the discarded DTD's system ID refers to a file that does not exist. Add `inFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false)` after instantiating inFactory. – haba713 Nov 19 '16 at 21:55
  • Thanks. I needed to add DTD to 2000 XMLs and this helped. – ulab May 19 '17 at 15:45

I'm pretty sure the approaches mentioned above will work.

Thanks for your help, but what if no DOCTYPE has been specified at all? The EntityResolver would not help me in that case, would it? – Simon Jul 8 '09 at 6:34

@Bluegene: What are you validating against if no DOCTYPE? – J-16 SDiZ Jul 8 '09 at 7:12

Against my own DTD. I just want to make sure the XML I receive conforms to my DTD, not just any DTD the sender specifies. – Simon Jul 8 '09 at 23:09

If the problem is that you want the XML to be validated against your DTD rather than the author's, you should make sure there is clear documentation that specifies the DOCTYPE and what the XML file must contain.

zachary

You have to implement an EntityResolver; check out this example.
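
A minimal sketch of what such a resolver might look like follows. The class name `LocalDtdResolver` and its `validate` helper are hypothetical, and the resolver redirects every external entity to the local DTD, which is an assumption that only suits documents whose sole external entity is the DTD itself. It relies on the document already carrying a DOCTYPE, since the resolver is only consulted when there is a system identifier to resolve.

```java
import java.io.File;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class for illustration; not part of any library.
public class LocalDtdResolver implements EntityResolver {

  private final File localDtd;

  public LocalDtdResolver(File localDtd) {
    this.localDtd = localDtd;
  }

  @Override
  public InputSource resolveEntity(String publicId, String systemId) {
    // Ignore whatever the document asked for and serve the local DTD instead.
    return new InputSource(localDtd.toURI().toString());
  }

  /** Returns the validation errors found; an empty list means the document is valid. */
  public static List<String> validate(String xml, File localDtd) {
    try {
      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
      dbf.setValidating(true);
      DocumentBuilder builder = dbf.newDocumentBuilder();
      builder.setEntityResolver(new LocalDtdResolver(localDtd));
      final List<String> errors = new ArrayList<>();
      builder.setErrorHandler(new ErrorHandler() {
        @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
        @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
        @Override public void fatalError(SAXParseException e) throws SAXParseException {
          throw e; // not well-formed: give up
        }
      });
      builder.parse(new InputSource(new StringReader(xml)));
      return errors;
    } catch (Exception e) {
      throw new IllegalStateException("Could not validate document", e);
    }
  }
}
```

With this in place, the system identifier in the sender's DOCTYPE is never fetched, so the document is checked against the local rules only.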

J-16 SDiZ
  • Thanks for your help, but what if no DOCTYPE has been specified at all? The EntityResolver would not help me in that case, would it? – Simon Jul 08 '09 at 06:34
  • @Bluegene: What are you validating against if no DOCTYPE? – J-16 SDiZ Jul 08 '09 at 07:12
  • Against my own DTD. I just want to make sure the XML I receive conforms to _my_ DTD, not just any DTD the sender specifies. – Simon Jul 08 '09 at 23:09