In my XML file I have some entities such as &rsquo;.

So I have created a DTD for my XML document to define these entities. Below is the Java code used to read the XML file.

SAXBuilder builder = new SAXBuilder();
URL url = new URL("http://127.0.0.1:8080/sample/subject.xml");        
InputStream stream = url.openStream();
org.jdom.Document document = builder.build(stream);

Element root = document.getRootElement();

Element name = root.getChild("name");
String result = name.getText();
System.err.println(result);

How can I change the Java code to retrieve a DTD over HTTP to allow the parsing of my XML document to be error free?

Simplified example of the XML document:

<main>
  <name>hello &lsquo; world &rsquo; foo  &amp; bar </name> 
</main>
anonymous
    Entities must be declared before they can be used. If you are using entity references that have not been declared (either within the file or with a reference to an external DTD), you have an invalid XML file. – Mads Hansen Feb 10 '11 at 03:47
    Indeed. My problem is injecting the path to my DTD into the XML at runtime, as the DTD is not referenced inside the XML document. The DTD resides on a remote server which is accessible via HTTP. – anonymous Feb 11 '11 at 14:04

1 Answer

One way to do this would be to read the document and then use a Transformer to write it back out with a DOCTYPE that references the external DTD:

import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class ValidateWithExternalDTD {
    private static final String URL = "http://127.0.0.1:8080/sample/subject.xml";
    private static final String DTD = "http://127.0.0.1/YourDTD.dtd";

    public static void main(String[] args) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();

            // Set the error handler
            builder.setErrorHandler(new org.xml.sax.ErrorHandler() {                
                public void fatalError(SAXParseException spex)
                        throws SAXException {
                    // output error and exit with a non-zero status to signal failure
                    spex.printStackTrace();
                    System.exit(1);
                }

                public void error(SAXParseException spex)
                        throws SAXParseException {
                    // output error and continue
                    spex.printStackTrace();
                }

                public void warning(SAXParseException spex)
                        throws SAXParseException {
                    // output warning and continue
                    spex.printStackTrace();
                }
            });

            // Read the document
            URL url = new URL(ValidateWithExternalDTD.URL);
            Document xmlDocument = builder.parse(url.openStream());
            DOMSource source = new DOMSource(xmlDocument);

            // Use the transformer to write the document out with a DOCTYPE that references the DTD
            StreamResult result = new StreamResult(System.out);                     
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer();
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, ValidateWithExternalDTD.DTD);
            transformer.transform(source, result);

            // Process your document if everything is OK
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
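
To make clearer what the Transformer step contributes, here is a minimal sketch that serializes to a string instead of System.out. It assumes the JDK's built-in JAXP implementation; the class name DoctypeOnOutput and the method serializeWithDoctype are made up for illustration. Note that the Transformer does not validate anything by itself: it only emits a DOCTYPE line in the output. The sample input deliberately contains no custom entity references, because a reference to an undeclared entity makes the initial parse fail before the Transformer is reached.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DoctypeOnOutput {

    // Hypothetical helper: parses the XML text and writes it back out
    // with a DOCTYPE whose system identifier is dtdUrl.
    static String serializeWithDoctype(String xml, String dtdUrl) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Identity transformer: copies the DOM to the output unchanged,
        // apart from the output properties set below.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, dtdUrl);

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The DTD URL is the placeholder used in the answer above.
        String xml = "<main><name>hello world</name></main>";
        System.out.println(serializeWithDoctype(xml, "http://127.0.0.1/YourDTD.dtd"));
    }
}
```

The returned string can then be parsed a second time with validation enabled, at which point the parser fetches the DTD named in the DOCTYPE.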

Another way would be to replace the XML declaration with the XML declaration followed by the DTD reference.

Replace this:

<?xml version = "1.0"?>

with this:

<?xml version = "1.0"?><!DOCTYPE ...>

Of course, you would replace only the first occurrence rather than searching through the whole XML document.

You then have to enable validation on the SAXBuilder, either by passing true (validate) to its constructor:

SAXBuilder builder = new SAXBuilder(true);

or call:

builder.setValidation(true);
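
The declaration-replacement approach can be sketched roughly as follows. This is only an illustration under several assumptions: it uses the JDK's built-in DOM parser rather than JDOM so that it runs without extra libraries, the class DoctypeInjection and its methods are hypothetical names, and the DTD text is an in-memory stand-in served through an EntityResolver so that no HTTP server is needed. In the real setup the resolver would be dropped and the parser would fetch the DTD from the URL in the DOCTYPE.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DoctypeInjection {

    // Placeholder system ID taken from the answer above.
    static final String DTD_URL = "http://127.0.0.1/YourDTD.dtd";

    // Assumed DTD content, only here so the sketch is self-contained.
    static final String DTD_TEXT =
            "<!ELEMENT main (name)>\n"
          + "<!ELEMENT name (#PCDATA)>\n"
          + "<!ENTITY lsquo \"&#8216;\">\n"
          + "<!ENTITY rsquo \"&#8217;\">\n";

    // Inserts a DOCTYPE right after the XML declaration,
    // or at the very start if the document has no declaration.
    static String injectDoctype(String xml, String rootName, String systemId) {
        String doctype = "<!DOCTYPE " + rootName + " SYSTEM \"" + systemId + "\">";
        if (xml.startsWith("<?xml")) {
            int end = xml.indexOf("?>") + 2; // first occurrence only
            return xml.substring(0, end) + doctype + xml.substring(end);
        }
        return doctype + xml;
    }

    // Parses the patched XML with validation on and returns the text of <name>.
    static String readName(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Serve the stand-in DTD for our system ID; returning null
        // falls back to the parser's default resolution.
        builder.setEntityResolver((publicId, systemId) ->
                DTD_URL.equals(systemId)
                        ? new InputSource(new StringReader(DTD_TEXT))
                        : null);

        String patched = injectDoctype(xml, "main", DTD_URL);
        Document doc = builder.parse(new InputSource(new StringReader(patched)));
        return doc.getElementsByTagName("name").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<main><name>hello &lsquo; world &rsquo; foo &amp; bar</name></main>";
        System.out.println(readName(xml));
    }
}
```

With JDOM, the patched string would instead be handed to the SAXBuilder through a StringReader, with validation switched on as described above.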
Timmo
    How big is the XML file? Can I have a sample XML? Does the XML document always contain an XML declaration? – Timmo Feb 04 '11 at 13:26
  • I am using JDOM as opposed to the W3C DOM :( – anonymous Feb 11 '11 at 14:01
  • The XML file will not be that big. The maximum might be about 200 lines, with each line being less than 80 characters. – anonymous Feb 11 '11 at 14:28
  • Since the file is not big, use the second solution I am providing. Replace the XML declaration with the XML declaration followed by the DTD declaration. – Timmo Feb 18 '11 at 11:21
  • Retrieve it as a string, then replace the XML declaration with the XML declaration plus the DTD declaration; after that, parse it as an XML document using JDOM with Validation=true. – Timmo Feb 18 '11 at 11:28