I'm using the Apache Web Services XML-RPC library to make requests to an RPC service. Somewhere in that process is an XML document with a DTD reference to http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd, which the library attempts to download when parsing the XML. That download fails with a 503 status code because the W3C is blocking repeated downloads of this largely static document from Java clients.

The solution is to use XML Catalogs to cache the DTD locally. However, while I can find examples of setting an EntityResolver on a JAXP SAXParser instance directly to enable catalog support, I don't actually have access to the underlying parser here; it is only used internally by the XML-RPC library. Is there any way I can set a global property or something similar that will tell JAXP to use XML catalogs?

Brian Ferris

2 Answers

I think you want the system property xml.catalog.files.

Take a look at http://xml.apache.org/commons/components/resolver/resolver-article.html

BTW, this was the third hit on a Google search for jaxp catalog

Jim Garrison
  • I'd seen that article and I've already attempted to integrate xml-resolver into my project. The problem is that the xml.catalog.files system property only takes effect once you've installed the XML Resolver as the entity resolver on your JAXP reader instance. My problem is that I don't have access to the JAXP reader instance used internally by the web service library. – Brian Ferris Jun 19 '10 at 19:19
  • If you set that property on the command line when launching the program, does it not get "seen" by the parser factory? – Jim Garrison Jun 21 '10 at 19:31
Unfortunately, setting xml.catalog.files does NOT have any effect on the parser factory. Ideally it should, of course, but the only way to use a resolver is to somehow add a method that delegates resolution to the catalog resolver in the handler that the SAX parser uses.

If you are already using a SAX parser, that's pretty easy:

final CatalogResolver catalogResolver = new CatalogResolver();
DefaultHandler handler = new DefaultHandler() {
    // Delegate entity lookups (including DTD references) to the catalog resolver.
    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        return catalogResolver.resolveEntity(publicId, systemId);
    }

    @Override
    public void startElement(String namespaceURI, String lname, String qname,
            Attributes attrs) {
        // the stuff you'd normally do
    }
    ...
};

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
SAXParser saxParser = factory.newSAXParser();
String url = args.length == 0 ? "http://horstmann.com/index.html" : args[0];
saxParser.parse(new URL(url).openStream(), handler);

Otherwise, you'll need to figure out whether you can supply your own entity resolver. With a javax.xml.parsers.DocumentBuilder, you can. With the scala.xml.XML object, you can't, but you can use subterfuge:

import java.net.URL
import scala.xml._

val res = new com.sun.org.apache.xml.internal.resolver.tools.CatalogResolver

// Override the adapter so entity resolution goes through the catalog resolver.
val loader = new factory.XMLLoader[Elem] {
  override def adapter = new parsing.NoBindingFactoryAdapter() {
    override def resolveEntity(publicId: String, systemId: String) = {
      res.resolveEntity(publicId, systemId)
    }
  }
}

val doc = loader.load(new URL("http://horstmann.com/index.html"))

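
For the javax.xml.parsers.DocumentBuilder case mentioned above, here is a minimal sketch of supplying your own entity resolver through DocumentBuilder.setEntityResolver. To keep it self-contained, it does not use a catalog at all: the class name LocalDtdResolverDemo and its helper methods are made up for illustration, and the resolver simply answers the XHTML Strict DTD request with an empty stream so that no network access happens. That is an assumption made for the example; a real setup would return the locally cached DTD instead, because entities the DTD declares (such as &nbsp;) are otherwise undefined.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class LocalDtdResolverDemo {
    // System identifier of the DTD the parser would otherwise download.
    static final String XHTML_STRICT =
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd";

    // Hypothetical stand-in for a catalog lookup: serve an empty DTD locally.
    static EntityResolver localResolver() {
        return (publicId, systemId) -> {
            if (XHTML_STRICT.equals(systemId)) {
                return new InputSource(new StringReader(""));
            }
            return null; // fall back to the parser's default resolution
        };
    }

    // Parse an XML string with the local resolver installed on the builder.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(localResolver());
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \""
                + XHTML_STRICT + "\">"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>t</title></head><body/></html>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

Since CatalogResolver implements org.xml.sax.EntityResolver, passing a CatalogResolver instance to setEntityResolver in place of localResolver() should give catalog-based resolution, provided you control the DocumentBuilder.
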
cayhorstmann