Assume the following:
I have a set of XSD schemas S, each with distinct namespace URIs.
I know that I'm going to be receiving an XML document containing a root element that contains exactly one namespace declaration that refers to a member of S. I can abort parsing immediately with an error if I don't receive exactly one namespace declaration, or if the received namespace doesn't refer to any schema in S.
I want to parse the incoming XML document with a SAX parser, and I want to validate the incoming document during parsing against one of the schemas in S. I know from the above that the first call I'm going to see in the ContentHandler that I give to the parser will be a call to startPrefixMapping when the parser encounters the namespace declaration.
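To make the abort conditions above concrete, here is a minimal sketch of a handler that inspects the namespace declarations on the root element and stops the parse by throwing a SAXException. The class name NamespaceGateSketch, the Gate handler, and the two urn:example namespaces are hypothetical stand-ins for S. It only detects the namespace; it does not attach a schema to a parse that is already running.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceGateSketch {
    // Hypothetical namespace URIs standing in for the members of S (Set.of needs Java 9+).
    static final Set<String> KNOWN_NAMESPACES = Set.of("urn:example:a", "urn:example:b");

    static class Gate extends DefaultHandler {
        String chosen;     // namespace declared on the root, once seen
        boolean rootSeen;  // true after startElement for the root

        @Override
        public void startPrefixMapping(String prefix, String uri) throws SAXException {
            if (rootSeen) {
                return; // declarations below the root are not checked in this sketch
            }
            if (chosen != null) {
                throw new SAXException("more than one namespace declaration on the root");
            }
            if (!KNOWN_NAMESPACES.contains(uri)) {
                throw new SAXException("unknown namespace: " + uri);
            }
            chosen = uri;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if (!rootSeen) {
                rootSeen = true;
                if (chosen == null) {
                    throw new SAXException("no namespace declaration on the root");
                }
            }
        }
    }

    // Returns the namespace declared on the root, or throws SAXException if the document is rejected.
    static String detectNamespace(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        Gate gate = new Gate();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), gate);
        return gate.chosen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(detectNamespace("<a xmlns='urn:example:a'/>"));
    }
}
```

Throwing from the handler is what aborts the parse: the exception propagates out of the parse call, so nothing after the root's start tag is processed for a rejected document.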
Is it possible to, in the startPrefixMapping call, pick one of the schemas in S for validation once I know which one I need?
It seems that I could maybe call setSchema on the parser inside the startPrefixMapping call, but I get the feeling from the API documentation that I'm not supposed to do this (and that it may be too late to call the method at that point anyway).
Is there some other way to supply a set of schemas to the parser and perhaps have it pick the right one itself based on the namespace declaration it receives?
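One possible way to hand the parser a whole set of schemas is to compile them into a single Schema object with SchemaFactory.newSchema(Source[]) and set that on the SAXParserFactory before the parser is created. The validator then looks up the root element's declaration by its namespace. The sketch below assumes two tiny hypothetical schemas (urn:example:a and urn:example:b) held in strings as stand-ins for S; the class and method names are illustrative only, and this is not necessarily the approach the linked example code takes.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class CombinedSchemaSketch {
    // Two hypothetical members of S with distinct target namespaces.
    static final String SCHEMA_A =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:a' elementFormDefault='qualified'>"
        + "<xs:element name='a' type='xs:int'/></xs:schema>";
    static final String SCHEMA_B =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:b' elementFormDefault='qualified'>"
        + "<xs:element name='b' type='xs:string'/></xs:schema>";

    // Builds a parser whose factory was given the combined schema up front.
    static SAXParser newValidatingParser() throws Exception {
        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema combined = schemaFactory.newSchema(new Source[] {
            new StreamSource(new StringReader(SCHEMA_A)),
            new StreamSource(new StringReader(SCHEMA_B))
        });
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        parserFactory.setSchema(combined); // must happen before newSAXParser()
        return parserFactory.newSAXParser();
    }

    // Returns true if the document parses and validates, false on any SAX error.
    static boolean isValid(String xml) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // validation errors are only reported, not thrown, by default
            }
        };
        try {
            newValidatingParser().parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<a xmlns='urn:example:a'>42</a>"));
        System.out.println(isValid("<c xmlns='urn:example:c'/>"));
    }
}
```

A root element in a namespace that none of the compiled schemas declares has no matching element declaration, so the validator reports an error for it. Overriding error in the handler matters here, because DefaultHandler silently ignores recoverable validation errors.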
Edit: I was wrong, it's not just inadvisable to call setSchema on a parser once parsing has started - it's actually impossible. Parsers don't expose a setSchema call, only parser factories do. This means that my options are limited to those that can allow the parser to select a schema for itself. Unfortunately, that has its own problems: It's not possible for an XML document to merely specify a namespace, it also has to specify a filename for the intended schema (which in my opinion is an implementation detail on the parser side and should not be required of the incoming data) and the parser has to intercept the request for this filename to supply a member of S for validation.
Edit: I've solved this. I've put together some heavily-commented public domain example code here that looks up schemas based on pre-specified systemIds, and the schemas are delivered programmatically (so they can be served from databases, class resources, etc). It correctly rejects any document that specifies an unknown schema, specifies no schema, or specifies its own schemaLocation to try to fool the validator.