Yes, I know that general forms of this question have been asked time and time again. However, I couldn't find anything that helped me solve my problem, so I am posting this question, which is specific to my situation.
I am trying to figure out why I am getting a SAXParseException (Content is not allowed in prolog.) as the OpenSAML library is trying to parse some XML. The most useful hints I found pointed toward an errant BOM at the beginning of the file, but there's nothing like that. I also wrote a quick-and-dirty C#.NET routine to read the whole file as an array of bytes, iterate over it, and tell me if any of them were >=0x80 (it found none). The XML is marked as UTF-8. I am hoping that someone can provide me with a bit of insight as to what might be going wrong.
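For reference, the byte check described above looks roughly like the following when expressed in Java. This is only a sketch of the same idea, not the actual C# routine (which is not shown here); the class and method names are made up for illustration, and the sample bytes in main are simply the first bytes of the hex dump.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: mirrors the "any byte >= 0x80?" check described in the text.
class ByteCheck {

    /** Returns the index of the first byte >= 0x80, or -1 if every byte is 7-bit ASCII. */
    static int firstNonAscii(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            // Java bytes are signed, so mask to compare as an unsigned value.
            if ((data[i] & 0xFF) >= 0x80) {
                return i;
            }
        }
        return -1;
    }

    /** Returns true if the data begins with the UTF-8 byte order mark EF BB BF. */
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        // Sample input only: the XML declaration from the start of the hex dump.
        byte[] sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("first non-ASCII index: " + firstNonAscii(sample));
        System.out.println("starts with UTF-8 BOM: " + startsWithUtf8Bom(sample));
    }
}
```

A UTF-8 BOM would show up in both checks, since all three of its bytes are above 0x7F.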
The initial portion of the XML file, as a hex dump, is (note the use of 0A as a newline; removing the line feed character entirely has no apparent effect):
000000000 3C 3F 78 6D 6C 20 76 65-72 73 69 6F 6E 3D 22 31 |<?xml version="1|
000000010 2E 30 22 20 65 6E 63 6F-64 69 6E 67 3D 22 55 54 |.0" encoding="UT|
000000020 46 2D 38 22 3F 3E 0A 3C-6D 64 3A 45 6E 74 69 74 |F-8"?>.<md:Entit|
000000030 79 44 65 73 63 72 69 70-74 6F 72 20 78 6D 6C 6E |yDescriptor xmln|
000000040 73 3A 6D 64 3D 22 75 72-6E 3A 6F 61 73 69 73 3A |s:md="urn:oasis:|
000000050 6E 61 6D 65 73 3A 74 63-3A 53 41 4D 4C 3A 32 2E |names:tc:SAML:2.|
000000060 30 3A 6D 65 74 61 64 61-74 61 22 20 |0:metadata" |
The stack trace for the root cause exception is:
org.xml.sax.SAXParseException: Content is not allowed in prolog.
org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
org.opensaml.xml.parse.BasicParserPool$DocumentBuilderProxy.parse(BasicParserPool.java:665)
my.Unmarshaller.unmarshall(Unmarshaller.java:39)
... internal calls omitted for brevity ...
javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
The code that tries to do the unmarshalling is (type names fully qualified here; hopefully I am not leaving out something important):
package my;

public class Unmarshaller {
    protected static org.opensaml.xml.parse.ParserPool parserPool;

    static {
        org.opensaml.xml.parse.BasicParserPool _parserPool;
        _parserPool = new org.opensaml.xml.parse.BasicParserPool();
        _parserPool.setNamespaceAware(true);
        Unmarshaller.parserPool = _parserPool;
    }

    public Unmarshaller() {
        try {
            org.opensaml.DefaultBootstrap.bootstrap();
        } catch (org.opensaml.xml.ConfigurationException e) {
            throw new java.lang.RuntimeException(e);
        }
    }

    public Object unmarshall(String xml)
            throws org.opensaml.xml.io.UnmarshallingException {
        assert xml != null;
        assert !xml.isEmpty();
        assert Unmarshaller.parserPool != null;

        org.w3c.dom.Document doc;
        try {
            doc =
                (parserPool.getBuilder())
                    .parse( // <<<====== line 39 in original source code is here
                        new org.xml.sax.InputSource(
                            new java.io.StringReader(xml)
                        )
                    );
        } catch (org.xml.sax.SAXException e) {
            throw new org.opensaml.xml.io.UnmarshallingException(e);
        } catch (java.io.IOException e) {
            throw new org.opensaml.xml.io.UnmarshallingException(e);
        } catch (org.opensaml.xml.parse.XMLParserException e) {
            throw new org.opensaml.xml.io.UnmarshallingException(e);
        }
        // ... remainder of function omitted for brevity ...
    }
}
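Since unmarshall(String xml) hands the parser a String rather than the file's bytes, one check that might narrow this down is to look at the first few characters of that String right before the parse call. The following is a minimal sketch of such a diagnostic; PrologCheck and its methods are hypothetical names, not part of OpenSAML or of the code above.

```java
// Hypothetical diagnostic: inspects the String that would be passed to the parser.
class PrologCheck {

    /** Formats the first 'count' UTF-16 code units of xml as "U+XXXX" values. */
    static String describeLeadingChars(String xml, int count) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(count, xml.length());
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) xml.charAt(i)));
        }
        return sb.toString();
    }

    /** Returns true if the very first character of the String is '<'. */
    static boolean startsWithMarkup(String xml) {
        return !xml.isEmpty() && xml.charAt(0) == '<';
    }

    public static void main(String[] args) {
        // Sample input only; in practice this would be the xml argument of unmarshall.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<md:EntityDescriptor/>";
        System.out.println(describeLeadingChars(xml, 8));
        System.out.println("starts with '<': " + startsWithMarkup(xml));
    }
}
```

If the first character is anything other than '<' (for example U+FEFF, or stray text left over from whatever produced the String), the file on disk could be clean while the String is not.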