Is there a way to get XML entities expanded when processing a document using the XMLParser in Swift? I don't see anything in the API but I find it hard to believe this is an exotic requirement.
This can be done if the libxml2 API is used directly from Swift: the library delivers the expanded entity through the characters callback, as if the text had appeared inline.
Here is my example code:
import Foundation

let xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
    "<!DOCTYPE simple [" +
    "<!ENTITY e \"entity\">" +
    "]>" +
    "<root><val>&e;</val></root>"

class ParserDelegate: NSObject, XMLParserDelegate {
    // Internal entity declarations seen in the DTD, keyed by entity name.
    var entities = [String: String?]()
    // Character data accumulated across foundCharacters callbacks.
    var text: String?

    func parser(_ parser: XMLParser, foundInternalEntityDeclarationWithName name: String, value: String?) {
        entities[name] = value
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        text = (text ?? "") + string
    }
}

let delegate = ParserDelegate()
let parser = XMLParser(data: Data(xml.utf8))
parser.delegate = delegate
parser.parse()

// Print any character data that was delivered, then every recorded entity.
if let text = delegate.text {
    print("found text:" + text)
}
for (name, value) in delegate.entities {
    print(name + ": " + (value ?? "<nil>"))
}
When this is run, "found text:" is never printed, because parser(_:foundCharacters:) is never called, as can be verified by setting breakpoints.
The output is simply:
e: entity
I've looked through the other delegate methods but I don't see anything relevant. The parser(_:resolveExternalEntityName:systemID:) method looks close, but these are internal entities, and implementing it yields no interesting results.
Edit
It looks like this really is a limitation of the parser, and one that goes back many years.
I can find questions going back to 2009 here on Stack Overflow:
- NSXMLParser not resolving internal entity
- How to resolve an internally-declared XML entity reference using NSXMLParser
- Resolving html entities with NSXMLParser on iPhone
I found this item from May 2017 on Open Radar:
There is some commentary on entities for the libxml2 library that points out some of the challenges entities entail, which may help explain why (NS)XMLParser behaves as it does.
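Given that limitation, one stopgap is to substitute the entity references textually before the document reaches the parser. The following is only a minimal sketch under stated assumptions: `expandInternalEntities(in:using:)` is a hypothetical helper of my own, not part of Foundation, and it assumes the entity names and values are already known (for example, collected in a first pass via the `foundInternalEntityDeclarationWithName` callback shown above). It does a naive string replacement, so it does not handle entity values that themselves contain references, and it would also rewrite matches inside CDATA sections or comments.

```swift
import Foundation

/// Hypothetical helper (not a Foundation API): replaces every `&name;`
/// reference in `xml` with the value declared for `name`.
/// Assumption: entity values are plain text with no nested references.
func expandInternalEntities(in xml: String, using entities: [String: String]) -> String {
    var result = xml
    for (name, value) in entities {
        // Naive textual substitution of the reference with its value.
        result = result.replacingOccurrences(of: "&\(name);", with: value)
    }
    return result
}

let source = "<root><val>&e;</val></root>"
let expanded = expandInternalEntities(in: source, using: ["e": "entity"])
print(expanded)  // prints "<root><val>entity</val></root>"
```

The expanded string can then be handed to a second XMLParser instance, at which point the text arrives through the ordinary foundCharacters callback because no entity references remain.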