Is there a way to get XML entities expanded when processing a document using the XMLParser in Swift? I don't see anything in the API but I find it hard to believe this is an exotic requirement.

This can be done by calling the libxml2 API directly from Swift: the library then delivers the expanded entity text through the characters callback, as if it had appeared inline.

Here is my example code:

import Foundation

let xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
    "<!DOCTYPE simple [" +
    "<!ENTITY e \"entity\">" +
    "]>" +
    "<root><val>&e;</val></root>"

class ParserDelegate: NSObject, XMLParserDelegate {
    var entities = [String: String?]()
    var text: String?

    // Called once for each internal entity declared in the DTD.
    func parser(_ parser: XMLParser, foundInternalEntityDeclarationWithName name: String, value: String?) {
        entities[name] = value
    }

    // Accumulates character data; the parser may deliver text in several chunks.
    func parser(_ parser: XMLParser, foundCharacters string: String) {
        text = (text ?? "") + string
    }
}

let delegate = ParserDelegate()

let parser = XMLParser(data: Data(xml.utf8))
parser.delegate = delegate


parser.parse()

if let text = delegate.text {
    print("found text: " + text)
}

for (key, value) in delegate.entities {
    print(key + ": " + (value ?? "<nil>"))
}

When this is run, "found text" is never printed, because func parser(_ parser: XMLParser, foundCharacters string: String) is never called, as can be verified by setting breakpoints.

The output is simply:

e: entity

I've looked through the various delegate methods but don't see anything relevant. The parser(_:resolveExternalEntityName:systemID:) method looks close, but it deals with external entities, whereas these are internal ones, and implementing it produced nothing useful.

Edit

It looks like this really is a limitation of the parser that goes back many years.

I can find questions going back to 2009 here on Stack Overflow:

I found this item from May 2017 on Open Radar:

There is some commentary on entities for the libxml2 library that points out some of the challenges entities entail, which may help explain why (NS)XMLParser behaves as it does.
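
One possible stopgap, sketched here under assumptions and not based on any XMLParser feature, is to substitute the internal entity references textually before the document is parsed. The helper name expandEntities(in:using:) is hypothetical. It only replaces names found in the supplied table, so predefined references such as &amp; are left for the parser. It is deliberately naive: it does not skip CDATA sections or comments, and it does not expand entities nested inside other entity values.

```swift
/// Hypothetical helper: replaces `&name;` with the matching value from `entities`.
/// References whose name is not in the table are copied through unchanged.
func expandEntities(in source: String, using entities: [String: String]) -> String {
    var result = ""
    var rest = Substring(source)

    while let amp = rest.firstIndex(of: "&") {
        // Copy everything before the ampersand verbatim.
        result += rest[..<amp]
        let afterAmp = rest.index(after: amp)

        if let semi = rest[afterAmp...].firstIndex(of: ";"),
           let value = entities[String(rest[afterAmp..<semi])] {
            // Known entity: emit its value and continue after the semicolon.
            result += value
            rest = rest[rest.index(after: semi)...]
        } else {
            // Unknown or unterminated reference: keep the ampersand as-is.
            result += "&"
            rest = rest[afterAmp...]
        }
    }

    result += rest
    return result
}

let body = "<root><val>&e; &amp; more</val></root>"
print(expandEntities(in: body, using: ["e": "entity"]))
// prints "<root><val>entity &amp; more</val></root>"
```

The table could come from a first parse pass, since the internal entity declarations are reported even though the expanded text is not (for example delegate.entities.compactMapValues { $0 }). The expanded string would then be parsed a second time so the text arrives through foundCharacters as ordinary character data.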

Julian
  • I can't get this to work either. I even found some older threads (using Objective-C and NSXMLParser) reporting the same problem. If you add any other text around `&e;` then that text is reported. If you change `&e;` to something else, you get the expected error about an unresolved entity. But for some reason, `XMLParser` simply won't send the resolved entity in `foundCharacters` (or any other delegate method). I'd report this as a bug to Apple. – rmaddy Jun 21 '17 at 17:41
  • Thanks for your feedback. If the only responses I get resemble yours, I'll do that. – Julian Jun 22 '17 at 05:31
