The answer here is you're screwed. They are using a non-standard encoding for XML, but what if they really want the literal \U2026? Let's say you add a decoder to handle all \UXXXX and \uXXXX encodings. What happens when another feed wants the data to be the literal \U2026?
Your first choice and best bet is to get this feed fixed. If they need to encode data, they need to use what XML actually supports: numeric character references such as &#x2026;, or properly declared entities.
As a fallback, I would isolate the decoder from the XML parser. Don't create a non-conforming XML parser just because you're getting non-conforming data. Have a post-processor that only runs on the offending feed.
If you must have a decoder, then there is more bad news. There is no built-in decoder, so you will need to find an Objective-C category online or write one yourself.
After some poking around, I think the question "Using Objective C/Cocoa to unescape unicode characters, ie \u1234" may work for you.
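To make the "isolated post-processor" idea concrete, here is a rough, language-neutral sketch in Python rather than Objective-C; the logic should port to an NSString category fairly directly. The names `decode_escapes`, `post_process`, and the feed identifier `"offending-feed"` are made up for illustration. It assumes every escape is exactly four hex digits, as in \U2026, and it does not try to combine surrogate pairs.

```python
import re

# Matches \uXXXX or \UXXXX followed by exactly four hex digits.
# Assumption: the feed never uses longer (8-digit) escapes.
_ESCAPE = re.compile(r"\\[uU]([0-9a-fA-F]{4})")


def decode_escapes(text):
    # Replace each escape sequence with the character its hex code names.
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Hypothetical feed identifiers; only feeds listed here get decoded.
FEEDS_NEEDING_DECODE = {"offending-feed"}


def post_process(feed_id, text):
    # Runs after normal, conforming XML parsing, and only for known-bad feeds.
    if feed_id in FEEDS_NEEDING_DECODE:
        return decode_escapes(text)
    return text
```

Because the decoding is keyed on the feed, a different feed that really does mean the literal characters \U2026 passes through untouched, and the XML parser itself stays conforming.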