I'm currently implementing an Android version of my iOS application and running into issues parsing XML where the text contains a single-quote or double-quote character (it's a dictionary app for a foreign language).
All of my app's data is loaded from an XML resource file. Here's an example of that file:
<entry>
<word>afa'i fā</word>
<definition>See under "afa". Figurative (especially in poetry), king or queen: "hotau afa'i fā".</definition>
</entry>
I retrieve an XmlResourceParser by calling:
XmlResourceParser parser = getResources().getXml(R.xml.data);
parse(parser);
Here's my parsing code:
public void parse(XmlResourceParser parser) throws XmlPullParserException, IOException {
    int eventType = parser.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        switch (eventType) {
            case XmlPullParser.START_TAG:
                startTag(parser.getName(), parser);
                break;
            case XmlPullParser.END_TAG:
                endTag(parser.getName(), parser);
                break;
            case XmlPullParser.TEXT:
                foundText(parser.getText());
                break;
            default:
                break;
        }
        eventType = parser.next();
    }
}
When parsing the text, XmlResourceParser's getText() method drops everything after the ' and picks right back up with the text inside of the next node. Additionally, it just ignores the double quotes. My result looks like this:
(word)
afa
(definition)
See under afa. Figurative (especially in poetry), king or queen: hotau afa
I've scoured the docs and can't find any mention of how to deal with single and double quotes. The only thing I can think of is that XmlResourceParser doesn't like the literal characters and is instead expecting entity codes, but I've tried swapping them out and it still ignores them.
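One way to narrow this down is to check whether the XML itself is the problem. The sketch below is only a sanity check that could be run on a plain JVM outside Android: it uses the JDK's StAX reader (javax.xml.stream) instead of XmlResourceParser, and the names QuoteCheck and readTexts are hypothetical. It collects the text of each node so it can be compared with the original. If the quotes survive here, the literal characters are legal XML text, and the loss would more likely come from how the resource file is processed than from the markup.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the app: reads text nodes with StAX.
public class QuoteCheck {

    // Returns the non-whitespace text of every node, in document order.
    static List<String> readTexts(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character events so each node yields one string.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> texts = new ArrayList<>();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.CHARACTERS && !reader.isWhiteSpace()) {
                texts.add(reader.getText());
            }
        }
        reader.close();
        return texts;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Same shape as the sample entry; \u0101 is the letter a with macron.
        String xml = "<entry>"
                + "<word>afa'i f\u0101</word>"
                + "<definition>See under \"afa\".</definition>"
                + "</entry>";
        for (String text : readTexts(xml)) {
            System.out.println(text);
        }
    }
}
```

This does not exercise Android's resource pipeline at all, so it cannot confirm where the characters are lost; it only rules out the XML syntax as the cause.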