I am in the middle of an Android project where I have to parse an RSS feed in a language other than English (the titles and descriptions are not in English) using SAX. When parsing, I get the following warning in Logcat, and none of the items in the feed are parsed.

**09-10 07:12:33.598: W/System.err(1238): org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 3921: undefined entity**
09-10 07:12:33.598: W/System.err(1238):     at org.apache.harmony.xml.ExpatParser.parseFragment(ExpatParser.java:520)
09-10 07:12:33.598: W/System.err(1238):     at org.apache.harmony.xml.ExpatParser.parseDocument(ExpatParser.java:479)
09-10 07:12:33.598: W/System.err(1238):     at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:318)
09-10 07:12:33.608: W/System.err(1238):     at org.apache.harmony.xml.ExpatReader.parse(ExpatReader.java:275)
09-10 07:12:33.608: W/System.err(1238):     at com.example.news.FeedTableViewActivity$SAXHelper.parseContent(FeedTableViewActivity.java:255)
09-10 07:12:33.608: W/System.err(1238):     at com.example.news.FeedTableViewActivity$ParseIndividualFeedTask.doInBackground(FeedTableViewActivity.java:217)
09-10 07:12:33.608: W/System.err(1238):     at com.example.news.FeedTableViewActivity$ParseIndividualFeedTask.doInBackground(FeedTableViewActivity.java:1)
09-10 07:12:33.608: W/System.err(1238):     at android.os.AsyncTask$2.call(AsyncTask.java:185)
09-10 07:12:33.608: W/System.err(1238):     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:306)
09-10 07:12:33.608: W/System.err(1238):     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
09-10 07:12:33.608: W/System.err(1238):     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1088)
09-10 07:12:33.618: W/System.err(1238):     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:581)
09-10 07:12:33.618: W/System.err(1238):     at java.lang.Thread.run(Thread.java:1019)

None of the English RSS feeds produce this warning, and their items are parsed correctly.

A sample XML file is here, and this is the relevant portion of my code:

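// df is the ContentHandler instance, defined elsewhere (not shown).
public void parseUsingSAX(URL currentUrl)
        throws ParserConfigurationException, SAXException, IOException {
    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = spf.newSAXParser();
    XMLReader xr = sp.getXMLReader();
    xr.setContentHandler(df);
    // Read the feed directly from the URL and tell the parser to treat it as UTF-8.
    InputSource is = new InputSource(currentUrl.openStream());
    is.setEncoding("UTF-8");
    xr.parse(is);
}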

Any help would be highly appreciated! Thanks!

javaCity
  • Perhaps you are facing this problem: Unicode Regex; Invalid XML characters (http://stackoverflow.com/questions/397250/unicode-regex-invalid-xml-characters) – mostafa.S Sep 11 '12 at 04:34
  • mostafa, I tried this, but it gives me a `MalformedURLException` now. I read the response XML into a string and removed all the extraneous characters, but I still get the error. Thank you for your help though! – javaCity Sep 11 '12 at 14:58

1 Answer

The exception tells you that the XML you are parsing has a problem at line 1, column 3921.

I had a similar problem, and it was caused by using the wrong code page (character encoding).

Look at the position the exception points to and check whether the character there is the one you expect.
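One way to inspect that position is to read the response into a string and print a small window of text around the reported line and column. The following is only a minimal sketch: the class `ParseErrorContext`, the method `around`, and the sample string in `main` are hypothetical names and data made up for illustration. It also assumes the reported line and column are 1-based and counted in characters; if the parser counts differently, the window may be off by a few positions, which is why it prints the surrounding text rather than a single character.

```java
public final class ParseErrorContext {

    /**
     * Returns the text within `radius` characters of the given position.
     * Assumption: `line` and `column` are 1-based, as they appear in the
     * parser's error message.
     */
    static String around(String xml, int line, int column, int radius) {
        // Split on LF or CRLF; the -1 limit keeps trailing empty lines.
        String[] lines = xml.split("\\r?\\n", -1);
        if (line < 1 || line > lines.length) {
            throw new IllegalArgumentException("line out of range: " + line);
        }
        String text = lines[line - 1];
        // Clamp the column so an out-of-range value cannot cause an exception.
        int index = Math.min(Math.max(column - 1, 0), text.length());
        int start = Math.max(0, index - radius);
        int end = Math.min(text.length(), index + radius);
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample; in practice pass the downloaded feed text
        // together with the line and column from the exception message.
        String xml = "<rss><title>caf&eacute;</title></rss>";
        System.out.println(around(xml, 1, 16, 10));
    }
}
```

For the error in the question, the call would be something like `around(feedText, 1, 3921, 40)`, where `feedText` is the downloaded feed as a string. If the window shows unexpected or garbled characters, the encoding used to read the feed is a likely suspect.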

AlexS