I am working on an Android RSS reader app. I have a problem reading data from different URLs because the feeds use several different character encodings, e.g. UTF-8 and ISO-8859-1.

I am using Volley's StringRequest to read the RSS content, and I get the following error for some feeds:

BasicNetwork.performRequest: Unexpected response code 404 for http://khabar.ibnlive.com/rss/khabar/ghar-parivar/health.xml

This is the code I use to pass the response to the parser as UTF-8:

int currentapiVersion = android.os.Build.VERSION.SDK_INT;
if (currentapiVersion >= Build.VERSION_CODES.KITKAT) {
    InputStream stream = new ByteArrayInputStream(response.getBytes(StandardCharsets.UTF_8));
    xpp.setInput(stream, null);
} else {
    InputStream stream = new ByteArrayInputStream(response.getBytes(Charset.forName("UTF-8")));
    xpp.setInput(stream, null);
}

The code works fine with UTF-8 feeds such as http://www.oneindia.com/rss/feature-fb.xml, but it shows the above error with feeds that use ISO-8859-1.

I have to read data from multiple RSS feeds, so can anyone help me detect each feed's charset and convert the content to UTF-8? Suggestions for a better approach to this task are also welcome.

Abhishek T.

2 Answers

Try using a charset detector. The feed is not always UTF-8, which is what you are specifying with StandardCharsets.UTF_8.

Recommended reading: What is the most accurate encoding detector?
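
As one lightweight form of detection, many feeds name their encoding in the XML declaration, so it can be read from the first bytes before decoding. The sketch below assumes the raw response bytes are available (for example response.data inside Volley's parseNetworkResponse). The class XmlEncodingSniffer and its methods are hypothetical names, it does not handle byte order marks or UTF-16 feeds, and StandardCharsets needs API 19 or later on Android.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the encoding named in the XML declaration.
final class XmlEncodingSniffer {
    // Matches encoding="..." or encoding='...' inside <?xml ... ?>.
    private static final Pattern ENCODING = Pattern.compile(
            "<\\?xml[^>]*encoding\\s*=\\s*[\"']([A-Za-z0-9][A-Za-z0-9._:-]*)[\"']");

    /** Returns the declared charset, or UTF-8 if none is declared or supported. */
    static Charset sniff(byte[] data) {
        // The declaration itself is plain ASCII in UTF-8 and ISO-8859-x feeds,
        // so decoding the first bytes as ISO-8859-1 is enough for this lookup.
        int len = Math.min(data.length, 200);
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(head);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.UTF_8;
    }

    /** Decodes the whole feed using the sniffed charset. */
    static String decode(byte[] data) {
        return new String(data, sniff(data));
    }
}
```

Another option that may be simpler is to hand the raw bytes to the parser with xpp.setInput(new ByteArrayInputStream(bytes), null): with a null encoding, the XmlPullParser documentation says the parser should try to determine the encoding itself.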

Přemysl Šťastný

I once had the same problem while reading RSS feeds in my Android application. You should check this URL in Postman and see whether it returns proper content. Sometimes the server checks the User-Agent header and returns a response accordingly.

Since you mention you are using Volley to make the network request, you should override the getHeaders() method like this:

@Override
public Map<String, String> getHeaders() throws AuthFailureError {
    Map<String, String> params = new HashMap<String, String>();
    params.put("data-type", "application/text");
    params.put("User-agent", "Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0");
    return params;
}

Also override the parseNetworkResponse method to decode the raw response bytes as a UTF-8 string:

@Override
protected Response<String> parseNetworkResponse(NetworkResponse response) {
    try {
        String utf8String = new String(response.data, "UTF-8");
        return Response.success(utf8String, HttpHeaderParser.parseCacheHeaders(response));
    } catch (UnsupportedEncodingException e) {
        return Response.error(new ParseError(e));
    }
}

I hope this works for you, as it worked for me.
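
The parseNetworkResponse override above always decodes as UTF-8, which would garble a feed that really is ISO-8859-1. A possible refinement is to honor the charset parameter of the Content-Type response header when the server sends one and fall back to UTF-8 otherwise. Volley's HttpHeaderParser.parseCharset(headers, defaultCharset) serves this purpose inside a request class; the plain-Java sketch below only illustrates the idea, and the class ContentTypeCharset and its methods are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: picks the charset from a Content-Type header value.
final class ContentTypeCharset {
    /** Returns the charset parameter of the header value, or the fallback. */
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Parameters follow the media type, e.g. "text/xml; charset=ISO-8859-1".
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String[] pair = parts[i].trim().split("=", 2);
            if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("charset")) {
                String name = pair[1].trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Unknown or malformed charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    /** Decodes the body with the header's charset, defaulting to UTF-8. */
    static String decode(byte[] body, String contentType) {
        return new String(body, fromContentType(contentType, StandardCharsets.UTF_8));
    }
}
```

Some servers send no charset parameter, or a wrong one, so the header is a hint rather than a guarantee.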

R.K.Saini
  • You are writing params.put("User-agent", "Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0"); Why should I use this? – Abhishek T. Jul 23 '16 at 17:31
  • To mimic Mozilla as the user agent, because sometimes a server may give you a different result based on the user agent. If you open the URL http://www.amarujala.com/rss/uttarakhand-news.xml in a browser, it returns XML content, but when you open it in Postman it returns HTML content rather than XML data. – R.K.Saini Jul 23 '16 at 17:42
  • So would I have to write a user agent for every browser? Or will it work for Chrome, Safari, IE and Edge? How can I find that out? – Abhishek T. Jul 23 '16 at 17:49
  • No, you only need to send one: Mozilla, Chrome or IE. I do it with Mozilla. – R.K.Saini Jul 23 '16 at 17:54
  • OK. It's really helpful. Thank you for your answer. – Abhishek T. Jul 24 '16 at 03:56