I am having a problem with data from a JSON feed. I am using the following link from Google:
http://www.google.com/finance/company_news?q=AAPL&output=json
My problem occurs when I parse the data and put it on screen. For some reason, the data is not being decoded properly.
The raw data:
1.) one which must have set many of the company\x26#39;s board on the edge of their
2.) Making Less Money From Next \x3cb\x3e...\x3c/b\x3e
When I bring in the data, I do the following:
DefaultHttpClient httpClient = new DefaultHttpClient();
HttpPost httpPost = new HttpPost(url);
HttpResponse httpResponse = httpClient.execute(httpPost);
HttpEntity httpEntity = httpResponse.getEntity();
is = httpEntity.getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(
is, "iso-8859-1"), 8);
StringBuilder sb = new StringBuilder();
String line = null;
while ((line = reader.readLine()) != null) {
    // Restore the line break that readLine() strips.
    sb.append(line).append("\n");
}
is.close();
json = sb.toString();
The output I receive, using org.json to extract the data from the JSON, is the following (notice the missing backslashes):
1.)one which must have set many of the companyx26#39;s board on the edge of their
2.)Making Less Money From Next x3cbx3e...x3c/bx3e
My current method for handling the first problem is this:
JSONRowData.setJTitle((Html.fromHtml((article.getString(TAG_TITLE).replaceAll("x26", "&")))).toString());
The second one escapes me, though (no pun intended).
I assume this doesn't work because the backslash is used for escape characters. I've tried many different methods of reading the data in, but I've had no luck. Is there a way I can import the data to handle this problem without using regular expressions?
Solution
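Our nemesis today: "\x26" -- an ASCII character code written in hexadecimal notation (0x26 is "&"). JSON defines no \x escape, which is why the parser drops the backslash and leaves "x26" behind.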
Read the raw data into a char array, before it reaches org.json; the Apache Commons IO library is a convenient way to do this. Then loop over the array looking for "\". On a hit, check whether the next position holds "x". If it does, take the following two characters: these are the hex digits of an ASCII code. Convert the hex value to an integer, cast it to a char, and append that char to a StringBuilder.
If the current character does not start such a sequence, append it to the StringBuilder unchanged. Once the loop finishes, call .toString() to get the decoded string.
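The steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code I used: the class name HexEscapeDecoder and the method decode are made up for this example, and it works on a String with charAt rather than a char array from Commons IO. It also assumes every "\x" in the feed starts a hex escape, so it does not treat an escaped backslash ("\\x26") specially.

```java
// Hypothetical helper for illustration; the names are not from any library.
final class HexEscapeDecoder {

    private HexEscapeDecoder() {}

    /** Replaces each \xHH sequence in raw with the character whose code is 0xHH. */
    static String decode(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            // A full escape needs four characters: '\', 'x', and two hex digits.
            if (c == '\\' && i + 3 < raw.length() && raw.charAt(i + 1) == 'x') {
                int hi = Character.digit(raw.charAt(i + 2), 16);
                int lo = Character.digit(raw.charAt(i + 3), 16);
                if (hi >= 0 && lo >= 0) {
                    // Hex -> integer -> char, then skip past the whole escape.
                    out.append((char) (hi * 16 + lo));
                    i += 4;
                    continue;
                }
            }
            // Not a \xHH escape: copy the character through unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }
}
```

Run on the two raw samples from the question, HexEscapeDecoder.decode gives "company&#39;s board" and "Next <b>...</b>", so the result is ordinary HTML-escaped text. Sequences that are not followed by two valid hex digits are left as they are.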
From there, the data may still contain some HTML remnants (&#39; and/or <b> in this case). Using Html.fromHtml() took care of this.