When I display bullet points, copyright symbols, and trademark signs in a web browser, they look fine.

// bullets: http://losangeles.craigslist.org/wst/acc/2900906683.html
// bullets: http://losangeles.craigslist.org/lac/acc/2902059059.html
// bullets: http://indianapolis.craigslist.org/acc/2867115357.html
// bullets: http://indianapolis.craigslist.org/ofc/2885697780.html
// bullets: http://indianapolis.craigslist.org/ofc/2887554512.html
// copyright: http://chicago.craigslist.org/nwc/acc/2854640931.html

But I get "question marks inside triangles" when I use an Android WebView with:

web.loadDataWithBaseURL(null, myHtml, null, "UTF-8", null);

Should I be using a different encoding?

Should I be searching for and replacing certain characters myself, one by one?

Carol
  • UTF-8 is good. If you're on Android 4.0, I had an issue with loadDataWithBaseURL; try just using loadData. – bbedward Mar 08 '12 at 17:37
  • loadData() was even worse. It wouldn't even correctly display some quotes and apostrophes. – Carol Mar 08 '12 at 21:21
  • I can't require OS 4.0. It's only being used on an EXTREMELY small percent of the devices in the world. I'm compiling against v4.0.3, but require only v2.2. Still, same problem. – Carol Mar 08 '12 at 21:26

2 Answers

2

Try using WebView settings

myWebView = (WebView)findViewById(R.id.mywebView);
WebSettings settings = myWebView.getSettings();
settings.setDefaultTextEncodingName("UTF-8");
bbedward
  • Setting UTF-8 with WebSettings seems to have the same problem as using loadDataWithBaseURL(): it can't display bullets. – Carol Mar 14 '12 at 22:52
1

I've run into this problem before. I would make sure that your myHtml String is already correctly decoded before you load it into your WebView. You can check that by logging it with Log.d(). If the characters are already wrong in that String, they won't show properly in the WebView either, and you'll see the same weird characters in LogCat.
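A font can hide what is really in the String, so it may help to log the code points rather than the text itself. The following is a minimal sketch in plain Java, assuming nothing beyond the standard library; `EncodingCheck` and its methods are hypothetical names, not part of the Android API. A "question mark inside a triangle" glyph is typically how U+FFFD, the Unicode replacement character, is drawn, so finding U+FFFD in the String would suggest the bytes were already decoded with the wrong charset before the WebView ever saw them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for inspecting a decoded String; not an Android class. */
final class EncodingCheck {

    /** Returns one "index:U+XXXX" entry for every non-ASCII char in the text. */
    static List<String> nonAsciiCodeUnits(String text) {
        List<String> found = new ArrayList<String>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                found.add(String.format(Locale.ROOT, "%d:U+%04X", i, (int) c));
            }
        }
        return found;
    }

    /** True if the text contains U+FFFD, the Unicode replacement character. */
    static boolean hasReplacementChar(String text) {
        return text.indexOf('\uFFFD') >= 0;
    }
}
```

On Android, the result could be passed to Log.d(), for example `Log.d("enc", EncodingCheck.nonAsciiCodeUnits(myHtml).toString())`, where the tag "enc" is arbitrary.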

If that is the case, make sure that the code reading the data into your myHtml String decodes it explicitly, for example by using an InputStreamReader and passing it "UTF-8" as the character encoding.

I would change the line of code that you're using from:

BufferedReader buffer = new BufferedReader(new InputStreamReader(content), 1000);

to:

BufferedReader buffer = new BufferedReader(new InputStreamReader(content, "UTF-8"), 1000);

The documentation at http://developer.android.com/reference/java/io/InputStreamReader.html describes this version of the constructor (the second one listed) as follows:

Constructs a new InputStreamReader on the InputStream in. The character converter that is used to decode bytes into characters is identified by name by enc. If the encoding cannot be found, an UnsupportedEncodingException error is thrown.
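For context, here is a minimal sketch of how the two-argument InputStreamReader constructor might be used to read a whole stream into a String. The class name `StreamText` and both method names are hypothetical, and the `fromBytes` wrapper exists only so the sketch can be tried without a network connection; it is not the asker's actual reading code, which the thread does not show.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

/** Hypothetical helper showing an explicit charset when reading a stream. */
final class StreamText {

    /** Reads the whole stream, decoding bytes with the named charset. */
    static String readAll(InputStream content, String charsetName) throws IOException {
        BufferedReader buffer =
                new BufferedReader(new InputStreamReader(content, charsetName), 1000);
        try {
            StringBuilder out = new StringBuilder();
            char[] chunk = new char[1000];
            int n;
            while ((n = buffer.read(chunk)) != -1) {
                out.append(chunk, 0, n);
            }
            return out.toString();
        } finally {
            buffer.close();
        }
    }

    /** Convenience wrapper for trying the helper on an in-memory byte array. */
    static String fromBytes(byte[] bytes, String charsetName) {
        try {
            return readAll(new ByteArrayInputStream(bytes), charsetName);
        } catch (IOException e) {
            // Includes UnsupportedEncodingException for an unknown charset name.
            throw new IllegalStateException(e);
        }
    }
}
```

This only helps if the charset name matches how the server actually encoded the page. If the bytes are not UTF-8, decoding them as UTF-8 will still produce wrong characters.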

EDIT: If that doesn't work, you could try using:

String s = EntityUtils.toString(entity, HTTP.UTF_8);

which is taken from the question "Android Java UTF-8 HttpClient Problem".

louielouie
  • LogCat displays a small "box" character... but that's probably because of the font I'm using there. Still not sure what the solution would be. – Carol Mar 15 '12 at 00:08
  • That "box" character is an ASCII 149. So I guess I can do a replaceAll()... and then do all the copyright characters, and trademark signs, etc. But what a waste of resources. – Carol Mar 15 '12 at 01:03
  • I added some code to help you further. You can specify the encoding via that code to avoid doing replaceAll(). – louielouie Mar 15 '12 at 03:43
  • Even with the UTF-8 at the InputStreamReader level... still seeing "question marks inside black triangles" for bullets and certain characters. – Carol Mar 16 '12 at 19:38
  • Sorry that the initial answer didn't work. I think you are getting the HTTP response differently. I looked into it and found a potential alternative code answer. – louielouie Mar 16 '12 at 22:05
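
The value 149 mentioned in the comments is worth a closer look. One possible explanation, not confirmed anywhere in this thread, is that the pages are served in windows-1252 rather than UTF-8: in that charset the byte 149 (0x95) is the bullet character, while the same lone byte is invalid in UTF-8 and is a control character in ISO-8859-1. The sketch below, in plain Java with a hypothetical class name `Byte149`, decodes that single byte three ways so the results can be compared.

```java
import java.nio.charset.Charset;
import java.util.Locale;

/** Hypothetical demo: how one byte (149) decodes under three charsets. */
final class Byte149 {

    /** Decodes raw bytes with the named charset. */
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    /** Formats each char of the text as U+XXXX, separated by spaces. */
    static String codeUnits(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format(Locale.ROOT, "U+%04X", (int) text.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 149};
        for (String name : new String[] {"UTF-8", "ISO-8859-1", "windows-1252"}) {
            System.out.println(name + " -> " + codeUnits(decode(raw, name)));
        }
        // Prints:
        // UTF-8 -> U+FFFD        (replacement character)
        // ISO-8859-1 -> U+0095   (control character, often drawn as a box)
        // windows-1252 -> U+2022 (bullet)
    }
}
```

If the same pattern shows up in the real data, passing "windows-1252" instead of "UTF-8" when decoding the response may be worth trying, on the assumption that the charset is available on the target device.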