0

I want to convert the following code to HTML and show it in a WebView. This basically works, but...

Code to parse:

<img src="http://...jpeg" alt="„Indoor Maps“ von Google" align="left" style="padding-right:5px">\n\n\nEinfachere Navigation in Gebäuden verspricht Indoor Maps von Google. Der Praxis-Test von COMPUTER BILD im Hamburger „Alsterhaus“ verlief aber kurios.<br>Foto: ComputerBILD<br>

Attempt 1) Html.toHtml(Code): The umlauts and quotes in the text were converted correctly and the img tag is still valid (its quotes are intact). But some img attributes, such as alt and align, were removed. Result:

<p><img src="http://...jpeg"> Einfachere Navigation in Geb&#228;uden verspricht Indoor Maps von Google. Der Praxis-Test von COMPUTER BILD im Hamburger &#8222;Alsterhaus&#8220; verlief aber kurios.<br>\nFoto: ComputerBILD<br>\n</p>\n

Attempt 2) External library, org.apache.commons.lang3.StringEscapeUtils.escapeHtml4(Code): All umlauts and quotes were converted and nothing was removed from the img tag. But the img tag is corrupted because its own quotes and angle brackets were escaped as well, so I can't show the image in the WebView. Result:

&lt;img src=&quot;http://...jpeg&quot; alt=&quot;&bdquo;Indoor Maps&ldquo; von Google&quot; align=&quot;left&quot; style=&quot;padding-right:5px&quot;&gt;\n\n\nEinfachere Navigation in Geb&auml;uden verspricht Indoor Maps von Google. Der Praxis-Test von COMPUTER BILD im Hamburger &bdquo;Alsterhaus&ldquo; verlief aber kurios.&lt;br&gt;Foto: ComputerBILD&lt;br&gt;

I know there are a lot of posts on this topic, but I can't find anything on how to convert the HTML without touching the quotes of the attributes. I am stuck.

EDIT

This is the full HTML code:

    StringBuilder html = new StringBuilder();
    html.append("<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">");
    html.append("<html>");
    html.append("<head>");
    html.append("<meta http-equiv=\"Content-Type\" content=\"text/html; charset=ISO-8859-1\">");
    html.append("<title></title>");
    html.append("</head>");
    html.append("<body bgcolor=\"white\" leftmargin=\"0\" topmargin=\"0\">");       
    html.append(code); // code = CODE AT THE TOP (the snippet shown above)
    html.append("</body>");
    html.append("</html>");

When I use UTF-8 I get the same result...

webView.loadData(html.toString(), "text/html", "iso-8859-1");

@Christiaan: This is the current result when I pass the unconverted code to the WebView

Gepro

3 Answers

1

Are you sure you want to use toHtml? It looks like you already have HTML, so you should be using Html.fromHtml() or even nothing at all. Why not keep the string as it is and display it in the WebView?

Christiaan
1

Ah, now it looks like an encoding issue. Try using UTF-8 in your source code, in your HTML, and in the snippet that you want to insert.

As in:

html.append("<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">");

and

webView.loadData(html.toString(), "text/html", "UTF-8");

Make sure the "CODE AT THE TOP" is also in UTF-8.

Make sure your source code is also saved as UTF-8 (search for "Encoding" in your IDE).
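To illustrate why every layer has to agree on the encoding, here is a small plain-Java sketch (no Android classes). The class and method names are hypothetical and only serve the demonstration. It shows what happens when UTF-8 bytes are read as ISO-8859-1, and that ISO-8859-1 cannot represent the typographic quote „ at all.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    /** Encodes the text as UTF-8 and decodes the same bytes as ISO-8859-1. */
    public static String utf8ReadAsLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes the text as ISO-8859-1; unmappable characters become '?'. */
    public static String latin1RoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00e4 is the umlaut a, \u201e is the low opening quote used in the snippet.
        System.out.println(utf8ReadAsLatin1("Geb\u00e4ude")); // two wrong characters instead of the umlaut
        System.out.println(latin1RoundTrip("\u201eAlsterhaus")); // the quote is lost and replaced by '?'
    }
}
```

The umlaut survives an ISO-8859-1 round trip, but the typographic quotes do not, which is one more reason to use UTF-8 consistently here.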

Christiaan
  • As I wrote, UTF-8 gives the same result (look at the image). My question is HOW I can convert that HTML so that the umlauts are escaped while the HTML tags and attributes stay valid. – Gepro Jan 22 '13 at 10:43
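One possible way to get escaped umlauts while leaving the tags and attribute quotes alone is to replace only the characters outside the ASCII range with numeric character references. The following is a minimal sketch in plain Java; the names `NonAsciiEscaper` and `escapeNonAscii` are invented for illustration and are not part of Android or Commons Lang.

```java
public class NonAsciiEscaper {

    /**
     * Replaces every code point above U+007F with a numeric character
     * reference such as &#228; and copies all ASCII characters, including
     * '<', '>' and '"', unchanged.
     */
    public static String escapeNonAscii(String html) {
        StringBuilder out = new StringBuilder(html.length() + 16);
        for (int i = 0; i < html.length(); ) {
            int codePoint = html.codePointAt(i);
            if (codePoint > 0x7F) {
                out.append("&#").append(codePoint).append(';');
            } else {
                out.appendCodePoint(codePoint);
            }
            // Advance by one or two chars, depending on surrogate pairs.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u201e and \u201c are the typographic quotes, \u00e4 is the umlaut a.
        String snippet = "<img src=\"a.jpeg\" alt=\"\u201eIndoor Maps\u201c\" align=\"left\">Geb\u00e4uden";
        System.out.println(escapeNonAscii(snippet));
    }
}
```

Because the result is pure ASCII, it no longer depends on which charset the WebView assumes. The sketch does not validate or repair the markup; it assumes the input is already the HTML you want to show.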
0

I found this post, and now it works :) "Android. WebView and loadData":


myWebView.loadData(myHtmlString, "text/html; charset=UTF-8", null);

This works flawlessly, especially on Android 4.0, which apparently ignores character encoding inside HTML. Tested on 2.3 and 4.0.3.
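A related approach that is sometimes used is to hand the page to loadData as Base64, so that characters such as '#' or '%' in the HTML cannot be misread as URL syntax. The sketch below is plain Java and uses java.util.Base64 as a stand-in (on Android, android.util.Base64 would play that role); the class and method names are hypothetical, and the WebView calls appear only in a comment because they need an Android runtime.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class WebViewPayload {

    /** Encodes the HTML as UTF-8 bytes and then as a Base64 string. */
    public static String toBase64(String html) {
        return Base64.getEncoder().encodeToString(html.getBytes(StandardCharsets.UTF_8));
    }

    /** Reverses toBase64; only used here to check the round trip. */
    public static String fromBase64(String payload) {
        return new String(Base64.getDecoder().decode(payload), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String html = "<p>Geb\u00e4ude</p>";
        String payload = toBase64(html);
        System.out.println(payload);
        // On Android (assumption, not runnable here):
        //   myWebView.loadData(payload, "text/html; charset=UTF-8", "base64");
        // or, without Base64:
        //   myWebView.loadDataWithBaseURL(null, html, "text/html", "UTF-8", null);
    }
}
```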

Gepro