I've been searching on Google, but I haven't found an answer to my problem.

I was testing my Java application and noticed that when I create a username with accented characters (HTML special characters such as á, é, í, ó, ú), those characters are not displayed correctly. For example, a user called Álvaro shows up as �lvaro.

Do you know of any function in Java that converts those special characters?

Jesse
David
  • How are you obtaining the output stream to which you write these characters? What headers are you serving the HTML with? – Mike Samuel May 17 '13 at 12:17

3 Answers

You need to escape the HTML characters using StringEscapeUtils.escapeHtml from Apache Commons Lang (in Commons Lang 3 the method is named escapeHtml4).

 StringEscapeUtils.escapeHtml("Álvaro");
Subhrajyoti Majumder

You can also change your HTML page encoding to UTF-8.

For an HTML page created in Eclipse, insert this header at the top of the page:

<%@ page language="java"
contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
Manuel Pires
  • Isn't this JSP specific? For an arbitrary HTML page, you should use a `<meta>` tag as explained at [meta charset vs http-equiv](http://stackoverflow.com/questions/4696499/meta-charset-utf-8-vs-meta-http-equiv-content-type) – Mike Samuel May 17 '13 at 12:16

Ideally, you serve your HTML with a Content-type header that specifies the charset used to encode the HTML.
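
To see why the declared charset matters, here is a minimal sketch of what most likely produced �lvaro: the bytes were written in one charset and read back in another. The class name CharsetMismatchDemo and the helper roundTrip are made up for illustration, and the choice of ISO-8859-1 on the server side is only an assumption about the asker's setup.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Hypothetical helper: encodes text the way a server would write it,
    // then decodes the bytes the way a browser would if it assumed browserCharset.
    static String roundTrip(String text, Charset serverCharset, Charset browserCharset) {
        byte[] wire = text.getBytes(serverCharset);
        return new String(wire, browserCharset);
    }

    public static void main(String[] args) {
        // "Álvaro", written with a Unicode escape so the source file's own
        // encoding cannot affect the demonstration.
        String name = "\u00C1lvaro";

        // Assumed mismatch: bytes written as ISO-8859-1, read as UTF-8.
        // The single byte 0xC1 is not valid UTF-8, so the decoder substitutes
        // the replacement character U+FFFD for it.
        String broken = roundTrip(name, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(broken.equals(name));

        // Matching charsets on both sides: the name survives intact.
        String intact = roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(name));
    }
}
```

When the header says `charset=UTF-8` and the response body is really written as UTF-8, the browser decodes with the same charset the server encoded with, which corresponds to the second call.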

If that's not an option, the easiest way to encode non-ASCII characters so that you can serve the HTML with any charset is to use numeric entities: 'Á' -> &#193;.

If you know that your content is already HTML, then the below will escape it so that it can be served using a wide variety of encodings including ASCII and UTF-8.

public static String escapeHTML(String htmlTextNodeValue) {
  int n = htmlTextNodeValue.length();
  int encoded = 0;           // index just past the last character already copied to out
  StringBuilder out = null;  // allocated lazily, only if something needs escaping
  for (int i = 0, charCount; i < n; i += charCount) {
    int codePoint = htmlTextNodeValue.codePointAt(i);
    charCount = Character.charCount(codePoint);

    if (codePoint > 0x7f
        || codePoint == '<' || codePoint == '>' || codePoint == '&'
        || codePoint == '"' || codePoint == '\'') {
      if (out == null) { out = new StringBuilder(n + 1024); }
      // Copy the run of unescaped characters that precedes this one.
      out.append(htmlTextNodeValue, encoded, i);
      encoded = i + charCount;
      switch (codePoint) {
        case '<': out.append("&lt;"); break;
        case '>': out.append("&gt;"); break;
        case '&': out.append("&amp;"); break;
        // Everything else, including both quote characters, becomes a decimal numeric entity.
        default:  out.append("&#").append(codePoint).append(';');
      }
    }
  }
  if (out != null) {
    // Append whatever remains after the last escaped character.
    return out.append(htmlTextNodeValue, encoded, n).toString();
  } else {
    // Nothing needed escaping, so return the input unchanged.
    return htmlTextNodeValue;
  }
}
Mike Samuel