
I need to escape some text before displaying it on a web page, and the escaping itself is being done correctly. However, when I display the String in HTML, the escape sequences are still shown. The following is an example:

hello there my ni&%ame is + - && !

and the respective string with escaping is the following:

hello there my ni&%ame is + - && !

I've read somewhere that the core taglib will only escape the basic characters such as >, <, ", \t and space. However, none of these escape sequences are removed from the HTML code. Does anyone know how to solve this problem? Thanks.

The following is part of the code used to convert a specific character to its escape sequence:

while (character != CharacterIterator.DONE) {
    if (character == '<') {
        result.append("&lt;");
    }
    else if (character == '>') {
        result.append("&gt;");
    }
    else if (character == '&') {
        result.append("&amp;");
    } .....
    return result;
}

The escaping part is done and works perfectly. The problem occurs when I try to display the string with escaped characters on an HTML page.
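For reference, a self-contained version of a loop like the one above might look as follows. This is only a sketch: the class name `HtmlEscaper`, the method name `escape`, and the final `else` branch that copies all other characters unchanged are assumptions, since the original snippet is elided.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

public class HtmlEscaper {

    /** Replaces the characters <, > and & with their HTML entities. */
    public static String escape(String text) {
        StringBuilder result = new StringBuilder();
        CharacterIterator iterator = new StringCharacterIterator(text);
        char character = iterator.current();
        while (character != CharacterIterator.DONE) {
            if (character == '<') {
                result.append("&lt;");
            } else if (character == '>') {
                result.append("&gt;");
            } else if (character == '&') {
                result.append("&amp;");
            } else {
                // Assumption: every other character is passed through as-is.
                result.append(character);
            }
            character = iterator.next();
        }
        return result.toString();
    }
}
```

Calling `HtmlEscaper.escape` on the example string turns each `&` into `&amp;` and leaves `+`, `-`, `%` and `!` untouched.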

BalusC
ict1991
  • Okay, you should explicitly mention and tag that as such. "Java" is an extremely broad subject and "taglib" is not specific enough. Well, as to your concrete problem, I fail to see/understand why exactly that is a problem. You may see them being escaped in generated HTML source, but you should see the desired characters properly in the browser view. Please elaborate the concrete problem. – BalusC Jan 11 '12 at 23:06
  • @NikitaBeloglazov I managed to escape the string in the first place. The problem is that when I try to display the escaped string in HTML, the escape sequences are still being displayed and are not converted to their respective characters. – ict1991 Jan 11 '12 at 23:06
  • Escaping != removing. Why would you want some characters to be removed? Escaping consists in replacing special HTML characters by an escape sequence so that they can be displayed and are not interpreted as HTML markup. You should also show us how you're escaping, because there's no reason for + and - to be escaped: they're not special HTML chars. Same for \t and space. – JB Nizet Jan 11 '12 at 23:09
  • Can you provide some code where you insert string to page? – Mikita Belahlazau Jan 11 '12 at 23:11
  • What I meant was that I would like the &amp; to be replaced with the actual & when displayed in HTML, rather than &amp; being printed literally. The second sentence above is how it is being displayed in HTML, and I would like it to be displayed 'normally', as in sentence one. Sorry for not explaining myself clearly. – ict1991 Jan 11 '12 at 23:12
  • Show us your code. You're not escaping properly, or you're doing it twice. – JB Nizet Jan 11 '12 at 23:13
  • Did you try to insert string to page without manually escaping? – Mikita Belahlazau Jan 11 '12 at 23:17
  • yes and then the string is displayed correctly – ict1991 Jan 11 '12 at 23:17
  • How are you displaying the string? (And why aren't you using any of several existing mechanisms to do all this?) – Dave Newton Jan 11 '12 at 23:19
  • So why do you need to escape string manually, if it's displayed correctly without escaping. I think your taglib does this work for you. – Mikita Belahlazau Jan 11 '12 at 23:20
  • If you use <c:out> to write your escaped string, you're escaping it a second time, since <c:out>'s point **is** to escape. The diagnostic would be easier if you showed us how you display your string, as already asked several times. – JB Nizet Jan 11 '12 at 23:21

2 Answers

if (character == '<') {
    result.append("&lt;");
}
else if (character == '>') {
    result.append("&gt;");
// ...

Remove this. You don't need it. The JSTL <c:out> already does this job.

<c:out value="${someBean.someProperty}" />

Otherwise your HTML string is escaped twice: each & becomes &amp; again, and so on. If you really need to take the escaping into your own hands (why?), then just don't use <c:out> at all:

${someBean.someProperty}

or turn off its escaping by escapeXml="false":

<c:out value="${someBean.someProperty}" escapeXml="false" />
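The double escaping can be reproduced outside of JSP with a few lines of plain Java. This is a minimal sketch, not the JSTL implementation: `DoubleEscapeDemo` and its `escape` method are hypothetical stand-ins that handle only <, > and &, whereas <c:out> also escapes quote characters.

```java
public class DoubleEscapeDemo {

    // Hypothetical stand-in for an HTML escaper. The & replacement must run
    // first, otherwise the & inside &lt; and &gt; would be escaped again.
    public static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String raw = "ni&%ame is + - && !";
        String once = escape(raw);    // a single pass: correct HTML source
        String twice = escape(once);  // manual escaping followed by a second pass
        System.out.println(once);     // ni&amp;%ame is + - &amp;&amp; !
        System.out.println(twice);    // ni&amp;amp;%ame is + - &amp;amp;&amp;amp; !
    }
}
```

A browser decodes only one level of entities, so the twice-escaped string is rendered with a visible `&amp;` wherever the original had `&`, which matches the symptom described in the question.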
BalusC
  • ohh ok i think i was escaping it twice then.... i did not know that the c:out escapes as well – ict1991 Jan 11 '12 at 23:24
  • You're welcome. The point of `<c:out>` is to HTML-escape user-controlled input in order to prevent XSS attacks. See also http://stackoverflow.com/questions/2658922/xss-prevention-in-java – BalusC Jan 12 '12 at 00:29

BalusC has nailed it.

A couple of additional points:

  • If you have problems with web pages not looking right, one of the things you should do is look at the raw HTML using your web browser's "view source" function. In this case, it would have shown the double escaping and led to a quicker realization of what the problem was.

  • In HTML, you should only need to escape <, > and & (plus quote characters inside attribute values). Other characters should work just fine provided that your HTML is encoded in UTF-8 (and the content type says so too).

Stephen C