I have this HTML escaping method:

public static String stringToHTMLString(String string) {
    StringBuffer sb = new StringBuffer(string.length());
    // true if last char was blank
    boolean lastWasBlankChar = false;
    int len = string.length();
    char c;

    for (int i = 0; i < len; i++)
        {
        c = string.charAt(i);
        if (c == ' ') {
            // A blank gets extra work: replacing every blank
            // with &nbsp; would lose word breaking, so only
            // every second blank in a run is replaced.
            if (lastWasBlankChar) { // NOT going into this loop
                lastWasBlankChar = false;
                sb.append("&nbsp;");
                }
            else {
                lastWasBlankChar = true;
                sb.append(' ');
                }
            }
        else {
            lastWasBlankChar = false;
            //
            // HTML Special Chars
            if (c == '"')
                sb.append("&quot;");
            else if (c == '&')
                sb.append("&amp;");
            else if (c == '<')
                sb.append("&lt;");
            else if (c == '>')
                sb.append("&gt;");
            else if (c == '\n')
                // Handle Newline
                sb.append("&lt;br/&gt;");
            else {
                int ci = 0xffff & c;
                if (ci < 160 )
                    // nothing special only 7 Bit
                    sb.append(c);
                else {
                    // Not 7 Bit use the unicode system
                    sb.append("&#");
                    sb.append(Integer.toString(ci));
                    sb.append(';');
                    }
                }
            }
        }
    return sb.toString();
}

When I pass it the string "bo y", it returns "bo y" unchanged. When I change the input string to "bo>y", it correctly escapes the >. Any idea why the space isn't being escaped?

Thanks.

r123454321

3 Answers


Works fine when I run it, I get:

stringToHTMLString("This is  a   multi-space      test")
This is &nbsp;a &nbsp; multi-space &nbsp; &nbsp; &nbsp;test

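Hmm, now that I think about it, were you expecting the first space to be escaped? Follow the logic: lastWasBlankChar is initially false, so the first space in a run is emitted as a plain space, and after that the output alternates between &nbsp; and a plain space. A single space on its own is therefore never replaced.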
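To make the alternation concrete, here is a minimal sketch that keeps only the space-handling branch of the method from the question. The class name SpaceEscapeDemo and the method name escapeSpaces are hypothetical, made up for this illustration; the other escapes are left out on purpose.

```java
// Hypothetical demo class: isolates the space-handling branch of the question's method.
class SpaceEscapeDemo {
    static String escapeSpaces(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        // Starts false, so the first space of any run is emitted unchanged.
        boolean lastWasBlankChar = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ' ') {
                if (lastWasBlankChar) {
                    // Second, fourth, ... space in a run becomes &nbsp;
                    lastWasBlankChar = false;
                    sb.append("&nbsp;");
                } else {
                    // First, third, ... space in a run stays a plain space
                    lastWasBlankChar = true;
                    sb.append(' ');
                }
            } else {
                lastWasBlankChar = false;
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeSpaces("bo y"));   // prints: bo y
        System.out.println(escapeSpaces("bo  y"));  // prints: bo &nbsp;y
        System.out.println(escapeSpaces("bo   y")); // prints: bo &nbsp; y
    }
}
```

So "bo y" coming back unchanged is the expected result; only the second space of a pair is turned into &nbsp;.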

This doesn't answer your actual question, but a better way of doing what you're trying to do is to use CSS's white-space: pre-wrap; on the element... if you can get away with supporting IE8+. Otherwise, for older IE, you have to use

white-space: normal !important;
white-space: pre-wrap;
word-wrap: break-word;

Your definition of 7-bit safe characters is also... interesting. Might be better to use UTF-8 unless you have to support Windows 98, rather than manually escaping unusual characters, and probably drop non-formatting control codes entirely.
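As a rough sketch of that suggestion, assuming the page is served as UTF-8: escape only the four HTML-significant characters, pass everything else through untouched, and drop control codes. The class and method names are hypothetical, and treating tab, newline and carriage return as the "formatting" control codes worth keeping is my own assumption.

```java
// Hypothetical sketch: escape only HTML-significant characters and rely on UTF-8 for the rest.
class MinimalHtmlEscape {
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:
                    // Assumption: keep tab, newline and carriage return, drop other control codes.
                    boolean keptControl = (c == '\t' || c == '\n' || c == '\r');
                    if (keptControl || !Character.isISOControl(c)) {
                        sb.append(c); // non-ASCII characters are left as-is
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("bo>y"));          // prints: bo&gt;y
        System.out.println(escapeHtml("\"R&B\" <live>")); // prints: &quot;R&amp;B&quot; &lt;live&gt;
    }
}
```

Whitespace is not touched here at all; that part would be left to the CSS above.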

SilverbackNet
  • Ahhhh, yeah, okay. Yeah, misread the code. Essentially, what I'm trying to do is write a method that can take in a search query, and pass that search query into another method that uses a music website's API to search for artists. Obviously, I need to take care of spaces and such in the input string. What is your suggestion for the easiest way to take care of this? As of now, the code doesn't handle spaces that are of length 1; would the most reasonable solution just be to change this aspect of the code, or is there a standard that I don't know of for doing something like this? Thanks! – r123454321 Jul 28 '12 at 00:55
  • It doesn't handle them since HTML doesn't collapse a single space, so you don't need it. If you REALLY want to, you can make lastWasBlankChar true to start, but then you break word-wrapping because every single space becomes non-breaking. – SilverbackNet Jul 28 '12 at 00:59

Judging by your comments, I believe you want to escape a string to be used in a URL for a music website's API.

I suggest you take advantage of the standard library rather than writing this by hand.

You can use: java.net.URLEncoder.encode(String s, String encoding)

e.g.

URLEncoder.encode(searchQuery, "UTF-8");

Source: Encoding URL query parameters in Java
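A small self-contained sketch of how that call might be wrapped. The class name QueryEncodeDemo and the helper encodeQuery are hypothetical; only URLEncoder.encode itself comes from the JDK. The String overload declares UnsupportedEncodingException, so it has to be caught even though every JVM is required to support UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class: wraps URLEncoder.encode for a search query parameter.
class QueryEncodeDemo {
    static String encodeQuery(String searchQuery) {
        try {
            return URLEncoder.encode(searchQuery, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Should not happen: UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeQuery("bo y")); // prints: bo+y
        System.out.println(encodeQuery("bo>y")); // prints: bo%3Ey
    }
}
```

Note that URLEncoder produces form encoding, so a space becomes + rather than %20. That is normally fine for a query-string value, but only encode the value itself, not the whole URL.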

Nick Eaket

Looks like Stack Overflow may have escaped your second string.
Was the second "bo y" supposed to be "bo&nbsp;y"?

Nick Eaket
  • (Unfortunately) not. When I pass "bo y" as the argument (with the space between o and y), it returns exactly what I put in, still with a space between the o and y. – r123454321 Jul 28 '12 at 00:48