Java 5 HTML escaping To Prevent XSS

Question

I'm looking into some XSS prevention in my Java application.

I currently have custom-built routines that escape any HTML stored in the database for safe display in my JSPs. However, I would rather use a built-in/standard method to do this if possible.

I am not currently encoding data that gets sent to the database but would like to start doing that as well.

Are there any built in methods that can help me to achieve this?

Please beware that the accepted answer below is an incomplete and naive approach. `Encoding the "big 5" serves exactly the purpose it was designed for: prevents injecting HTML markup with illegal characters inside tags and attribute values. However it does not prevent more elaborate injections, does not help with "out of range characters = question marks" when outputting Strings to Writers with single byte encodings, nor prevents character reinterpretation when user switches browser encoding over displayed page` http://www.owasp.org/index.php/How_to_perform_HTML_entity_encoding_in_Java — Pool, Mar 05 '10 at 01:47
Would you not say that the accepted answer is in general correct, it's just that a larger range of characters needs to be accounted for when the HTML is escaped? — AJM, Mar 08 '10 at 12:04
It's good to keep the door locked, but probably best not to leave the windows wide open. It is dangerous advice. — Pool, Mar 22 '10 at 19:18
Escaping is not the solution, particularly if you want users to enter a subset of HTML through a rich text editor like TinyMCE. – Adam Gent, Feb 11 '11 at 12:52

score 11 · Accepted Answer · edited May 23 '17 at 12:34

You normally escape against XSS at display time, not at storage time. In JSP you can use the JSTL `<c:out>` tag or the `fn:escapeXml` function for this (to install JSTL, just drop jstl-1.2.jar in /WEB-INF/lib). E.g.

<input name="foo" value="<c:out value="${param.foo}" />">

or

<input name="foo" value="${fn:escapeXml(param.foo)}">

That's it. If you also escape while processing the input and/or storing it in the DB, the escaping ends up spread over the business code and/or the database. You should not do that: it only causes maintenance trouble, and you risk double-escaping or worse when you escape in different places (e.g. `&` would become `&amp;amp;` instead of `&amp;`, so that the end user would literally see `&amp;` instead of `&` in the view). The code and DB are not sensitive to XSS. Only the view is. You should therefore escape it only right there.
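
To make the double-escape risk concrete, here is a minimal sketch in plain Java. It is not the JSTL implementation: `HtmlEscapeDemo` and its `escapeXml` method are hypothetical names, and the numeric entities used for the quote characters are an illustrative choice. It escapes the same five characters (`&`, `<`, `>`, `"`, `'`) and shows what happens when a value is escaped once versus twice.

```java
public class HtmlEscapeDemo {

    /** Escapes the five characters that are significant in HTML/XML markup. */
    public static String escapeXml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&#034;"); break;
                case '\'': out.append("&#039;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String stored = "Tom & Jerry <script>";
        String once = escapeXml(stored);  // escaped once, at display time
        String twice = escapeXml(once);   // escaped again, e.g. on store AND on display
        System.out.println(once);   // Tom &amp; Jerry &lt;script&gt;
        System.out.println(twice);  // Tom &amp;amp; Jerry &amp;lt;script&amp;gt;
    }
}
```

A browser renders the first line as `Tom & Jerry <script>` in plain text, which is what you want. It renders the second line as the literal text `Tom &amp; Jerry &lt;script&gt;`, which is the visible symptom of escaping in more than one place.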

Update: you've posted 4 topics about the same subject.

I will only warn you: you do not need to escape it in servlet/filter/javacode/database/whatever. You're only unnecessarily overcomplicating things. Just escape it during display. That's all.

answered Feb 25 '10 at 12:26

BalusC

Yes, maybe I should stop posting questions now... Dealing with it only on output is the conclusion I'm drawn to, but I notice some sites advise that you encode on output at the very least and then also encode on store on top of that. – AJM Feb 25 '10 at 13:13
Yup, but do you know of a fairly definitive source that I could point someone to who is in love with the idea of persisting encoded HTML, in order to change their mind? – AJM Feb 25 '10 at 13:23
It's nothing more than logical. It makes data unportable. The data is tied to a specific view. The data is also unmaintainable. You don't know what was escaped and what was not. Taking social actions in social sites is also impossible. You can't tell from data whether the user has malicious intents or so. You should **never** change user input. Only escape it from maliciousness right at the moment the data is about to be processed. E.g. escape SQL injections at the moment the data is to be persisted (PreparedStatement) and escape XSS at the moment the data is to be shown in HTML (fn:escapeXml) – BalusC Feb 25 '10 at 13:40
-1, performing HTML entity encoding on tags is not sufficient to prevent XSS exploits. You should also use a white list of allowed characters. – Pool Feb 25 '10 at 13:45
See here for a list of vulnerabilities associated with specific characters: http://ha.ckers.org/charsets.html You're giving very poor advice and you're not even aware of it. – Pool Feb 25 '10 at 16:11
@BalusC I have to say I agree with @Pool on this. I also don't agree with the logic that you should never have any sort of policy to block input or change user input (never?? really). – Adam Gent Feb 11 '11 at 12:59
@Adam: if the intent is to redisplay user-submitted data unescaped as HTML (which clearly is the case, based on your RTE comment on the question), then you should rather use a [whitelist](http://stackoverflow.com/questions/4206850/how-to-prevent-javascript-injection-xss-when-jstl-escapexml-is-false/4207292#4207292). This intent is however different from preventing to display *any* user-submitted data unescaped as HTML as stated in the question. – BalusC Feb 11 '11 at 13:07
@BalusC escaping still does not prevent XSS, and sanitizing on input will prevent *any user submitted data* from doing bad things. Not wanting to sanitize on input is much more of a fringe case, and you can deal with the exceptions (perhaps with an annotation). The other reason you don't want malicious data is if you provide any sort of REST API to the Internet. You may do the right thing on output but your mashup partners may not. – Adam Gent Feb 11 '11 at 13:40
@Adam: The question concerns a JSP site, not a REST API. Not sanitizing output is simply a poor practice. Sanitizing input is unnecessary when output is sanitized. – BalusC Feb 11 '11 at 13:52
@BalusC I think this boils down to an opinion. I think my way scales better, and I can guarantee with much better certainty that I will never have XSS even if I miss a c:out. It seems you would rather trade safety for principle and purity. I would not. I mean, which is more embarrassing: stripping out data that end users should not have been entering, or a completely broken site because of XSS? – Adam Gent Feb 11 '11 at 15:12

futtta · Answer 2 · 2010-02-25T12:30:14.353

5

Not built-in, but check out the OWASP ESAPI filter; it should do what you're looking for and more. It is a great open source security library written by the smart guys and girls at OWASP ("Open Web Application Security Project").

edited Feb 25 '10 at 12:30

answered Feb 25 '10 at 12:24

futtta

That link is dead. Maybe https://owasp.org/www-project-enterprise-security-api/ will help – UglyBlueCat Jul 25 '22 at 13:55

Adam Gent · Answer 3 · 2011-02-11T13:49:24.053

I have to say I rather disagree with the accepted answer, which apparently relies on escaping on output alone to prevent XSS.

I believe the better approach is to sanitize on input, which can easily be achieved with an aspect so that you don't have to put it all over the place. Sanitizing is different from escaping.

You can't just blindly escape:

- You may want users to enter a subset of HTML (e.g. links and bold tags).
- Escaping does not prevent XSS.

I recommend using the OWASP AntiSamy library with an aspect, or @futtta's recommendation of the filter.

Below is an aspect I wrote to sanitize user input using Spring MVC annotations (since we use that for all of our input).

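import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
// Referenced only inside the pointcut string below.
import org.springframework.web.bind.annotation.RequestMapping;

// UserInputSanitizer is my own helper class (not shown here).
@SuppressWarnings("unused")
@Aspect
public class UserInputSanitizerAdivsor {

    // Runs around every method annotated with @RequestMapping.
    @Around("execution(@RequestMapping * * (..))")
    public Object check(final ProceedingJoinPoint jp) throws Throwable {
        Object[] args = jp.getArgs();
        if (args != null) {
            for (int i = 0; i < args.length; i++) {
                // instanceof is false for null, so no separate null check is needed.
                if (args[i] instanceof String) {
                    // Replace each String argument with its sanitized form.
                    args[i] = UserInputSanitizer.sanitize((String) args[i]);
                }
            }
        }
        // Invoke the handler method with the sanitized arguments.
        return jp.proceed(args);
    }
}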

You will still have to escape on output for non rich-text fields but you will never (and I believe should never) have malicious data in your database.

If you don't want to sanitize certain inputs, you can always create an annotation that tells the aspect not to sanitize them.
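
As a rough sketch of how such an opt-out could work, the plain-Java example below skips any parameter carrying a marker annotation. Everything in it is hypothetical and not part of the aspect above: the `@NoSanitize` annotation, the `SanitizeOptOutSketch` class, the `sanitizeArgs` helper and the stand-in sanitizer (a real one would delegate to a library such as AntiSamy). It assumes Java 8 or later for lambdas and `Method.getParameters()`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.function.UnaryOperator;

public class SanitizeOptOutSketch {

    /** Hypothetical marker: parameters carrying it are passed through untouched. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    public @interface NoSanitize {}

    /**
     * Returns a copy of args in which every String argument has been run through
     * the sanitizer, except those whose parameter is annotated with @NoSanitize.
     */
    public static Object[] sanitizeArgs(Method method, Object[] args,
                                        UnaryOperator<String> sanitizer) {
        Parameter[] params = method.getParameters();
        Object[] result = args.clone();
        for (int i = 0; i < result.length; i++) {
            boolean optedOut = params[i].isAnnotationPresent(NoSanitize.class);
            if (!optedOut && result[i] instanceof String) {
                result[i] = sanitizer.apply((String) result[i]);
            }
        }
        return result;
    }

    // Example handler used only for the demo below: the second parameter opts out.
    public static void save(String title, @NoSanitize String rawBody) {}

    public static void main(String[] args) throws Exception {
        Method m = SanitizeOptOutSketch.class.getMethod("save", String.class, String.class);
        // Stand-in sanitizer for the demo only; it just strips angle brackets.
        UnaryOperator<String> stub = s -> s.replace("<", "").replace(">", "");
        Object[] out = sanitizeArgs(m, new Object[] {"<b>hi</b>", "<b>raw</b>"}, stub);
        System.out.println(out[0] + " | " + out[1]);  // bhi/b | <b>raw</b>
    }
}
```

Inside an AspectJ advice, the `Method` could be obtained with `((MethodSignature) jp.getSignature()).getMethod()`, and the returned array passed to `jp.proceed(...)`.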

The other reason you don't want malicious data in your database is if you provide any sort of REST API to the Internet. You may do the right thing on output but your mashup partners may not.

Sanitizing input or blocking input is OK (I mean, most people have a file upload limit, right?). Most of the fields in a web application don't need script tags to be entered, and more importantly most of your users probably do not need or want to enter script tags (the obvious exception being Stack Overflow answers).

This simply goes against the general OWASP recommendation. Sanitizing input is problematic because it in many cases incorrectly sanitizes perfectly legal data because some characters *might* be harmful when displayed un-escaped. — PålOliver, Jan 20 '16 at 12:00
The reality is you need to do both particularly if you allow HTML input. Just because you sanitize doesn't mean you have to store it as well. You can just fail and tell the user why. — Adam Gent, Feb 04 '16 at 09:38
Can you give an example of a String that will execute after being escaped? I've tried a bunch and none execute in my JSP after escaping. – Philip Rego, Jan 20 '20 at 17:51
