How to use AntiXss with a Web API

Question

This is a question that has been asked before, but I've not found the information I'm looking for or maybe I'm just missing the point so please bear with me. I can always adjust my question if I'm asking it the wrong way.

For example, I have a POST endpoint that takes a simple DTO object with 2 properties (i.e. CompanyRequestDto), and one of those properties contains a script tag. When I call my endpoint from Postman I use the following:

{
   "company": "My Company<script>alert(1);</script>",
   "description": "This is a description"
}

When it is received by the action in my endpoint,

public void Post(CompanyRequestDto companyRequestDto)

my DTO object will automatically be set and its properties will be set to:

companyRequestDto.Company = "My Company<script>alert(1);</script>";
companyRequestDto.Description = "This is a description";

I clearly don't want this information to be stored in our database as is, nor do I want it stored as an escaped string as displayed above.

1) Request: So my first question is how do I throw an error if the DTO posted contains some invalid content such as the <script> tag?

I've looked at Microsoft AntiXss but I don't understand how to apply it here, as the data provided in the properties of a DTO object is not an HTML string but just a plain string. What am I missing? I don't see how it helps with sanitizing or validating the passed data.

When I call

var test = AntiXss.AntiXssEncoder.HtmlEncode(companyRequestDto.Company, true);

It returns an encoded string, but then what??

Is there a way to remove disallowed keywords or just simply throw an error?

2) Response: Assuming 1) was not implemented or didn't work properly and the value ended up being stored in our database, am I supposed to return encoded data as a JSON string? So instead of returning:

"My Company<script>alert(1)</script>"

Am I supposed to return:

"My Company&lt;script&gt;alert(1)&lt;/script&gt;"

Is the browser (or whatever app) just supposed to display as below then?:

"My Company&lt;script&gt;alert(1)&lt;/script&gt;"

3) Code: Assuming there is a way to sanitize or throw an error, should I apply it at the property level, using an attribute on every property of my various DTO objects, or is there a way to apply it at the class level, using an attribute that will validate and/or sanitize all string properties of a DTO object?

I found interesting articles but none really answering my problems or I'm having other problems with some of the answers:

asp.net mvc What is the difference between AntiXss.HtmlEncode and HttpUtility.HtmlEncode?
Stopping XSS when using WebAPI (currently looking into this one but don't see how example is solving problem as property is always failing whether I use the script tag or not)
how to sanitize input data in web api using anti xss attack (also looking at this one but having a problem calling ReadFromStreamAsync from my project at work. Might be down to some of the settings in my web.config but haven't figured out why but it always seems to return an empty string)

Thanks.

UPDATE 1:

I've just finished going through the answer from Stopping XSS when using WebAPI

This is probably the closest one to what I am looking for. Except I don't want to encode the data, as I don't want to store it in my database, so I'll see if I can figure out how to throw an error but I'm not sure what the condition will be. Maybe I should just look for characters such as <, >, ; , etc... as these will not likely be used in any of our fields.

Possible duplicate of https://softwareengineering.stackexchange.com/questions/117512/should-i-html-encode-all-output-from-my-api?rq=1 — Michael Freidgeim, May 03 '21 at 22:55

score 2 · Answer 1 · answered Nov 10 '19 at 21:28

1) You need to consider where your data will be used when you think about encoding. Data with <script> in it is only a problem if it's rendered as HTML, so if you are going to display user-provided data anywhere, the point of display is probably where you want to HTML-encode it (you want to avoid repeatedly HTML-encoding the same string when saving it, for example).
2) Again, it depends what the response is going to be used for. You probably want to HTML-encode it at the point it's going to be displayed. Remember that if you encode something in the response it may no longer match what's in your data, so if the calling code does something like call your API to search for a company with that name, that could cause problems. If the browser does display the HTML-encoded version it might look ugly, but that's better than users being compromised by XSS attacks.
3) It's quite difficult to sanitize text for things like tags if you allow most characters for normal use. It's easier if you can whitelist the allowed characters and only allow, say, alphanumerics, but that often isn't possible. Where it is, it can be done using a regex validation attribute on the DTO object. The best approach, I think, is to encode values for display if you can't block certain characters. It's really difficult to allow all characters but still avoid things like <script>, as people can start using ASCII character codes etc.
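
To make the whitelist idea concrete, here is a minimal sketch of a regex validation attribute on the DTO, using the standard System.ComponentModel.DataAnnotations attributes. The allowed character sets and the DtoValidation helper are my own assumptions for illustration, not something from the question, so adjust them to whatever your fields really need.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical DTO modelled on the one in the question.
public class CompanyRequestDto
{
    // Assumed whitelist: letters, digits, spaces and a little punctuation.
    [Required]
    [RegularExpression(@"^[\p{L}\p{N} .,'&-]+$",
        ErrorMessage = "Company contains characters that are not allowed.")]
    public string Company { get; set; }

    // Assumed looser rule for free text: anything except angle brackets.
    [RegularExpression(@"^[^<>]*$",
        ErrorMessage = "Description must not contain < or >.")]
    public string Description { get; set; }
}

// Hypothetical helper that runs the same attributes outside of model binding.
public static class DtoValidation
{
    public static bool IsValid(object dto, out List<ValidationResult> errors)
    {
        errors = new List<ValidationResult>();
        // validateAllProperties: true is needed so that attributes other
        // than [Required] are evaluated as well.
        return Validator.TryValidateObject(
            dto, new ValidationContext(dto), errors, validateAllProperties: true);
    }
}
```

In a Web API controller the same attributes should be picked up by model binding, so checking ModelState.IsValid at the top of the action and returning BadRequest(ModelState) would reject the payload from the question before anything is saved. I have not tested that against your project, so treat it as a starting point.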

Phil

Regarding 1, AntiXss is used in the client (ASP.NET MVC web app), but the pen test was run against the API only and they reported this as an issue. I thought it would be up to the client to handle this, but it has been recommended to us, which I think is fair, that we should also protect against such attacks at the API level, to ensure at least that data containing script tags is not saved into our database. I will update my question in a sec with some findings based on one of the SO questions I'm looking at. – Thierry Nov 10 '19 at 21:35
interesting... can you whitelist the characters that you can allow for your fields? it's much easier problem to solve if you just don't allow things into your database – Phil Nov 10 '19 at 21:42
@Thierry, have a look at the answer by Sahin Dohan on this post, looks like Sahin found htmlsanitizer to be better... https://stackoverflow.com/questions/12618432/stopping-xss-when-using-webapi – Phil Nov 10 '19 at 21:45
I've already sent him a question as I was having a problem with his answer. I'll look at it again, but my problem is that I'm not dealing with HTML but with JSON, so I'm not sure how htmlsanitizer fits in to ensure that the provided JSON is sanitized or an error is thrown. I'm more and more inclined to include basic validation in a model-state validation attribute that just rejects characters such as <, >, />, (, ), ; if detected. It might be very basic, and I'll check with the business tomorrow as it might suffice. – Thierry Nov 10 '19 at 22:02
Regarding your point 2), this is why I think I would prefer throwing an error if any of these invalid characters are detected as I don't want encoded data to be stored in our database. Just wouldn't make sense. – Thierry Nov 10 '19 at 22:05
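
For the "reject these characters on every string property" idea discussed in the comments above, one possible sketch is a small reflection helper that scans all public string properties of a DTO. The class name, method name and character set below are assumptions for illustration; the character set is taken from the comment and should be confirmed against what the fields really need.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper: reports which string properties contain a disallowed character.
public static class StringPropertyScanner
{
    // Assumed blacklist, based on the characters mentioned in the comments.
    private static readonly char[] Disallowed = { '<', '>', '(', ')', ';' };

    public static IList<string> FindOffendingProperties(object dto)
    {
        if (dto == null) throw new ArgumentNullException(nameof(dto));

        return dto.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            // Only readable, non-indexer properties of type string.
            .Where(p => p.PropertyType == typeof(string)
                        && p.CanRead
                        && p.GetIndexParameters().Length == 0)
            .Where(p =>
            {
                var value = (string)p.GetValue(dto);
                return value != null && value.IndexOfAny(Disallowed) >= 0;
            })
            .Select(p => p.Name)
            .ToList();
    }
}
```

A helper like this could be called from an action filter or at the top of the action, returning a 400 response when the list is not empty. It only looks at top-level properties, so nested objects and collections would need extra handling.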
