An HTML page has a text box for Employee Name, another text box for Employee Age, and a Save button that, when clicked, calls the Web API method SaveEmployeeData to save the data. The Web API is hosted in an ASP.NET website and all its methods are written in C#.

Question

How can I prevent the end user from entering HTML or script into the Employee Name and Employee Age text boxes in this situation? I am looking for an attribute that I can apply to the properties in the code below. Even if such text is submitted, the Web API should respond with validation errors.

//Web API method below
public HttpResponseMessage SaveEmployeeData(EmployeeDetails ed)
{
   //code omitted
}

//Type of parameter passed to above Web API method
public class EmployeeDetails
{
    [Required]
    [StringLength(1000,MinimumLength=10)]
    public string FullName { get; set; }
    public int Age { get; set; }

}

UPDATE 1

I tried the regular expression suggested by samir, but it does not seem to allow even simple alphabetic input, as shown in the screenshot below. The URL for this online regex tester is: http://regex.cryer.info/. So I think a different regular expression is needed for the Employee Name value.

[Screenshot: RegEx not working]

UPDATE 2

I was able to get the regular expression suggested by samir to work.

The code change I made to allow letters (from any language), digits, space, apostrophe, period and dash is shown below. The regular expression attribute applied to the FullName property ensures that no HTML or script is submitted when calling the Web API method SaveEmployeeData.

   //Type of parameter passed to above Web API method
    public class EmployeeDetails
    {
        [Required]
        [StringLength(1000,MinimumLength=10)]
        [RegularExpression(@"^[\p{L}0-9 .'-]+$", ErrorMessage = "HTML or Script not allowed")]
        public string FullName { get; set; }
        public int Age { get; set; }

    }
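
One detail worth checking in a character class written as `[\p{L} .'-(0-9)]`: inside square brackets a hyphen between two characters forms a range, so `'-(` is read as "apostrophe through opening parenthesis", and the parentheses are literal members rather than a group. A literal dash is then rejected while `(` and `)` are accepted. The sketch below is a minimal comparison, assuming the intent is letters, digits, space, period, apostrophe and dash; the class and method names are hypothetical and only `System.Text.RegularExpressions.Regex` is used.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamePatternDemo
{
    // Hypothetical pattern for illustration: letters from any language, ASCII digits,
    // space, period, apostrophe, and a literal hyphen. The hyphen is placed last in
    // the class so it cannot be interpreted as a range operator.
    public const string NamePattern = @"^[\p{L}0-9 .'-]+$";

    // The class as written with "'-(" in the middle, kept only for comparison.
    public const string ParenthesizedPattern = @"(^[\p{L} .'-(0-9)]+$)";

    public static bool IsValidName(string value)
    {
        return value != null && Regex.IsMatch(value, NamePattern);
    }

    public static bool MatchesParenthesizedPattern(string value)
    {
        return value != null && Regex.IsMatch(value, ParenthesizedPattern);
    }

    public static void Main()
    {
        string[] samples =
        {
            "Francisco D'Souza", "Anne-Marie", "Mike123",
            "Mike (Jr.)", "<script>alert(1)</script>"
        };

        foreach (string s in samples)
        {
            Console.WriteLine("{0}: hyphen-last={1}, parenthesized={2}",
                s, IsValidName(s), MatchesParenthesizedPattern(s));
        }
    }
}
```

Both patterns reject input containing `<` or `>`, so either one blocks HTML tags; they differ only in how they treat dashes and parentheses.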
Sunil

2 Answers

I would suggest something robust as featured in this response.

A user typing malicious code into the page is no more likely than someone posting malicious input directly to your API. Yes, sanitize and control what the user inputs, but also sanitize what is sent to your API, and then sanitize and validate what your API receives.

There are numerous ways to restrict the characters allowed in html text boxes. Check here, and here.

I'm less knowledgeable about the API side of things. I suggest you continue researching or hopefully someone else can expand on that.
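
For the API side, the data-annotation attributes on the model are what drive validation: ASP.NET Web API evaluates them during model binding and reports the outcome through `ModelState`. The sketch below is a self-contained approximation that runs the same attributes with `Validator.TryValidateObject` so it can be tried in a console program. The helper class `EmployeeValidation` and its method name are hypothetical, and the name pattern is an assumed variant that puts the hyphen last in the character class.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class EmployeeDetails
{
    [Required]
    [StringLength(1000, MinimumLength = 10)]
    [RegularExpression(@"^[\p{L}0-9 .'-]+$", ErrorMessage = "HTML or Script not allowed")]
    public string FullName { get; set; }

    public int Age { get; set; }
}

public static class EmployeeValidation
{
    // Hypothetical helper: returns the validation error messages for a model,
    // or an empty list when every attribute check passes.
    public static List<string> GetErrors(EmployeeDetails ed)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(ed);

        // Passing true for validateAllProperties checks every attribute,
        // not only [Required].
        Validator.TryValidateObject(ed, context, results, true);

        var messages = new List<string>();
        foreach (ValidationResult result in results)
        {
            messages.Add(result.ErrorMessage);
        }
        return messages;
    }

    public static void Main()
    {
        var good = new EmployeeDetails { FullName = "Francisco D'Souza", Age = 30 };
        var bad = new EmployeeDetails { FullName = "<script>alert(1)</script>", Age = 30 };

        Console.WriteLine("good: {0} error(s)", GetErrors(good).Count);
        foreach (string message in GetErrors(bad))
        {
            Console.WriteLine("bad: {0}", message);
        }
    }
}
```

Inside an actual Web API action, the equivalent check would typically be `ModelState.IsValid`, returning a 400 Bad Request response that carries the model state when it is false.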

Adam Heeg

You can use a regular expression on the client side, the server side, or both.

The following regex validates Employee Name:

"^[\\p{L} .'-]+$"

where,

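\\p{L} matches any kind of letter from any language. The doubled backslash is string-literal escaping; in a C# verbatim string (@"...") write a single backslash: \p{L}.

The trailing " .'-" in the character class allows space, dot, single quote and hyphen in Employee Name.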

E.g.

Francisco D'Souza

Éric

André

For age, you can use the regex below, which accepts 1 through 99 (with an optional leading zero for single digits):

"^(0?[1-9]|[1-9][0-9])$"

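Because the Age property in the question is an `int` rather than a string, a range attribute may be a more natural fit than a regex on the C# side: non-numeric input such as markup cannot be converted to an `int` in the first place, and the attribute then only has to bound the value. The sketch below assumes the same 1 to 99 bounds as the regex above; the class names `EmployeeAge` and `AgeValidation` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class EmployeeAge
{
    // Assumed bounds, chosen to mirror the 1-99 range of the age regex.
    [Range(1, 99, ErrorMessage = "Age must be between 1 and 99")]
    public int Age { get; set; }
}

public static class AgeValidation
{
    // Hypothetical helper: true when the age passes the [Range] check.
    public static bool IsValid(int age)
    {
        var model = new EmployeeAge { Age = age };
        var results = new List<ValidationResult>();
        return Validator.TryValidateObject(
            model, new ValidationContext(model), results, true);
    }

    public static void Main()
    {
        foreach (int age in new[] { 0, 1, 42, 99, 100 })
        {
            Console.WriteLine("{0} -> {1}", age, IsValid(age) ? "VALID" : "INVALID");
        }
    }
}
```
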
Example in PHP

<?php
$name = "Francisco D'Souza";

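// The "u" modifier makes PCRE treat the pattern and subject as UTF-8,
// which \p{L} needs in order to match accented letters such as "É".
if (!preg_match("/^[\\p{L} .'-]+$/u", $name)) {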
    echo "INVALID"; 
}
else
{
    echo "VALID";
}
?>

OUTPUT

VALID // Francisco D'Souza

INVALID // <html>

Samir Selia
  • Excellent solution. Thanks. It works, and since it's server-side, it's the best protection in this case. – Sunil Jul 30 '15 at 17:22
  • I used your regular expression but it's always giving an error, even when I input something like this: abcdef. I added this attribute to the employee name property: [RegularExpression(@"^[\\p{L} .'-]+$", ErrorMessage="HTML or Script not allowed")] – Sunil Jul 30 '15 at 18:04
  • Though your idea is correct, the regular expression seems to be not correct. – Sunil Jul 30 '15 at 18:20
  • I do not have much knowledge of c# but have added a code snippet in PHP that works perfect. Please see my updated post. – Samir Selia Jul 30 '15 at 18:48
  • I will try making your regex work. Maybe there is some special syntax for using it in Web API, but I'm not sure. – Sunil Jul 30 '15 at 18:55
  • Your regex does not work as I have explained in UPDATE 1. Also, digits must be allowed like in 'Mike123' . If I can make your regex work I will mark it back as answer. – Sunil Jul 30 '15 at 19:20
  • I had to remove the double backslash before 'p' and replace it with a single backslash. Also, I had to add allowance for digits. So the final expression that worked is: [RegularExpression(@"(^[\p{L} .'-(0-9)]+$)", ErrorMessage = "HTML or Script not allowed")] – Sunil Jul 30 '15 at 19:51
  • Would \p{L} also match digits in a non-english culture? I would like to match alphabets as well as digits in another non-english culture. – Sunil Aug 05 '15 at 17:26
  • `\p{L}` will not match digits. Try `\p{N}` for digits. Reference: http://www.regular-expressions.info/unicode.html – Samir Selia Aug 08 '15 at 07:02
  • Ok. Thanks. So, `\p{N}` will match both digits and letters – Sunil Aug 08 '15 at 15:42
  • No. `\p{L}` for letters and `\p{N}` for digits. Haven't tried non-english digits. Please try and share results with us. – Samir Selia Aug 08 '15 at 15:56
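
The open question about non-English digits in the last few comments can be checked with a short .NET test. The sketch below assumes the .NET regex engine, where `\p{L}` is the Unicode letter category, `\p{N}` covers all number categories, and `\p{Nd}` covers decimal digits only. The class and method names are hypothetical, and the sample strings are written as Unicode escapes to avoid source-encoding problems.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeCategoryDemo
{
    public static bool IsLettersOnly(string s)
    {
        return Regex.IsMatch(s, @"^\p{L}+$");
    }

    // \p{Nd}: decimal digits from any script.
    public static bool IsDecimalDigitsOnly(string s)
    {
        return Regex.IsMatch(s, @"^\p{Nd}+$");
    }

    // \p{N}: any number character, including fractions and letter-like numerals.
    public static bool IsNumbersOnly(string s)
    {
        return Regex.IsMatch(s, @"^\p{N}+$");
    }

    public static bool IsLettersOrNumbers(string s)
    {
        return Regex.IsMatch(s, @"^[\p{L}\p{N}]+$");
    }

    public static void Main()
    {
        string arabicIndic = "\u0661\u0662\u0663"; // Arabic-Indic digits one, two, three
        string devanagari = "\u0967\u0968\u0969";  // Devanagari digits one, two, three
        string oneHalf = "\u00BD";                 // vulgar fraction one half

        Console.WriteLine(IsLettersOnly(arabicIndic));
        Console.WriteLine(IsDecimalDigitsOnly(arabicIndic));
        Console.WriteLine(IsDecimalDigitsOnly(devanagari));
        Console.WriteLine(IsDecimalDigitsOnly(oneHalf));
        Console.WriteLine(IsNumbersOnly(oneHalf));
        Console.WriteLine(IsLettersOrNumbers("Mike" + arabicIndic));
    }
}
```

If only digits should be accepted alongside letters, `\p{Nd}` is the narrower choice; `\p{N}` also admits characters such as fractions.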