I am writing an online form that I validate client-side with JavaScript and server-side with PHP. Should I also be validating that the user doesn't include non-ASCII characters?

For instance, could these languages throw errors or cause other problems if I don't handle such characters, or should they be capable of dealing with them?

Thanks in advance.

  • Depends on the form, doesn't it? What reason would there be to restrict the charset, other than improper encoding handling? – mario Jul 08 '16 at 22:48
  • See also: [How to support UTF-8 completely in a web application](http://stackoverflow.com/q/279170) – mario Jul 08 '16 at 22:48
  • In this case, it is just a registration form - though I was wondering is it possible that someone could use special characters to purposefully crash a system by causing such languages to throw errors when met with unknown characters? – harryjamesuk Jul 08 '16 at 22:50
  • @harryjamesuk I would worry more about delimiters like `'` or `"`. They can cause more trouble. A non-ASCII character, however, should be as harmless as `a`. – Sebastian Simon Jul 08 '16 at 22:55
  • @Xufox Thanks for your reply - that's what I was looking for :) – harryjamesuk Jul 08 '16 at 22:57
  • @Xufox "I would worry more about delimiters like ' or "." --- that is such a harmful statement for a newbie. – zerkms Jul 09 '16 at 01:21
  • @harryjamesuk What they suggested is, in general, harmful. There is nothing wrong with quote characters. You should worry about how you use **any** data, not about any particular characters. – zerkms Jul 09 '16 at 01:21

1 Answer

Don't limit your users to ASCII unless there is a good reason to do so. UTF-8 encoded Unicode is supported by almost everything, so make it your default.

If you're not familiar with character encoding, I'd recommend this article: http://www.joelonsoftware.com/articles/Unicode.html
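As a rough illustration of validating without restricting to ASCII, here is a minimal client-side sketch. The function name `validateName`, the 64-character limit, and the choice to reject control characters are all assumptions made for the example, not requirements from the question. The same kind of check would still need to be repeated server-side in PHP.

```javascript
// Hypothetical helper: the name, the length limit, and the rules below
// are illustrative assumptions, not a definitive validation policy.
function validateName(raw) {
  if (typeof raw !== "string") {
    return { ok: false, reason: "not a string" };
  }

  // Normalize to NFC so visually identical input compares equal,
  // then drop leading and trailing whitespace.
  const value = raw.normalize("NFC").trim();

  // Count Unicode code points rather than UTF-16 code units.
  const length = Array.from(value).length;
  if (length < 1 || length > 64) {
    return { ok: false, reason: "length out of range" };
  }

  // Reject control characters; letters from any script are accepted.
  if (/\p{Cc}/u.test(value)) {
    return { ok: false, reason: "control character" };
  }

  return { ok: true, value };
}
```

With this sketch, names such as `Zoë` or `日本語` pass, while an empty string or a value containing a NUL character is rejected. Note that it checks the shape of the input only; how the value is later used (in SQL, HTML, and so on) still has to be handled separately, for example with prepared statements and output escaping.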

bsa