Should I screen out odd characters from names?

Question

From "Personal names in a global application: What to store" and "How can I validate a name, middle name, and last name using regex in Java?":

I have read that you can't really validate names because of the international possibilities: long names, multiple names, unusual names. The general verdict is to avoid validation and play it safe instead, which means allowing all possible characters and combinations and just printing the name as HTML-safe markup.

But what about special characters? ~~Shift + "one to nine" series and others~~ Should I just allow them to be stored in the database and "play it safe", or should I screen them out?

I would also like users of my program to enter names responsibly (though I can't guarantee that). At some point there should be enforced rules, but without totally locking out people who have a legitimate reason to use $ or @ in their names.

I'm on PHP and JS, but the same goes for any language that validates input.
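
For the "store everything, print it as HTML-safe markup" approach mentioned above, a minimal JS sketch might look like the following. The helper name `escapeHtml` and the sample name are illustrative assumptions, not part of any library; in PHP the built-in `htmlspecialchars()` plays a similar role.

```javascript
// Hypothetical helper: escape the five HTML-significant characters on output.
function escapeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// The name is stored unchanged; only the rendered markup is escaped.
const storedName = "M@rk <b>O'Neil</b> $";
const markup = '<li>' + escapeHtml(storedName) + '</li>';
```

The point of the sketch is that `$` and `@` pass through untouched, while characters that could be interpreted as markup are neutralised at display time rather than rejected at input time.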

EDIT:

I should note that I don't mean just Shift 1-9; that's just what I call them. I also mean special characters beyond those. Sorry for the confusion.

Here's the thing: my application is like a library application. A book has a title, an author, and a year. While the title and year may go in one table, I want the authors listed in another table. These inputs come from the users. I'm going to implement autocomplete for the authors, but the autocomplete data is based on what users enter, so its reliability depends on the author names they type.

How does Facebook implement this? I haven't seen any friends using special characters, unlike in the Friendster days, when every search put people with numeric or special-character names first, which is not great for an autocomplete.

It would be great to name your kid M@rk! But I think it's safe to leave SHIFT 1-9 out actually. — Jules, Jan 12 '12 at 07:58
so if someone would place |)exter? this won't show up, especially if i'd use an autocomplete. — Joseph, Jan 12 '12 at 08:06
What do you gain by screening these characters out? If users want to enter a fake name, they will, whatever you screen out. And if they don't want, then they will not use these characters. They could do a typo, but so what? They'll fix it if they want. — JB Nizet, Jan 12 '12 at 08:07
Yes, I agree with JB Nizet. If their name is Jeff and they enter Marc, you can't screen that out anyway, so it really doesn't matter much. I wouldn't bet my life on all data ever entered in a form being legit. As long as you protect yourself from SQL injection and all other attacks, you should be good. — Jules, Jan 12 '12 at 08:11
Also, UTF-8 covers many thousands of different characters, and there's a good chance that a whole lot of them never appear in any name in any language. Why screen out 9 of them and not all the others? — JB Nizet, Jan 12 '12 at 08:16
Can't you just let them enter what they want, but sort weird symbols after letters for things like autocomplete? — Carson Myers, Jan 12 '12 at 19:25
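
The idea in the last comment (accept everything, but rank symbol-initial names after letter-initial ones) could be sketched roughly as follows. The function names `startsWithLetter` and `suggest`, the substring matching, and the sample author list are all assumptions made for illustration.

```javascript
// Hypothetical helper: true when the name begins with a letter in any script.
// \p{L} requires the "u" flag (Unicode property escapes, ES2018+).
function startsWithLetter(name) {
  return /^\p{L}/u.test(name.trim());
}

// Return names containing the query, letter-initial names first,
// each group in locale-aware alphabetical order.
function suggest(names, query) {
  const needle = query.toLowerCase();
  return names
    .filter((n) => n.toLowerCase().includes(needle))
    .sort((a, b) => {
      const rank = Number(!startsWithLetter(a)) - Number(!startsWithLetter(b));
      return rank !== 0 ? rank : a.localeCompare(b);
    });
}

const authors = ['|)exter', '$am', 'Émile Zola', 'Dexter Morgan', 'exter'];
const results = suggest(authors, 'ex');
// results: ['Dexter Morgan', 'exter', '|)exter']
```

With this ordering, "|)exter" still shows up in the suggestions, but it no longer crowds out conventionally spelled names at the top of the list.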

score 0 · Accepted Answer · answered Jan 12 '12 at 09:03

Shift + "one to nine" doesn’t really specify a set of characters, as what such combinations produce depends on the keyboard. If you mean the characters in the Shift positions of keys 0 to 9 on standard US keyboards, then I have to admit that I have never seen a person’s real name (as opposed to a nickname) with such characters. But I would not bet on their absolute absence from names. Yesterday, I learned that some orthography of the Venetian language uses “£” (pound sign) as a letter. Moreover, people might use easily available characters as replacements for characters they cannot easily produce on a keyboard, e.g. using “!” instead of “ǃ” (U+01C3 Latin letter retroflex click) or “e^” instead of “ê”.

The question is what you expect to gain by excluding some characters. To catch typos?
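
If catching typos is the goal, one option is to warn rather than reject. The following is only a sketch under a stated assumption: that letters and combining marks from any script, plus space, period, apostrophe and hyphen, count as "ordinary". The name `nameWarning` and that character set are hypothetical choices, and a legitimate name such as one using "£" would trigger the warning, which is exactly why it does not block the input.

```javascript
// Assumption: these characters are "ordinary" in names; everything else is
// merely unusual, not invalid. \p{L} and \p{M} need the "u" flag.
const ORDINARY = /^[\p{L}\p{M} .'’-]+$/u;

// Hypothetical helper: returns a warning string, or null when nothing looks unusual.
// The caller shows the warning but still accepts and stores the name.
function nameWarning(name) {
  const trimmed = name.trim();
  if (trimmed === '') return 'Name is empty.';
  if (ORDINARY.test(trimmed)) return null;
  return 'This name contains unusual characters. Is it spelled as intended?';
}
```

A form could display the returned message next to the field and let the user submit anyway, so a typo gets a second look while an unusual but real name is never locked out.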
