3

Possible Duplicate:
Capitalization of Person names in programming

I've noticed that people registering on my site are exceptionally lazy: they don't even bother capitalizing their own names.

My site is a business oriented one, so there's no "freedom of self-expression" argument here.

First name capitalization is pretty easy as I can't think of a single instance where a Western first name would not start with a capital letter. I could be wrong.

But capitalizing the last name gets more difficult, with names like

O'Brien
O´Flaherty
de Wit
McKenzie
Auditore da Firenze
de los Remedios de Escalada
Virta-Jokela

What would be a good solution to proper automatic capitalization of surnames in PHP that would get it right some 95% of the time? I am aware of this.

Emphram Stavanger
  • You are aware you shouldn't, but you want to anyway? –  May 14 '12 at 09:24
  • I'm not the one who makes the decisions. :P – Emphram Stavanger May 14 '12 at 09:25
  • 1
    Let your users have a say at the very least on how they see their names on the website :( – Andreas Wong May 14 '12 at 09:28
  • 2
    What do you mean by get it right 95% of the time? Do you want 95% of the surnames in your database to be correct? One approach could be to check whether a surname contains any capitals. If so, you can assume the user deliberately chose the capitalization of their own surname. If not, just capitalize the first letter. – Mischa May 14 '12 at 09:30
  • 2
    This bun fight has been had over and over on SO, please don't add to it. Do it if you must, but don't ask here and add to the pointless debating. –  May 14 '12 at 09:30
  • khaled: I have separate fields for the names, so a string that would be processed would contain only the last name. || NiftyDude: I'm hoping I can convince higher-ups to allow me to include a checkbox that would allow users to turn off automatic capitalization :) – Emphram Stavanger May 14 '12 at 09:31
  • You can make a list of name types and write a regex for each type, then keep adding new regexes whenever you come across a new name format. A sample regex for handling one such type can be found here: http://stackoverflow.com/questions/275160/regex-for-names – Sunil Kartikey May 14 '12 at 09:35
  • `McDonald !== MacDonald !== Macdonald` – TRiG May 14 '12 at 09:38
  • Are you aware of [ucwords](http://php.net/ucwords) ? – Shiplu Mokaddim May 14 '12 at 09:39
  • Refer to http://stackoverflow.com/questions/2466706/capitalization-of-person-names-in-programming – piyush May 14 '12 at 09:42
  • 5
    *"allow me to include a checkbox that would allow users to turn off automatic capitalization"* - Please don't do this. Who would check that box? I would be frustrated just by the existence of such a checkbox! Don't waste your user's time by including stupid form fields, it's just going to scare them away. Do The Right Thing and don't mess with the capitalization. Use some social skills and write a simple text explaining to the user that the name they enter is what other users will see, and hope they themselves have the good judgement to pick a sane capitalization. – Emil Vikström May 14 '12 at 09:55
  • This [link](http://stackoverflow.com/questions/8735798/make-first-letter-uppercase-and-the-rest-lowercase-in-a-string) gives a partial solution – khaled_webdev May 14 '12 at 10:49
  • The fact that so many people are giving so many duplicate links multiple times over in the same comment thread speaks volumes... – BoltClock May 14 '12 at 10:49
  • Try [NameCase](https://github.com/tamtamchik/namecase) – webaholik Feb 13 '19 at 14:15
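
Mischa's suggestion in the comments above (leave any name that already contains a capital alone, otherwise capitalize only the first letter) could be sketched like this. The function name is made up for illustration, and `ucfirst()` only handles ASCII initials:

```php
<?php
// Hypothetical helper illustrating the "trust any existing capitals" idea.
function capitalize_if_all_lowercase(string $surname): string
{
    // Any uppercase letter present: assume the user chose the casing on purpose.
    if (preg_match('/\p{Lu}/u', $surname) === 1) {
        return $surname;
    }
    // Otherwise capitalize just the first letter (ASCII only).
    return ucfirst($surname);
}
```

With this, `McKenzie` and `de Wit` pass through untouched, while an all-lowercase `de wit` still comes out as `De wit`, so it only helps users who made some effort themselves.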

3 Answers

5

Here's a quick and dirty solution off the top:

  1. Split the string into words separated by whitespace and dash
  2. For each word:
    • If it's inside a fixed list of stop words ("de", "los", etc), do not modify.
    • If not, check if it has a prefix in a fixed list (this list would contain things like "O'", "Mc", etc). If such a prefix exists then normalize it (e.g. translating O" to O') and move to the next step considering the word without the prefix.
    • Uppercase the first letter of the word.
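
The steps above can be sketched roughly as follows. This is only one way to read them: the function name, the contents of both lists and the choice to keep the original separators are assumptions for illustration, and the lists would need extending as new names turn up.

```php
<?php
// Illustrative lists, not exhaustive.
const STOP_WORDS = ['de', 'da', 'los'];
// Typed variant (lowercase) => normalized prefix.
const PREFIXES = ["o'" => "O'", "o´" => "O'", 'mc' => 'Mc'];

function capitalize_surname(string $surname): string
{
    // Step 1: split on whitespace and dashes, keeping the separators
    // (they land at the odd indexes of $parts).
    $parts = preg_split('/([\s\-]+)/', trim($surname), -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $word) {
        if ($i % 2 === 1 || in_array(strtolower($word), STOP_WORDS, true)) {
            continue; // separator or stop word: leave untouched
        }
        $prefix = '';
        foreach (PREFIXES as $typed => $normalized) {
            $len = strlen($typed);
            // Byte-wise, case-insensitive prefix check.
            if (strlen($word) > $len && strncasecmp($word, $typed, $len) === 0) {
                $prefix = $normalized;
                $word = substr($word, $len);
                break;
            }
        }
        // Uppercase the first letter of what is left (ASCII only;
        // accented initials would need the mbstring functions).
        $parts[$i] = $prefix . ucfirst($word);
    }
    return implode('', $parts);
}
```

Under these assumptions, all-lowercase versions of the names in the question come back in the desired form, for example `capitalize_surname('de los remedios de escalada')` gives `de los Remedios de Escalada`. Names that legitimately start with a capitalized `De` or `Da` are left exactly as the user typed them.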
Jon
1

At first it seems like an easy job: simply capitalize every word you encounter in the last name. So foo bar will become Foo Bar.

However, as you already pointed out, there are exceptions:

  • de Wit
  • Auditore da Firenze
  • de los Remedios

This can be solved with a blacklist of fragments you don't want capitalized ('de', 'da', 'de los' given this example). But then you falsely assume that 'De', 'Da' and 'De Los' do not exist as (parts of) last names that should be capitalized.

So simply put: no, this can't be done well, only half-wittedly.

CodeCaster
-1

This makes the first character of each word capital: `$name = ucwords(strtolower($name));`

This will be accurate in more than 95% of cases.

It is not clear how you want de los Remedios de Escalada to come out. With the above expression it becomes De Los Remedios De Escalada. I am not sure whether that is desired.
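
To see what that expression does with the names from the question, here is a small check. The wrapper function names are only for illustration. By default `ucwords()` treats whitespace as the only word boundary, so apostrophes and hyphens do not start a new word unless you pass them in its optional delimiters argument:

```php
<?php
// Hypothetical wrapper around the one-liner above.
function naive_capitalize(string $name): string
{
    return ucwords(strtolower($name));
}

// Same idea, but also treating "-" as a word boundary.
function naive_capitalize_hyphen(string $name): string
{
    return ucwords(strtolower($name), " -");
}

echo naive_capitalize("o'brien"), PHP_EOL;                     // O'brien
echo naive_capitalize('MCKENZIE'), PHP_EOL;                    // Mckenzie
echo naive_capitalize('virta-jokela'), PHP_EOL;                // Virta-jokela
echo naive_capitalize('de los remedios de escalada'), PHP_EOL; // De Los Remedios De Escalada
echo naive_capitalize_hyphen('virta-jokela'), PHP_EOL;         // Virta-Jokela
```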

Another way is to explode the name by " " and capitalize the first character of the last word using ucfirst().
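
A minimal sketch of that second idea, assuming the surname arrives as a single space-separated string (the function name is made up, and `ucfirst()` is used for the single word):

```php
<?php
// Hypothetical helper: capitalize only the last space-separated word.
function capitalize_last_word(string $surname): string
{
    $words = explode(' ', trim($surname));
    $last = count($words) - 1;
    $words[$last] = ucfirst($words[$last]); // ASCII initials only
    return implode(' ', $words);
}
```

This turns `de wit` into `de Wit`, but `auditore da firenze` becomes `auditore da Firenze` and `de los remedios de escalada` becomes `de los remedios de Escalada`, so multi-part surnames still come out wrong.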

piyush