3

I would like to capitalize names properly, which in this case means:

  • The first letter is capitalized.
  • The first letter after a space is capitalized ('Van Helsing', not 'Van helsing')
  • The first letter after a dash is capitalized ('Johnson-Smith', not 'Johnson-smith')
  • No other letters are capitalized.

The first and last requirements are easily handled:

CONCAT(LEFT(name, 1), LOWER(RIGHT(name, LENGTH(name) - 1)))

The others are harder. I've written a 54-times-nested REPLACE statement (not by hand of course, I used Excel)

REPLACE(REPLACE(REPLACE(REPLACE(...,' b',' B'),'-b','-B'),' a',' A'),'-a','-A')

but I feel like there must be a more elegant and maintainable solution. Any ideas?

If there's a built-in function that is similar but not identical to my requirements that would probably be fine.


Edit: This script will only run on names that have lost their capitalization already, so there's no danger in mis-handling obscure names. Handling apostrophes like spaces and dashes would be nice, though. A glance through the current data shows that many (~30%) of the names have at least one of [-' ].

Charles
  • 11,269
  • 13
  • 67
  • 105
  • 1
    The 'more elegant solution' is to store the data in the format you want to begin with. Or use a script to pull it out and reformat. SQL really isn't the right tool for this. – Matthew Smith Oct 22 '10 at 14:09
  • I may have to pull out the data and use Perl, but if there was a good method in SQL I'd prefer to use it. – Charles Oct 22 '10 at 14:12

5 Answers5

4

What about "Van den Berg" (which has a lower case 'd'). What about "McDonald" or "O'Reilly".

It's generally considered a bad idea to over-validate people's names, because there's always going to be a legitimate name which breaks your validation.

See also my answer to a previous similar question here: How to "Validate" Human Names in CakePHP?

Community
  • 1
  • 1
Spudley
  • 166,037
  • 39
  • 233
  • 307
  • It would be acceptable for "Van den Berg" to be changed to "Van Den Berg". This is not a script that will run on all names but one which will run on names which are presently in all capital letters (for various reasons). We *will* need to have a secretary or the like review the names, but a script like this should cut down the work required dramatically. – Charles Oct 22 '10 at 14:11
  • 1
    My fault -- I should have specified that initially. I've edited it into the question now. – Charles Oct 22 '10 at 14:19
  • @Charles: No worries. Given that, I've commented on one of the answers which I think will lead you in the right direction. MySQL doesn't have a function to do what you want, but you could write one (or more likely find one that's been written by someone else). – Spudley Oct 22 '10 at 14:29
2

The code frustratedwithforms posted does not work correctly...it correctly capitalizes the first and last words in the string and deletes anything in between (if there are > 2). Someone posted a corrected version at the URL he posted...

(See Matt Cavanaugh's post on May 15 2009 3:52pm at: http://dev.mysql.com/doc/refman/5.1/en/string-functions.html)

1

It won't be pretty, but you can used SUBSTRING_INDEX to find the first space and dash, and conditionally capitalise whatever succeeds it.

Marcelo Cantos
  • 181,030
  • 38
  • 327
  • 365
1

You could probably use an User Defined Function, that would be much easier to reuse.

David Mårtensson
  • 7,550
  • 4
  • 31
  • 47
  • 1
    Since MySQL doesn't support REGEXP_REPLACE, see this question http://stackoverflow.com/questions/986826/how-to-do-a-regular-expression-replace-in-mysql for a very similar answer, with a useful-looking link to get you started. – Spudley Oct 22 '10 at 14:17
1

It looks like MySQL doesn't have an INITCAP function, but I found code for one here:

http://dev.mysql.com/doc/refman/5.1/en/string-functions.html

And the code too:

DELIMITER $$
DROP FUNCTION IF EXISTS `initcap` $$
CREATE FUNCTION `initcap`(x char(30)) RETURNS char(30) CHARSET utf8
BEGIN
SET @str='';
SET @l_str='';
WHILE x REGEXP ' ' DO
SELECT SUBSTRING_INDEX(x, ' ', 1) INTO @l_str;
SELECT SUBSTRING(x, LOCATE(' ', x)+1) INTO x;
SELECT CONCAT(@str, ' ', CONCAT(UPPER(SUBSTRING(@l_str,1,1)),LOWER(SUBSTRING(@l_str,2)))) INTO @str;
END WHILE;
RETURN LTRIM(CONCAT(@str, ' ', CONCAT(UPPER(SUBSTRING(x,1,1)),LOWER(SUBSTRING(x,2)))));
END $$
DELIMITER ;

Disclaimer: I didn't write this code, I haven't even tested it...

FrustratedWithFormsDesigner
  • 26,726
  • 31
  • 139
  • 202