2

Does anyone have a PHP solution to this?

The goal is to have a function that takes these

HELLO WORLD
hello world
Hello IBM

and returns these

Hello World
Hello World
Hello IBM

respectively.
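
For reference, here is a minimal sketch of the obvious approach, assuming plain ASCII input. It combines PHP's built-in `strtolower` and `ucwords`; the wrapper name `naiveTitleCase` is hypothetical. It handles the first two inputs but flattens IBM to Ibm, which is the part that needs a better answer.

```php
<?php
// Hypothetical wrapper: lower-case everything, then capitalize each word.
function naiveTitleCase(string $text): string
{
    return ucwords(strtolower($text));
}

echo naiveTitleCase('HELLO WORLD'), PHP_EOL; // Hello World
echo naiveTitleCase('hello world'), PHP_EOL; // Hello World
echo naiveTitleCase('Hello IBM'), PHP_EOL;   // Hello Ibm (the acronym is lost)
```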

Average Joe
  • There is no silver bullet; there will always be exceptions in such circumstances. So what are you doing in the first place? –  May 28 '12 at 00:27
  • See also: [Capitalization of person names in programming](http://stackoverflow.com/questions/2466706/capitalization-of-person-names-in-programming/). – Jonathan Leffler May 28 '12 at 00:42

2 Answers

3

Mr MacDonald from Scotland prefers his name capitalized that way, while Mr Macdonald from Ireland prefers it thus. It is kinda hard to know which is 'correct' without knowing in advance which gentleman you are referring to, which takes more context than just the words in the file.

Also, the BBC (or is that the Bbc?) has taken to spelling some names like Nasa and Nato. It jars on me; I dislike it intensely. But that's what they do these days. When does an acronym (or 'initialism' as some prefer to call it) become a word in its own right?

Jonathan Leffler
  • I hear you, but there is nothing you and I can do about it. When we are up against the challenge of moving data from one place to another, and the goal is to clean up that data as much as possible, we may assume that if a word was originally written with some upper-case letters in it, there is probably a good reason for it, so we don't touch it; we simply skip over it. In this case NASA, McDonald and Mcdonald will all make it to the other end as is. The question is: what's the regex for that? – Average Joe May 28 '12 at 01:03
2

Though this is a bit of a hack, you could store a list of acronyms that you want to keep uppercase and then compare the words in the string against that list of $exceptions. Jonathan is correct: if it's names you're working with and not acronyms, then this solution is useless. But if Mr MacDonald from Scotland is already in the correct case, then it won't change.

See it in action

<?php
$exceptions = array("to", "a", "the", "of", "by", "and","on","those","with",
                    "NASA","FBI","BBC","IBM","TV");

$string = "While McBeth and Mr MacDonald from Scotland
was using her IBM computer to watch a ripped tv show from the BBC,
she was being watched by the FBI, Those little rascals were
using a NASA satellite to spy on her.";

echo titleCase($string, $exceptions), PHP_EOL;
/*
While McBeth and Mr MacDonald from Scotland
was using her IBM computer to watch a ripped TV show from the BBC,
she was being watched by the FBI, Those little rascals were
using a NASA satellite to spy on her.
*/

/*Your case example
  Hello World Hello World Hello IBM, BBC and NASA.
*/
echo titleCase('HELLO WORLD hello world Hello IBM, BBC and NASA.', $exceptions,true);


function titleCase($string, $exceptions = array(), $ucfirst = false) {
    // split on single spaces only; newlines stay attached to their words
    $words = explode(' ', $string);
    $newwords = array();
    foreach ($words as $i => $word) {
        // trim surrounding whitespace or newlines from the word
        $word = trim($word);
        // ignore a trailing comma or period when checking the exceptions list
        if (in_array(strtoupper(trim($word, ',.')), $exceptions)) {
            // listed acronym (e.g. IBM): force it to upper case
            $word = strtoupper($word);
        } elseif ($ucfirst) {
            // title-case the word unless it is a lower-case exception (e.g. "and")
            if (!in_array(trim($word, ',.'), $exceptions)) {
                $word = ucfirst(strtolower($word));
            }
        }
        // always capitalize the first word in the string
        if ($i == 0) {
            $word = ucfirst($word);
        }
        $newwords[] = $word;
    }
    return implode(' ', $newwords);
}
?>
Lawrence Cherone
  • Thanks for the function, Lawrence, but I do not have a list like that. It's a hot list; it's dynamic. How about processing one word at a time and skipping a word if it contains an upper-case letter anywhere from its 2nd character on? What's the regex for that? This approach would turn all the words to title case without modifying words such as IBM, McDonald, WordPress, etc. – Average Joe May 28 '12 at 01:35
  • Well for any comparison you would need some sort of reference. – Lawrence Cherone May 28 '12 at 01:37
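
The rule described in the comments (leave a word alone if it has an upper-case letter anywhere after its first character) could be sketched roughly as below. This is only an illustration, not code from either answer: the function name `titleCaseUnlessMixed` is hypothetical, words are assumed to be whitespace-delimited, and only ASCII letters are considered.

```php
<?php
// Hypothetical helper: capitalize the first letter of each word, but skip
// any word that already has an upper-case letter after its first character.
function titleCaseUnlessMixed(string $text): string
{
    return preg_replace_callback(
        '/\S+/', // each run of non-whitespace counts as one word
        function (array $m): string {
            $word = $m[0];
            // "any character followed by A-Z" means an upper-case letter
            // at position 2 or later, so treat the casing as deliberate
            if (preg_match('/.[A-Z]/', $word) === 1) {
                return $word;
            }
            return ucfirst($word);
        },
        $text
    );
}

echo titleCaseUnlessMixed('hello world Hello IBM'), PHP_EOL;
echo titleCaseUnlessMixed('McDonald uses WordPress at nasa'), PHP_EOL;
echo titleCaseUnlessMixed('HELLO WORLD'), PHP_EOL;
```

Note that under this rule the last call leaves HELLO WORLD untouched, because an all-caps word looks the same as an acronym like IBM. Telling the two apart still takes some kind of reference list, as Lawrence points out.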