
I have written the function below. It converts lower case to upper case and proper case. I want it to ignore foreign characters, e.g. ñ.

Expected result: Sabiña/Cerca

Actual Result: SabiÑA/Cerca

NOTE: if I use mb_convert_case alone, it does not change any character after the / to proper case.

$string = 'SABIÑA CERCA';

echo preg_replace_callback('/\w+/i',
create_function('$m','
var_dump($m);
if(strlen($m[0]) > 3)
{
    return mb_convert_case($m[0], MB_CASE_TITLE, "UTF-8");
}
else
{
    return ucfirst($m[0]);
}')
, $string);
user2635901
    As an aside, using `create_function('$m', '` is a bit old school, since you can now write `function ($m) {...}`. Also, since you are dealing with multibyte strings, you need to use `mb_strlen` to obtain the number of characters (and not the number of bytes, which is what `strlen` returns). Finally, even with the corrected pattern, your code will not work with a word that starts with an accented letter. To solve that problem, see the different suggestions for emulating `mb_ucfirst` in the PHP manual. – Casimir et Hippolyte Feb 26 '16 at 20:33
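The two points in the comment above can be illustrated with a minimal sketch. It assumes the `mbstring` extension is loaded and the source file is saved as UTF-8. The helper name `ucfirst_utf8` is hypothetical (it is not a PHP built-in) and is only one possible way to emulate a multibyte `ucfirst`.

```php
<?php
// In UTF-8, "ñ" is one character but two bytes.
$bytes = strlen('ñ');              // counts bytes
$chars = mb_strlen('ñ', 'UTF-8');  // counts characters

// Hypothetical helper: uppercase only the first character of a UTF-8
// string and leave the rest untouched, like ucfirst() does for ASCII.
function ucfirst_utf8($s)
{
    $first = mb_substr($s, 0, 1, 'UTF-8');
    $rest  = mb_substr($s, 1, null, 'UTF-8');
    return mb_strtoupper($first, 'UTF-8') . $rest;
}

echo $bytes, ' ', $chars, ' ', ucfirst_utf8('ñu'), "\n";
```

The helper only changes the first character, so it does not lowercase the remainder the way `MB_CASE_TITLE` does.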

1 Answer

You just need to use the /u modifier.

'/\w+/u'

See IDEONE demo

Note that the /i case insensitive modifier is redundant because \w matches both lower- and uppercase letters.

See Pattern modifiers:

This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32.
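As a rough sketch of how the `/u` fix might look with an anonymous function instead of `create_function`: the function name `title_case_words` is hypothetical, and the code assumes the `mbstring` extension and UTF-8 input. Without `/u`, `\w` stops at the bytes of `Ñ`, which is consistent with the `SabiÑA` result reported in the question.

```php
<?php
// Hypothetical wrapper around the question's logic, using the /u modifier
// so that \w+ matches whole UTF-8 words such as "SABIÑA".
function title_case_words($string)
{
    return preg_replace_callback('/\w+/u', function ($m) {
        $word = $m[0];
        // mb_strlen counts characters; strlen would count bytes.
        if (mb_strlen($word, 'UTF-8') > 3) {
            return mb_convert_case($word, MB_CASE_TITLE, 'UTF-8');
        }
        // Short words: uppercase only the first character, multibyte-safe.
        return mb_strtoupper(mb_substr($word, 0, 1, 'UTF-8'), 'UTF-8')
             . mb_substr($word, 1, null, 'UTF-8');
    }, $string);
}

echo title_case_words('SABIÑA CERCA'), "\n"; // Sabiña Cerca
```

The short-word branch keeps the original behaviour of `ucfirst` (the rest of the word is left as is) while avoiding its byte-based handling of accented first letters.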

Wiktor Stribiżew