I have a string of purposefully bad data. I am trying to strip out anything that a user with a standard English keyboard would not normally type, while keeping a few Spanish letters too, for kicks:
$string = "ó�Ⲃⲟⲟⲉⲁⲛ ⲁⲛⲇ ⲒⲛϯⲉⲉꞅHôpitüD�sseldor ";
$re = '/[^\A-Za-z0-9@\.\' ;<>,-_\|!@#+=\[\]{}$%^&:*()"ñáéíóú]/mu';
$string = preg_replace($re, '', $string);
According to regex101, I should be getting this back as my result:
ó HpitDsseldor
but instead I get back this:
ó???????? ??? ????????HpitD?sseldor
What is causing all of these ? characters to remain in the cleaned output?
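One thing that may be worth checking, sketched below under a few assumptions: if the ? characters are literal question marks already present in the string before the regex runs (for example because the data was converted somewhere upstream), then the `,-_` part of the character class would keep them, since inside a class it is read as a range from `,` (0x2C) to `_` (0x5F), and ? (0x3F) falls inside that range. The helper `strip_disallowed` and the sample input are hypothetical, the class is written with a plain `A-Z` and without the duplicate `@`, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): removes every
// character that the given pattern matches.
function strip_disallowed(string $pattern, string $input): string
{
    $result = preg_replace($pattern, '', $input);
    // preg_replace() returns null on failure, e.g. invalid UTF-8 under /u.
    if ($result === null) {
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }
    return $result;
}

// As in the question: `,-_` is a range from ',' to '_' and so includes '?'.
$rangePattern = '/[^A-Za-z0-9@.\' ;<>,-_|!#+=\[\]{}$%^&:*()"ñáéíóú]/u';

// Hyphen escaped: ',', '-' and '_' are now three separate literals.
$literalPattern = '/[^A-Za-z0-9@.\' ;<>,\-_|!#+=\[\]{}$%^&:*()"ñáéíóú]/u';

// Assumed sample: literal '?' standing in for characters lost upstream.
$sample = "ó?? Hpit?D";

echo strip_disallowed($rangePattern, $sample), "\n";   // question marks survive
echo strip_disallowed($literalPattern, $sample), "\n"; // question marks removed
```

If the second call removes the question marks, the remaining question is where the original characters were turned into ? in the first place; dumping the raw input with `bin2hex()` before the replace should show whether it contains 3f bytes.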