I have a string of purposefully bad data. I am trying to strip out anything that a user with a standard English keyboard would not normally type, while also allowing a few Spanish letters for kicks:

$string = "ó�Ⲃⲟⲟⲉⲁⲛ ⲁⲛⲇ ⲒⲛϯⲉⲉꞅHôpitüD�sseldor ";
$re = '/[^\A-Za-z0-9@\.\' ;<>,-_\|!@#+=\[\]{}$%^&:*()"ñáéíóú]/mu';
$string = preg_replace($re, '', $string);

According to regex101, I should get this back as my result:

ó  HpitDsseldor

but instead I get back this:

ó???????? ??? ????????HpitD?sseldor 

What is causing all of these ? characters to remain in the cleaned output?

Brian Powell