I am seeing the following unexpected behavior in PHP's preg_replace():
Running
echo("'".preg_replace('#\h\h+#', ' ', 'à')."'\n");
echo("'".preg_replace('#\h\h+#', ' ', 'à ')."'\n");
echo("'".preg_replace('#\h\h+#', ' ', 'á')."'\n");
echo("'".preg_replace('#\h\h+#', ' ', 'á ')."'\n");
echo("'".preg_replace('#\h+#', ' ', 'à ')."'\n");
echo("'".preg_replace('#\h#', ' ', 'à ')."'\n");
yields the output
'à'
'Ã '
'á'
'á '
'Ã '
'Ã '
Why is the 'à' in some cases replaced by 'Ã '?
Shouldn't my regular expression replace only certain occurrences of horizontal whitespace characters with a single space and leave all other characters alone?