I'm trying to create a function which removes all non-English characters (except spaces, dots and hyphens) from a string. For this I tried using preg_replace, but the function produces strange results.
I have a file called "example-נידדל.jpg"
Here is what I'm getting when trying to sanitize the file name:
echo preg_replace('/[^A-Za-z0-9\.]/','','example-נידדל.jpg');
The above produces: example.jpg as expected.
But when I take the file name from the $_FILES array after uploading the file to the server, I get:
echo preg_replace('/[^A-Za-z0-9\.]/','',$_FILES['file_upload']["name"]);
The above produces example-15041497149114911500.jpg
The numbers I'm getting are in fact the HTML numbers (the decimal codes used in character references such as &#1504;) of the characters which were supposed to be removed; see the following character reference: http://realdev1.realise.com/rossa/phoneme/listCharactors.asp?start=1488&stop=1785&rows=297&page=1
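To illustrate what I suspect is happening, here is a minimal sketch. It assumes (I have not confirmed this) that the uploaded name reaches PHP with the Hebrew letters already encoded as numeric character references, so the pattern only strips the "&", "#" and ";" around each number. The helper name sanitize_name and the hard-coded $raw string are hypothetical, for illustration only.

```php
<?php
// Hypothetical helper, not part of any library.
function sanitize_name(string $name): string
{
    // Assumption: the name may contain numeric character references
    // such as "&#1504;". Decode them into real UTF-8 characters first.
    $decoded = html_entity_decode($name, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Same pattern as in my calls above: keep only ASCII letters,
    // digits and dots. Without the /u modifier the multi-byte UTF-8
    // characters are removed byte by byte.
    return preg_replace('/[^A-Za-z0-9\.]/', '', $decoded);
}

// Assumed raw value of the uploaded name (hard-coded for the sketch).
$raw = 'example-&#1504;&#1497;&#1491;&#1491;&#1500;.jpg';

// Without decoding, the digits of every reference survive the pattern.
echo preg_replace('/[^A-Za-z0-9\.]/', '', $raw), "\n";

// With decoding first, the Hebrew letters are removed as intended.
echo sanitize_name($raw), "\n";
```

If the assumption holds, the first echo keeps the digits 1504, 1497, 1491, 1491 and 1500, while the second one returns the name with the Hebrew letters removed.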
I can't figure out why preg_replace doesn't work with uploaded file names.
Can anyone help?
Thanks,
Roy