I need to filter out certain forbidden strings such as "Password", but I found that someone bypassed my check. They submitted a string that looks exactly like "Password" but does not compare equal to it.
I checked the character codes and, for example, the "a" shows up as 8e61 (hex), while a normal "a" is 61 (hex).
My PHP files' encoding, the HTML meta Content-Type, and the MySQL encoding are all UTF-8.
How does this happen? Why are there visually identical characters with different codes? I want to know how I can filter these characters. I put the weird string here; please copy it for research: Password
For some reason, when I pasted the problematic "Password" here, it came out as the plain ASCII version.
Running the PHP function bin2hex() on the problematic "Password" gives:
50c28e61c28e73c28e73c28e776fc28e72c28e64c28e
while the normal one gives:
50617373776f7264
To put it more simply, the hexadecimal representation of the problematic "a" is:
c28e61
while the normal one is:
61
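As a minimal sketch for reproducing this (not a definitive fix), the two hex dumps above can be turned back into strings with hex2bin() and inspected code point by code point. Decoding the bytes suggests that c2 8e is the UTF-8 encoding of U+008E, a C1 control character that is normally rendered as nothing, sitting between the letters rather than being part of the "a". The helper name strip_invisible() is hypothetical, and removing everything in the Unicode category \p{C} is an assumption about what should count as unwanted; it also removes tabs and newlines. The sketch assumes PHP 7.4+ with the mbstring extension.

```php
<?php
// Rebuild both strings from the hex dumps in the question.
$suspect = hex2bin('50c28e61c28e73c28e73c28e776fc28e72c28e64c28e');
$normal  = hex2bin('50617373776f7264');

// List every code point in the suspect string.
$codePoints = [];
foreach (mb_str_split($suspect, 1, 'UTF-8') as $char) {
    $codePoints[] = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
}
echo implode(' ', $codePoints), PHP_EOL;
// U+0050 U+008E U+0061 U+008E U+0073 U+008E U+0073 U+008E
// U+0077 U+006F U+008E U+0072 U+008E U+0064 U+008E

// Hypothetical helper (not from the original post): remove all characters
// in the Unicode "Other" category (controls, format characters, unassigned).
// Assumption: none of these are wanted in the input; this includes \t and \n.
function strip_invisible(string $s): string
{
    // preg_replace() returns null on invalid UTF-8; fall back to the input.
    return preg_replace('/\p{C}+/u', '', $s) ?? $s;
}

var_dump($suspect === $normal);                  // bool(false)
var_dump(strip_invisible($suspect) === $normal); // bool(true)
```

If something like this is used, the stripping would presumably have to run before the forbidden-word comparison, so that both sides are compared in the same cleaned form.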