Hey, I want to sanitize a string so that it only contains letters (a-z, A-Z, and letters from other languages, not only English), digits, and commas. I tried ReplaceAll with the pattern [^a-z 0-9,]
but it deletes letters from other languages. Can someone show me how to strip only the special characters without also deleting emojis?

- `ReplaceAll()` isn't even a PHP function... – Naruto Feb 07 '17 at 15:29
- Use: `str.replaceAll("[^\\p{L}]+", "");` – anubhava Feb 07 '17 at 15:33
- Can you clarify what `but it is deleting other languages` means? – Salem Feb 07 '17 at 15:36
3 Answers
You could get each character's ASCII code and check whether it falls in the a-z or 0-9 range; if it does not, do what you wish with it.
EDIT: the idea is that the characters a-z, and likewise 0-9, have consecutive codes. So just write a simple function that returns a boolean
indicating whether the current character is one of these, and replace the character if it is not.
With this approach, though, you have to process the string one character at a time.
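The character-by-character check described above could be sketched as follows. The class and method names (`AsciiFilter`, `isAllowed`, `keepAllowed`) are made up for illustration. Note that a pure ASCII range check like this also drops letters from other languages, which is the very problem the question describes.

```java
public class AsciiFilter {
    // True only for ASCII lowercase letters and digits, whose codes are consecutive.
    static boolean isAllowed(char c) {
        return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
    }

    // Walks the string one character at a time and keeps only allowed characters.
    static String keepAllowed(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isAllowed(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepAllowed("abc-123!XYZ")); // prints abc123
    }
}
```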
I've tested this regular expression and AFAIK it works...
String result = yourString.replaceAll("[^a-zA-Z0-9]", "");
It replaces any character that isn't in the set a-z, A-Z, or 0-9 with nothing.
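As a quick check of what this pattern does to non-English text, here is a small sketch (the class name `AsciiRegexDemo` and the sample strings are illustrative only). Because the character class lists only ASCII ranges, accented letters and spaces are removed along with the punctuation.

```java
public class AsciiRegexDemo {
    // Removes every character outside a-z, A-Z and 0-9.
    static String stripNonAscii(String input) {
        return input.replaceAll("[^a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute and \u00f6 is o-umlaut; both are dropped by the ASCII-only class.
        System.out.println(stripNonAscii("h\u00e9llo w\u00f6rld 42!")); // prints hllowrld42
    }
}
```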

In Java you can do:
yourString.replaceAll("[^\\p{L}\\p{Nd}]+", "");
The regular expression [^\p{L}\p{Nd}]+
matches every run of characters that are neither Unicode letters nor decimal digits.
If you need only letters (no digits), you can use the regular expression [^\p{L}]+
as follows:
yourString.replaceAll("[^\\p{L}]+", "");
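Since the question also asks to keep commas and emojis, one possible extension is to add them to the allowed set. The sketch below rests on an assumption: it uses the Unicode category `\p{So}` ("Symbol, other") as a stand-in for emoji. That category covers many common emoji but also keeps non-emoji symbols such as ©, and it does not cover the joiners and variation selectors used in composite emoji sequences. The class name `UnicodeSanitizer` is hypothetical.

```java
import java.util.regex.Pattern;

public class UnicodeSanitizer {
    // Allowed: Unicode letters, decimal digits, "other symbols" (assumed emoji proxy),
    // commas and spaces. Everything else is removed.
    private static final Pattern DISALLOWED =
            Pattern.compile("[^\\p{L}\\p{Nd}\\p{So}, ]+");

    static String sanitize(String input) {
        return DISALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Accented Latin letter, two CJK characters, punctuation, and U+1F600 (grinning face).
        String input = "H\u00e9llo, \u4e16\u754c! #1 \uD83D\uDE00";
        System.out.println(sanitize(input));
    }
}
```

With this pattern, the `!` and `#` in the sample input are dropped, while the letters, the digit, the comma, the spaces and the emoji are kept.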
