Is there any way to replace foreign characters with their plain equivalents, for example ã, ä with a, and Ĉ, ć with c? I want to keep only simple letters (a-z and A-Z), without any diacritics.

sarnold
Wojciech Kulik

1 Answer

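You can do this with a regular expression, if regexes are available to you. Note that it deletes every character outside a-z and A-Z (accented letters, digits, spaces, punctuation) rather than converting accented letters to their base form:

str = str.replaceAll("[^a-zA-Z]", ""); // assuming str is a Java String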

If you want to normalize your text instead, so that ã becomes a rather than disappearing, do what the accepted answer to this question suggests: "Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars".
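
As a rough Java sketch of that kind of normalization (not necessarily the exact code from the linked answer): decompose the string with java.text.Normalizer into base letters plus combining marks, then delete the marks. The class and method names below are made up for illustration.

```java
import java.text.Normalizer;

final class StripDiacritics {

    // Hypothetical helper: returns the input with combining marks removed.
    static String stripDiacritics(String input) {
        // NFD splits a precomposed letter into its base letter followed by
        // one or more combining marks (Unicode general category M).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Drop the combining marks and keep the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // a-tilde, a-diaeresis, C-circumflex, c-acute, written as Unicode
        // escapes so the source file's encoding does not matter.
        String sample = "\u00e3\u00e4\u0108\u0107";
        System.out.println(stripDiacritics(sample)); // prints aaCc
    }
}
```

Letters that have no canonical decomposition, such as ł or ø, pass through unchanged, so apply the a-zA-Z filter afterwards if the result must contain only plain ASCII letters.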

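If you need to achieve the same thing in PHP, you can write the following (the exact output of //TRANSLIT depends on the iconv implementation and the current locale):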

echo iconv('UTF-8', 'US-ASCII//TRANSLIT', 'asdaśćż,ąółwe,ÄĄ;ú');
Community
Milad Naseri