
I want to detect whether a string contains characters that do not belong to the device's language

Is it possible?

Some of my app's users write in Arabic; the rest write in English. I need to translate text only when the text is in Arabic and the user's device is set to English, or the other way around.

124697
  • What do you mean by "the device's language"? – Harald Dec 21 '13 at 20:47
  • It sounds like this is not about charsets, but rather you are interested in whether a String contains characters from a particular script. You can do this with something like `string.matches(".*\\p{IsArabic}.*")`. – VGR Dec 21 '13 at 23:08
  • @VGR what is IsArabic? – 124697 Dec 22 '13 at 12:54
  • It is equivalent to `\\p{script=Arabic}` (which is also a valid regex). It matches a single character whose Unicode script designation is the [Arabic script](http://docs.oracle.com/javase/7/docs/api/java/lang/Character.UnicodeScript.html#ARABIC). The "Unicode Support" section of [the javadoc for java.util.regex.Pattern](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html) has a full description of permitted `\\p{…}` expressions. – VGR Dec 22 '13 at 14:26
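
The `\p{IsArabic}` suggestion from the comments above could be sketched roughly as follows. The class name `ArabicRegexCheck` and the method name `containsArabic` are hypothetical, chosen only for illustration. This assumes a standard Java 7+ runtime; whether Android's regex engine accepts the same script syntax is something to verify on the target platform.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class ArabicRegexCheck {
    // \p{IsArabic} matches one character whose Unicode script is Arabic.
    // Compiling the pattern once avoids recompiling it on every call.
    private static final Pattern ARABIC = Pattern.compile("\\p{IsArabic}");

    // Returns true if at least one character of the text is in the Arabic script.
    static boolean containsArabic(String text) {
        return ARABIC.matcher(text).find();
    }
}
```

Using `find()` on the bare pattern avoids the leading and trailing `.*` that `String.matches` needs, and it also works when the text contains line breaks.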

1 Answer


You can get the device language with:

Locale.getDefault().getDisplayLanguage();

Then check whether any character in the input string falls in the range \u0600 to \u06FF (the Arabic block in Unicode); that should do the trick.
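
A minimal sketch of such a range check, assuming only the basic Arabic block (U+0600 to U+06FF) matters. The class and method names are hypothetical. Other Arabic-related blocks, such as Arabic Supplement (U+0750 to U+077F) or the Arabic Presentation Forms, are deliberately not covered here.

```java
// Hypothetical helper, not part of any library.
final class ArabicRangeCheck {
    private static final int ARABIC_START = 0x0600; // first code point of the Arabic block
    private static final int ARABIC_END = 0x06FF;   // last code point of the Arabic block

    // Returns true if any character lies inside U+0600..U+06FF.
    // The whole block sits in the Basic Multilingual Plane, so iterating
    // over char values (rather than code points) is sufficient.
    static boolean containsArabicBlockChar(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= ARABIC_START && c <= ARABIC_END) {
                return true;
            }
        }
        return false;
    }
}
```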

Here is how to check whether a string contains characters that a specific charset (US-ASCII here) cannot encode:

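import java.nio.charset.Charset;

// Returns true if the text contains characters that US-ASCII cannot represent.
public boolean isEncoded(String text) {
    Charset charset = Charset.forName("US-ASCII");
    // Unencodable characters become '?' in the round trip, so the strings differ.
    String checked = new String(text.getBytes(charset), charset);
    return !checked.equals(text);
}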
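
To tie the two pieces together, one possible sketch compares the script of the text with the device language. It assumes, as in the question, that only Arabic and English are in play. The names `TranslationCheck`, `containsArabic`, and `needsTranslation` are hypothetical. It uses `Locale.getLanguage()`, which returns the language code (`"ar"` for Arabic), instead of the localized display name, and `Character.UnicodeBlock.ARABIC`, which covers the same U+0600 to U+06FF range.

```java
import java.util.Locale;

// Hypothetical helper, not part of any library.
final class TranslationCheck {
    // Returns true if any character belongs to the basic Arabic Unicode block.
    static boolean containsArabic(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (Character.UnicodeBlock.of(text.charAt(i)) == Character.UnicodeBlock.ARABIC) {
                return true;
            }
        }
        return false;
    }

    // Assumption: text without Arabic characters is treated as English.
    // Translation is needed when the text script and device language disagree.
    static boolean needsTranslation(String text, Locale deviceLocale) {
        boolean deviceIsArabic = "ar".equals(deviceLocale.getLanguage());
        return containsArabic(text) != deviceIsArabic;
    }
}
```

On a device this might be called as `TranslationCheck.needsTranslation(message, Locale.getDefault())`. Text made up only of digits or punctuation would count as English under this assumption, so a real app would probably want an extra check for that case.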
albusshin
  • How do I check the range? – 124697 Dec 21 '13 at 20:56
  • I think your answer almost does what I need, except it always returns true even if the phone's current language is Arabic. I need to set the `Charset.forName` argument dynamically to whatever the charset name is for the phone's language. Do you know how that can be obtained? – 124697 Dec 21 '13 at 21:23
  • `Charset ysb = Charset.defaultCharset(); Charset charset = Charset.forName(ysb.displayName());` always returns false – 124697 Dec 21 '13 at 21:30
  • In this case I think there's nothing more to do than loop over the string and check whether any character falls inside the specific Unicode range. – albusshin Dec 21 '13 at 21:32