I'm trying to write a method that detects strings written in right-to-left languages in Java. I came across this question, which does something similar in C#.
Now I need to have something like that but written in Java.
Any help is appreciated.
Viewed 6,162 times
11
4 Answers
13
I came up with the following code:
char[] chars = s.toCharArray();
for (char c : chars) {
    if (c >= 0x600 && c <= 0x6ff) {
        // Text contains RTL character
        break;
    }
}
It's not very efficient, or for that matter accurate (the range 0x600 to 0x6ff only covers the Arabic block), but it can give you some ideas.

2hamed
You should use (c >= 0x5D0 && c <= 0x6ff) to include Hebrew, which is also an RTL language. – Ron Tesler Feb 25 '14 at 08:51
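The range check with the wider bounds from the comment above could be wrapped in a method along these lines. This is only a minimal sketch: the class name `RtlRangeCheck` and the method name `containsRtlChar` are made up for illustration, and the range still misses other RTL scripts and the Arabic presentation forms, so treat it as a rough heuristic.

```java
final class RtlRangeCheck {

    // Hypothetical helper: returns true if any char lies in U+05D0..U+06FF,
    // i.e. from the first Hebrew letter through the end of the Arabic block.
    static boolean containsRtlChar(String s) {
        if (s == null) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x5D0 && c <= 0x6FF) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "\u05D1\u05D5\u05E7\u05E8" is a Hebrew word written with escapes
        System.out.println(containsRtlChar("\u05D1\u05D5\u05E7\u05E8")); // true
        System.out.println(containsRtlChar("good morning"));             // false
    }
}
```

Unlike the loop above, this returns as soon as a match is found and does not copy the string into a char array first.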
13
Question is old but maybe someone else might have the same problem...
After trying several solutions I found the one that works for me:
byte d = Character.getDirectionality(string.charAt(0));
if (d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
        || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
        || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
        || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE) {
    // it is a RTL string
}

Dark
@Liggliluff there is no mark, the detection is used directly on chars, `Character.getDirectionality(char)` – cdalxndr Jul 29 '21 at 17:06
9
Here's an improved version of Darko's answer:
public static boolean isRtl(String string) {
    if (string == null) {
        return false;
    }
    for (int i = 0, n = string.length(); i < n; ++i) {
        byte d = Character.getDirectionality(string.charAt(i));
        // The first strongly directional character decides the result;
        // neutral characters such as spaces or brackets are skipped.
        switch (d) {
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING:
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE:
                return true;
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING:
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE:
                return false;
        }
    }
    return false;
}
This code works for me for all of the following cases:
בוקר טוב => true
good morning בוקר טוב => false
בוקר טוב good morning => true
good בוקר טוב morning => false
בוקר good morning טוב => true
(בוקר טוב) => true
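For comparison, the JDK's own `java.text.Bidi` class can make a similar "first strong character" decision. The following is only a sketch of that alternative, not the code from the answer above: the class name `BidiCheck` is made up, and the results may differ from the hand-written loop for strings that start with explicit embedding or override characters.

```java
import java.text.Bidi;

final class BidiCheck {

    // Hypothetical helper: asks java.text.Bidi for the base direction.
    // With DIRECTION_DEFAULT_LEFT_TO_RIGHT the base direction comes from the
    // first strong directional character, falling back to left-to-right
    // when the text has none.
    static boolean isRtl(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        Bidi bidi = new Bidi(s, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }

    public static void main(String[] args) {
        String hebrew = "\u05D1\u05D5\u05E7\u05E8 \u05D8\u05D5\u05D1"; // the Hebrew phrase from the cases above
        System.out.println(isRtl(hebrew));                   // true
        System.out.println(isRtl("good morning " + hebrew)); // false
        System.out.println(isRtl("(" + hebrew + ")"));       // true
    }
}
```

This keeps the directionality tables inside the JDK, at the cost of building a `Bidi` object for every string that is checked.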

Oleksii K.
0
Maybe this will help:
http://en.wikipedia.org/wiki/Right-to-left_mark
There may be a Unicode character, namely U+200F (the right-to-left mark), in the text when an RTL string is present, though it is not guaranteed to be there.
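Checking for that mark could look like the following. This is a minimal sketch with a made-up class name `RlmCheck` and method name `containsRlm`. It only finds text that actually contains U+200F; ordinary Hebrew or Arabic text without the mark is not detected this way.

```java
final class RlmCheck {

    // U+200F RIGHT-TO-LEFT MARK, an invisible formatting character.
    static final char RLM = '\u200F';

    // Hypothetical helper: true only if the mark itself occurs in the text.
    static boolean containsRlm(String s) {
        return s != null && s.indexOf(RLM) >= 0;
    }

    public static void main(String[] args) {
        String hebrew = "\u05D1\u05D5\u05E7\u05E8"; // Hebrew word without the mark
        System.out.println(containsRlm(hebrew));       // false
        System.out.println(containsRlm(hebrew + RLM)); // true
    }
}
```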
Regards

hellow