Possible Duplicate:
Java: length of string when using unicode overline to display square roots?

How do I get the number of Unicode characters in a String?

Given a char[] of Thai characters:

[อ, ภ, ิ, ช, า, ต, ิ]

This comes out in String as: อภิชาติ

String.length() returns 7. I understand that there are (technically) 7 characters, but I need a method that returns 5, which is the number of character positions actually displayed on screen.
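
To make the mismatch concrete, here is a minimal sketch (the class name LengthDemo is only illustrative, and the word is written with Unicode escapes so the source file encoding does not matter). It shows that counting code points instead of chars does not help here, since every character in this word is a single code point:

```java
final class LengthDemo {
    // The Thai word from the question, spelled out as Unicode escapes
    static final String WORD = "\u0E2D\u0E20\u0E34\u0E0A\u0E32\u0E15\u0E34";

    public static void main(String[] args) {
        // Number of UTF-16 code units
        System.out.println(WORD.length());                          // prints 7
        // Number of Unicode code points
        System.out.println(WORD.codePointCount(0, WORD.length()));  // prints 7
    }
}
```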

datacrush
  • You could have a look at http://stackoverflow.com/questions/7704426/java-length-of-string-when-using-unicode-overline-to-display-square-roots – MadProgrammer Oct 05 '12 at 04:53
  • This may be of some help to you: count chars in a unicode string, http://stackoverflow.com/questions/7298059/how-to-count-characters-in-a-unicode-string-in-c – Mukul Goel Oct 05 '12 at 04:58
  • @Mukul, the link you offered is explicitly in C, not Java (to which this question refers). – jrd1 Oct 05 '12 at 05:03

3 Answers

It seems you just want to avoid counting the Unicode combining marks as separate characters:

static boolean isMark(char ch)
{
    int type = Character.getType(ch);
    return type == Character.NON_SPACING_MARK ||
           type == Character.ENCLOSING_MARK ||
           type == Character.COMBINING_SPACING_MARK;
}

which can be used as:

String olle = "อภิชาติ";
int count = 0;

for(int i=0; i<olle.length(); i++)
{
    if(!isMark(olle.charAt(i)))
        count++;
}

System.out.println(count);

and prints 5.
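
The loop above inspects one char at a time, so a character outside the Basic Multilingual Plane (stored as a surrogate pair) would be counted twice. As a hedged variant, assuming Java 8 or later, the same check can be applied per code point instead; the names MarkCounter and countNonMarks are made up for this sketch:

```java
final class MarkCounter {
    // True for code points in one of the three Unicode "mark" categories
    static boolean isMark(int codePoint) {
        int type = Character.getType(codePoint);
        return type == Character.NON_SPACING_MARK
            || type == Character.ENCLOSING_MARK
            || type == Character.COMBINING_SPACING_MARK;
    }

    // Counts the code points of s that are not marks
    static long countNonMarks(String s) {
        return s.codePoints().filter(cp -> !isMark(cp)).count();
    }
}
```

For the Thai word in the question this gives the same answer as the char-based loop. It is still only an approximation of what is shown on screen, because it does not account for other kinds of clusters such as joined emoji sequences.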

Joachim Isaksson

You can adapt the solution posted in this question:

Unicode to string conversion in Java

Strip the '#' character and count the remaining characters in the string.

jrd1

You can use a java.text.BreakIterator to find the boundaries between graphemes ("visual characters") and count them. Here's an example:

import java.text.BreakIterator;

..

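// Counts the graphemes (user-perceived characters) in str
static int graphemeLength(String str) {
    BreakIterator iter = BreakIterator.getCharacterInstance();
    iter.setText(str);

    // Each call to next() advances to the following grapheme boundary
    int count = 0;
    while (iter.next() != BreakIterator.DONE) {
        count++;
    }

    return count;
}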

Now graphemeLength("อภิชาติ") will return 5.
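
If it helps to see which chars end up grouped together, the same iterator can also return the graphemes themselves. This is a rough sketch rather than part of the method above; the GraphemeSplitter class and its graphemes method are illustrative names:

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

final class GraphemeSplitter {
    // Splits str into the substrings between consecutive character boundaries
    static List<String> graphemes(String str) {
        BreakIterator iter = BreakIterator.getCharacterInstance();
        iter.setText(str);

        List<String> result = new ArrayList<>();
        int start = iter.first();
        for (int end = iter.next(); end != BreakIterator.DONE; start = end, end = iter.next()) {
            result.add(str.substring(start, end));
        }
        return result;
    }
}
```

Calling graphemes on the Thai word from the question should give a list of five strings, where the second and fifth each contain a consonant followed by its vowel mark.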

Joni