20

In JavaScript, .charCodeAt() returns the Unicode value of the character at a given index in a string. If I only had one character, I could use the code below to get its Unicode value in Java.

public int charCodeAt(char c) {
     // A char widens to int implicitly, so no temporary variable or cast is needed.
     return c;
}

If I had a string in Java, how would I get the Unicode value of one individual character within the string, like the .charCodeAt() function does for JavaScript?

syb0rg

4 Answers

23

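Java has an equivalent method: Character.codePointAt(CharSequence seq, int index). It returns the full code point, so it behaves like JavaScript's .codePointAt(); str.charAt(index) is the closer match for .charCodeAt(), which returns a single UTF-16 code unit.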

String str = "Hello World";
int codePointAt0 = Character.codePointAt(str, 0);
jlordo
  • Is there any performance difference compared to using `int value = str.charAt(index);`? – exexzian Dec 31 '12 at 17:53
  • 4
    Yes, it's slower. But it works correctly even for 4-byte characters, which consist of a high and a low surrogate, whereas yours won't. You can always [look at the implementation](http://docjar.com/html/api/java/lang/Character.java.html). – jlordo Dec 31 '12 at 17:55
  • So your function would be better for encryption then, @jlordo? – syb0rg Dec 31 '12 at 17:58
  • You have to define _better_. All I'm saying is that it will return the correct code point for every character, not just for most of them. – jlordo Dec 31 '12 at 17:59
  • @jlordo Yeah, I just read the API docs about it, and your comment helped me. +1 – exexzian Dec 31 '12 at 18:01
  • By better, I meant that any character could be handled. – syb0rg Dec 31 '12 at 18:03
  • 1
    Then, yes. It can handle every character. Read the docs; I've linked them ;) For surrogate pairs, you have to specify the index of the high surrogate, though. – jlordo Dec 31 '12 at 18:05
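To make the surrogate-pair point from the comments concrete, here is a minimal sketch comparing the two calls on a string that contains one character outside the Basic Multilingual Plane. The class name SurrogateDemo and its helper methods are hypothetical and only for illustration; the sample string is an assumption, written with escapes so the source stays ASCII.

```java
public class SurrogateDemo {

    // Direct analogue of JavaScript's charCodeAt: one UTF-16 code unit.
    static int charCodeAt(String s, int index) {
        return s.charAt(index);
    }

    // Full code point; index should point at the high surrogate of a pair.
    static int codePointAt(String s, int index) {
        return Character.codePointAt(s, index);
    }

    public static void main(String[] args) {
        // "A" followed by U+1F600, which UTF-16 stores as two code units.
        String s = "A\uD83D\uDE00";

        System.out.println(charCodeAt(s, 0));   // 65
        System.out.println(charCodeAt(s, 1));   // 55357 (0xD83D, high surrogate only)
        System.out.println(codePointAt(s, 1));  // 128512 (0x1F600, full code point)
        System.out.println(codePointAt(s, 2));  // 56832 (0xDE00, low surrogate on its own)
    }
}
```

For characters inside the BMP the two helpers return the same value, which is why the charAt version is correct in most cases.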
0

Try this:

public int charCodeAt(String string, int index) {
    return (int) string.charAt(index);
}
syb0rg
Aadit M Shah
  • 1
    This will be correct in most cases, but not for characters represented by a high and a low surrogate. – jlordo Dec 31 '12 at 17:58
0

Here is a way to filter the special characters you need. Just check the ASCII table for the code ranges you want.

Hope it helps

public class Main {

    public static void main(String[] args) {
        String str = args[0];
        String[] codePointAt = new String[str.length()];

        // Use isEmpty() to test the contents; != would only compare references.
        if (!str.isEmpty())
        {
            for (int j = 0; j < str.length(); j++)
            {
                int charactercode = Character.codePointAt(str, j);
                // CHECK THE ASCII TABLE FOR THE SPECIAL CHARS YOU NEED
                if ((charactercode > 31 && charactercode < 48) ||
                    (charactercode > 57 && charactercode < 65) ||
                    (charactercode > 90 && charactercode < 97) ||
                    (charactercode > 127))
                {
                    // Special character: store its numeric code
                    codePointAt[j] = "&" + charactercode + ";";
                }
                else
                {
                    // Ordinary character: store it unchanged
                    codePointAt[j] = String.valueOf(str.charAt(j));
                }
            }

            for (int j = 0; j < codePointAt.length; j++)
            {
                System.out.println("CODE " + j + " ->" + codePointAt[j]);
            }
        }
    }
}

OUTPUT

Called with the argument "TRY./&asda":

CODE 0 ->T
CODE 1 ->R
CODE 2 ->Y
CODE 3 ->&46;
CODE 4 ->&47;
CODE 5 ->&38;
CODE 6 ->a
CODE 7 ->s
CODE 8 ->d
CODE 9 ->a
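Note that a loop which advances one char at a time visits both halves of a surrogate pair separately. A minimal sketch of walking a string one code point at a time, assuming the input may contain characters outside the BMP (the class name CodePointWalk and the method codePointsOf are hypothetical, for illustration only):

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointWalk {

    // Collects every code point of s, stepping over both halves of a surrogate pair.
    static List<Integer> codePointsOf(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result.add(cp);
            i += Character.charCount(cp); // 1 for BMP characters, 2 for surrogate pairs
        }
        return result;
    }

    public static void main(String[] args) {
        // "a", U+1F600 as a surrogate pair, then "b"
        System.out.println(codePointsOf("a\uD83D\uDE00b")); // [97, 128512, 98]
    }
}
```

The same range checks as in the answer above could then be applied to each collected code point.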
-2
int unicode = string.charAt(index);
Android Killer
  • @Android Killer Yeah, now it's OK, but as @jlordo pointed out, what about other chars whose value will be > 127? – exexzian Dec 31 '12 at 17:36
  • @jlordo OK, thanks for the correction; I changed it to short. – Android Killer Dec 31 '12 at 17:46
  • Now it will be correct in most cases, but not for characters represented by a high and a low surrogate. – jlordo Dec 31 '12 at 17:58
  • 1
    Why are you assigning the char to a short? Char and short are both 16-bit types, but char is unsigned while short is signed. This means that when casting a char to a short, you won't lose any information, but values above 32,767 will come out as negative numbers, which may not be what you expect. As the VM uses ints internally for short values anyway, and ints can directly represent the full range of unsigned 16-bit values, there is no benefit to casting the char to a short rather than to an int. – Jan B Dec 31 '12 at 18:33
  • @JanB I removed the cast. Anyway, thanks for sharing your knowledge here. I knew it but had forgotten; thanks for reminding me. – Android Killer Dec 31 '12 at 18:36
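The signed versus unsigned point in the comments above can be checked with a short sketch. The class name CharWidthDemo and the helpers asShort and asInt are hypothetical, and the two sample characters are assumptions chosen to sit on either side of Short.MAX_VALUE.

```java
public class CharWidthDemo {

    // Narrowing to short needs an explicit cast and reinterprets the 16 bits as signed.
    static short asShort(char c) {
        return (short) c;
    }

    // Widening to int is implicit and keeps the unsigned value in the range 0..65535.
    static int asInt(char c) {
        return c;
    }

    public static void main(String[] args) {
        char euro = '\u20AC'; // 8364, fits in a positive short
        char high = '\uD83D'; // 55357, above Short.MAX_VALUE (32767)

        System.out.println(asShort(euro)); // 8364
        System.out.println(asShort(high)); // -10179
        System.out.println(asInt(high));   // 55357
    }
}
```

Casting the short back to char recovers the original value, so no bits are lost, but any arithmetic or comparison done on the short sees the negative number.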