Integer value of character to store in array

Question

I was wondering which of these is better, val or val2, for mapping a character to an integer value?

for(Character c : s.toCharArray()){
    int val = c -'a';
    int val2 = Character.getNumericValue(c) - Character.getNumericValue('a');
    System.out.println(val + " " + val2);
}

It's a completely opinion-based question, but I'd definitely go with the first way. — shmosel, Jan 04 '17 at 02:22
They do different things. So it's not opinion-based. It's a question of which you need (like asking whether a motorbike is better than an eggbeater - do you want to travel somewhere, or beat eggs?) — Dawood ibn Kareem, Jan 04 '17 at 02:40

score 2 · Answer 1 · edited May 23 '17 at 12:26

I believe you need to know the difference between ASCII and Unicode first.

ASCII defines 128 characters, which map to the numbers 0–127. Unicode defines (less than) 2²¹ characters, which, similarly, map to numbers 0–2²¹ (though not all numbers are currently assigned, and some are reserved). So, in short, Unicode is a superset of ASCII.

Reference: What's the difference between ASCII and Unicode?

Example

A character's ASCII value and the value returned by Character.getNumericValue() are not the same. For example:

System.out.println((int)'A'); // prints 65, ASCII value
System.out.println(Character.getNumericValue('A')); // prints 10: getNumericValue treats the letters A-Z as the values 10-35

Now, if we look into your example, the difference will be clear.

String s = "Wasi";
for (Character c : s.toCharArray()) {
    int val = c - 'a';
    int val2 = Character.getNumericValue(c) - Character.getNumericValue('a');
    System.out.println(val + " " + val2);
}

Output

-10 22
0 0
18 18
8 8

So, before judging which one is better, you should decide which one you actually need.

One more important thing to note: Character.getNumericValue() ignores the case (lower or upper) of a letter.

For example, Character.getNumericValue('A') and Character.getNumericValue('a') both return 10.
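
As a rough illustration of the case-insensitivity described above, the sketch below walks over a to z and prints the numeric value of each letter in both cases. The class name NumericValueCaseDemo and the helper numericIndex are made up for this example; numericIndex is simply the val2 expression from the question.

```java
class NumericValueCaseDemo {
    // Hypothetical helper: the val2 expression from the question.
    static int numericIndex(char c) {
        return Character.getNumericValue(c) - Character.getNumericValue('a');
    }

    public static void main(String[] args) {
        for (char lower = 'a'; lower <= 'z'; lower++) {
            char upper = Character.toUpperCase(lower);
            // Both cases share one numeric value, running from 10 ('a'/'A') to 35 ('z'/'Z').
            System.out.println(lower + "/" + upper + " -> "
                    + Character.getNumericValue(lower) + " "
                    + Character.getNumericValue(upper) + " "
                    + numericIndex(upper));
        }
    }
}
```

The last column is the 0 to 25 alphabet position, and it comes out the same whether the upper-case or the lower-case letter is passed in.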

"Unicode defines (less than) **221** characters" ? Say what? It defines thousands and thousands of characters. — Erwin Bolwidt, Jan 04 '17 at 02:36
I am sorry, I have edited my answer to reflect the correct information. Please have a look at the referenced SO answer for a clearer idea. — Wasi Ahmad, Jan 04 '17 at 02:40
I feel that ASCII vs Unicode has nothing to do with this. It's more about how letters are converted into numbers by the `getNumericValue` method. — Dawood ibn Kareem, Jan 04 '17 at 03:41

score 2 · Answer 2 · answered Jan 04 '17 at 03:39

The important differences are

case sensitivity,
behaviour if c is not a letter or number.

So val = c - 'a' is case sensitive, and it will also give reasonable results if c is not a letter. On the other hand, val2 = Character.getNumericValue(c) - Character.getNumericValue('a') only gives sensible results for a narrow range of values of c, but it's case insensitive.

For example,

Character.getNumericValue('B') - Character.getNumericValue('a') is 1, because upper and lower case make no difference.
'B' - 'a' is -31, because 'B' has the code 66 and 'a' has the code 97.
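
To make the second difference (behaviour for non-letters) concrete, here is a small sketch that runs both expressions over a letter in each case, a digit and a punctuation character. The class and method names are invented for the example; the two methods are just val and val2 from the question.

```java
class NonLetterComparison {
    // val from the question: plain difference of char codes.
    static int byCodePoint(char c) {
        return c - 'a';
    }

    // val2 from the question: difference of numeric values.
    static int byNumericValue(char c) {
        return Character.getNumericValue(c) - Character.getNumericValue('a');
    }

    public static void main(String[] args) {
        for (char c : new char[] {'b', 'B', '5', '#'}) {
            System.out.println(c + ": " + byCodePoint(c) + " " + byNumericValue(c));
        }
        // prints:
        // b: 1 1
        // B: -31 1
        // 5: -44 -5
        // #: -62 -11
    }
}
```

Character.getNumericValue returns -1 for a character with no numeric value, which is why '#' ends up at -11 in the second column. A digit such as '5' keeps its own value (5), so it lands at -5 rather than anywhere in the 0 to 25 range.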

If you want the best of both worlds - applicability to a wide range of inputs, but also case insensitivity, you could write

val3 = Character.toLowerCase(c) - 'a';
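
Since the question title mentions storing the value in an array, here is one way the val3 expression might be used as an array index. This is only a sketch: the class name LetterCounts, the method countLetters and the 26-slot array are assumptions for illustration, and the range check is added so that spaces, digits and punctuation are skipped instead of causing an ArrayIndexOutOfBoundsException.

```java
class LetterCounts {
    // Hypothetical helper: counts how often each letter a-z occurs, ignoring case.
    static int[] countLetters(String s) {
        int[] counts = new int[26];
        for (char c : s.toCharArray()) {
            int index = Character.toLowerCase(c) - 'a'; // val3 from the answer
            if (index >= 0 && index < 26) {             // skip anything outside a-z
                counts[index]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLetters("Wasi Ahmad");
        System.out.println("a: " + counts[0]);         // prints a: 3
        System.out.println("w: " + counts['w' - 'a']); // prints w: 1
    }
}
```

Note that Character.toLowerCase works on all of Unicode, so a few non-ASCII characters whose lower-case form is an ASCII letter would also be counted; for plain English text this makes no difference.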

score 0 · Answer 3 · Maruf Hossain · edited Jan 24 '17 at 13:50

I think if you only want the position of the letter in the alphabet, like

a - 0
b - 1
.....
.....
z - 25

then use:

 int val2 = Character.getNumericValue(c) - Character.getNumericValue('a');

because it ignores upper and lower case.

On the other hand, the two are totally different things: the ASCII value of 'A' is 65 and that of 'a' is 97, but the numeric value of both is 10.

answered Jan 24 '17 at 13:42
