I've found that when I typecast a character pointer to int in C++17, I get some sort of mapping instead of the actual number I would expect. Example below:

#include <iostream>
int main () {
    char c;
    c = '1';
    std::cout << int(c) << std::endl;
}

When I build and run this with

g++ file.cpp -o output
./output

I get

49

'0' maps to 48 and '2' maps to 50 and so on. Why? How do I avoid this?

2 Answers

What you are getting is the character code of each character. A char is stored in memory as a small integer, and on most systems that integer is the character's ASCII code.

To convert a digit character to the number it represents, subtract '0' from it:

int value = c - '0';

This takes the integer value of c (49 for '1', for example) and subtracts the integer value of '0' (48), which gives 1.

Below is the full table for decimal digits and their corresponding ASCII values:

0 -> 48
1 -> 49
2 -> 50
3 -> 51
4 -> 52
5 -> 53
6 -> 54
7 -> 55
8 -> 56
9 -> 57

Subtracting '0' from each of them gives the corresponding decimal value:

0 -> 48 - 48 = 0
1 -> 49 - 48 = 1
2 -> 50 - 48 = 2
3 -> 51 - 48 = 3
4 -> 52 - 48 = 4
5 -> 53 - 48 = 5
6 -> 54 - 48 = 6
7 -> 55 - 48 = 7
8 -> 56 - 48 = 8
9 -> 57 - 48 = 9
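
The subtraction above only makes sense when the character really is a digit. A minimal sketch of a guarded conversion follows; the helper names digit_value and digit_char are hypothetical and chosen for illustration, and returning -1 for non-digits is just one possible convention.

```cpp
// Hypothetical helper: returns 0..9 for '0'..'9', or -1 for any other character.
int digit_value(char c) {
    // The digit characters are guaranteed to be contiguous and increasing,
    // so a simple range check followed by a subtraction is enough.
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    return -1;
}

// Hypothetical inverse: turns 0..9 back into '0'..'9'.
// The caller is assumed to pass a value in the range 0..9.
char digit_char(int d) {
    return static_cast<char>('0' + d);
}
```

With these, digit_value('1') yields 1, digit_value('x') yields -1, and digit_char(7) yields '7'.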
Bemwa Malak
  • Note, however, that not all systems use ASCII, although it's by far the most common. In C and C++, though, for all character encodings, the values of `'0' .. '9'` are required to be contiguous and increasing, so `ch - '0'` and `digit + '0'` will always work. – Pete Becker Jan 11 '22 at 13:10
  • Thanks for your information! have a nice day. – Bemwa Malak Jan 11 '22 at 13:12
  • @PeteBecker Is ASCII still by far the most common? I find that highly unlikely: Linux has been using UTF-8 as the default encoding for a _very_ long time, although it’s of course true that the first 128 values have an identical mapping in UTF-8 and ASCII. (And yes, a single `char` cannot represent every UTF-8 code point, but the underlying value would still be encoded using UTF-8, not ASCII.) I wouldn’t go as far as saying ASCII is irrelevant, but it’s certainly a red herring here. – Konrad Rudolph Jan 11 '22 at 13:22
  • @KonradRudolph -- do you really want to explain all that to beginners? "No, the ASCII codes that you're seeing aren't ASCII; they're exactly the same values but they're UTF-8"? At an introductory level that's a distinction without a difference. – Pete Becker Jan 11 '22 at 13:26
  • @PeteBecker No, I *don’t* want to explain this to beginners. I’d rather not mention ASCII at all, since it’s at the same time wrong, irrelevant and grossly misleading. For the purpose of this question it’s entirely sufficient to mention that the values are encoded *somehow*. Why include irrelevant, wrong information? – Konrad Rudolph Jan 11 '22 at 14:19
I've found that when I typecast a character pointer to int in C++17, I get some sort of mapping instead of the actual number

In the provided code there are no pointers. There is an object of the type char.

char c;
c = '1';

This statement

std::cout << int(c) << std::endl;

outputs the internal representation of the character '1' as an integer. In ASCII the internal representation of the character '1' is decimal 49. In EBCDIC the internal representation of the character '1' is 241.

If you want to get the integer number 1 you could write for example

std::cout << c - '0' << std::endl;

Or you could output the character as a character, in which case the console also shows 1:

std::cout << c << std::endl;
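
The three forms discussed above can be compared side by side. This is a minimal sketch; show_char_forms is a hypothetical helper name that does not appear in the question. Note that c - '0' prints as a number because arithmetic on char operands promotes them to int, so the stream receives an int rather than a char.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a character three ways, separated by spaces:
// its internal code, its digit value (meaningful for '0'..'9' only), and the character itself.
std::string show_char_forms(char c) {
    std::ostringstream out;
    out << static_cast<int>(c)  // internal representation, e.g. 49 for '1' in ASCII
        << ' ' << (c - '0')     // an int after promotion, so it prints as a number
        << ' ' << c;            // a char, so it prints as the character
    return out.str();
}
```

On an ASCII-compatible system, show_char_forms('1') yields "49 1 1".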
Vlad from Moscow