344

What is the size of a character in C and C++? As far as I know, the size of char is 1 byte in both C and C++.

In C:

#include <stdio.h>
int main()
{
    printf("Size of char : %d\n", sizeof(char));
    return 0;
}

In C++:

#include <iostream>
int main()
{
    std::cout << "Size of char : " << sizeof(char) << "\n";
    return 0;
}

No surprises, both of them give the output: Size of char : 1

Now we know that characters are written as 'a', 'b', 'c', '|', ... So I just modified the above code to this:

In C:

#include <stdio.h>
int main()
{
    char a = 'a';
    printf("Size of char : %d\n", sizeof(a));
    printf("Size of char : %d\n", sizeof('a'));
    return 0;
}

Output:

Size of char : 1
Size of char : 4

In C++:

#include <iostream>
int main()
{
    char a = 'a';
    std::cout << "Size of char : " << sizeof(a) << "\n";
    std::cout << "Size of char : " << sizeof('a') << "\n";
    return 0;
}

Output:

Size of char : 1
Size of char : 1

Why does sizeof('a') return different values in C and C++?

Aykhan Hagverdili
whacko__Cracko
  • The `"%d"` format requires an `int` argument (or something that promotes to `int`). `sizeof` yields a result of type `size_t`. Either convert to `int` using a cast or, if your implementation supports it, use `"%zu"`. – Keith Thompson Nov 09 '11 at 19:55

4 Answers

401

In C, the type of a character constant like 'a' is actually an int, with size of 4 (or some other implementation-dependent value). In C++, the type is char, with size of 1. This is one of many small differences between the two languages.
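If you want to see this on your own compiler, here is a minimal C11 sketch (my own illustration, not part of the original answer) that uses _Generic to report the type assigned to 'a' when the file is compiled as C; the analogous check in C++ (for example with std::is_same) would report char instead.

#include <stdio.h>

int main(void)
{
    /* _Generic selects the branch whose type matches the controlling
       expression; in C, 'a' has type int, so the int branch is chosen. */
    const char *type = _Generic('a',
                                char:    "char",
                                int:     "int",
                                default: "other");
    printf("Type of 'a' : %s\n", type);
    return 0;
}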

Eric Postpischil
  • Okay, Can you please specify the standard reference ? :) – whacko__Cracko Jan 31 '10 at 19:20
  • This is pretty much a generic answer given the above code, :P – rmn Jan 31 '10 at 19:21
  • In the C++ Standard it's section 2.13.2/1, in C 6.4.4.4, at least in the doc I've got. – Jan 31 '10 at 19:24
  • +1 (Except that, while the "size of 4" obviously applies to nthrgeek's platform, it doesn't necessarily apply to all platforms.) – sbi Jan 31 '10 at 19:24
  • @nthrgeek: I'm too lazy to quote both standards, but the C++ standard has an appendix dedicated to incompatibilities with C. Under Appendix C.1.1, it mentions that "Type of character literal is changed from `int` to `char`", which explains the behavior. :) – jalf Jan 31 '10 at 19:28
  • It makes sense that C and C++ would have this difference. C++ is much more strongly typed than C. – Omnifarious Jan 31 '10 at 19:41
  • @nthrgeek: §6.4.4.4, paragraph 10: "An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer." – Stephen Canon Jan 31 '10 at 19:41
  • @Omnifarious: It's especially needed in C++ for overloading: `void foo(int); void foo(char);` That's not an issue in C. – sbi Jan 31 '10 at 19:46
  • @nthrgeek: You should not be asking for a standard reference unless you are having an argument about a specific point and you want to understand why the other person has a different opinion. If everybody agrees, just accept it. You (as a developer) should be quite intelligent enough to quickly find a common answer like this all by yourself. – Martin York Feb 01 '10 at 05:07
  • What about `sizeof` and types of unicode characters? `sizeof('Ç') == 4` and `std::is_same() == true` is what I get from g++ while in UTF-8 (which is what I use) Ç should take only two bytes. – sasha.sochka Mar 11 '15 at 13:19
  • The answer can be improved: sizeof(char) == 1 in C++, and 1 means 1 byte in C++, but 1 byte does not necessarily mean 8 bits. I know that the answer is already old, but it would be great to add this: **[intro.memory]** _The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined._ – Sergey.quixoticaxis.Ivanov Jul 13 '17 at 21:15
32

As Paul stated, it's because 'a' is an int in C but a char in C++.

I cover that specific difference between C and C++ in something I wrote a few years ago, at: http://david.tribble.com/text/cdiffs.htm

David R Tribble
23

In C the type of a character literal is int, while in C++ it is char. In C++ this is required to support function overloading. See this example:

void foo(char c)
{
    puts("char");
}
void foo(int i)
{
    puts("int");
}
int main()
{
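    // 'i' has type char in C++, so overload resolution selects foo(char)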
    foo('i');
    return 0;
}

Output:

char
minmaxavg
Kite
8

In C, a character literal is not of type char; C treats a character literal as an integer. So there is no difference between sizeof('a') and sizeof(1).

So, in C, the size of a character literal is equal to the size of an int.
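As a small compile-time check of that claim (my own addition, assuming a C11 compiler for static_assert):

#include <assert.h>
#include <stdio.h>

int main(void)
{
    /* Both assertions hold when this file is compiled as C,
       because 'a' is an int-typed constant there. */
    static_assert(sizeof('a') == sizeof(int), "'a' has type int in C");
    static_assert(sizeof('a') == sizeof(1), "same size as the int constant 1");
    printf("sizeof('a') = %zu\n", sizeof('a'));
    return 0;
}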

In C++, a character literal has type char. cppreference says:

1) narrow character literal or ordinary character literal, e.g. 'a' or '\n' or '\13'. Such literal has type char and the value equal to the representation of c-char in the execution character set. If c-char is not representable as a single byte in the execution character set, the literal has type int and implementation-defined value.

So, in C++ a character literal has type char, and therefore the size of a character literal in C++ is one byte.

Also, in your programs you have used the wrong format specifier for the result of the sizeof operator.

C11 §7.21.6.1 (P9) :

If a conversion specification is invalid, the behavior is undefined.275) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

So you should use the %zu format specifier instead of %d; otherwise the behaviour is undefined in C.
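For reference, a corrected version of the printf calls from the question might look like this; it sketches both options mentioned above, %zu for a size_t argument, or a cast to int for implementations whose printf does not support %zu:

#include <stdio.h>

int main(void)
{
    printf("Size of char : %zu\n", sizeof(char));      /* %zu matches size_t */
    printf("Size of 'a'  : %d\n", (int)sizeof('a'));   /* cast, then use %d */
    return 0;
}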

msc
  • `%zu` is not supported on many platforms; for better portability, use `(int)sizeof(char)` and the format `%d`. – chqrlie Nov 01 '17 at 11:16
  • The value of character literals is not necessarily the corresponding ASCII code. It depends on the source and execution character sets and whether the `char` type is signed or unsigned by default. – chqrlie Nov 01 '17 at 11:18