22
 printf("%lu \n", sizeof(*"327"));

I always thought that size of a pointer was 8 bytes on a 64 bit system but this call keeps returning 1. Can someone provide an explanation?

haccks
lordgabbith
  • 30
    `sizeof(*"327")` is `sizeof(char)` since `*` dereferences the first char of your literal string. just try `sizeof(char *)` – Jean-François Fabre Oct 12 '17 at 08:51
  • 2
    String literals are arrays of characters, and you get a pointer to its first character (type `char *`). Now, when you dereference a pointer to `char` what do you get? – Some programmer dude Oct 12 '17 at 08:52
  • 8
    I also recommend you read e.g. [this `printf` (and family) reference](http://en.cppreference.com/w/c/io/fprintf), because `"%lu"` is the wrong format for `sizeof` arguments. – Some programmer dude Oct 12 '17 at 08:58
  • 1
    Can you provide an explanation for the star? Why did you put the star in? – Martin James Oct 12 '17 at 09:26
  • 8
    Sidenote: `sizeof` is no function that could be called. – Gerhardh Oct 12 '17 at 10:34
  • 7
    [you can't call `sizeof`](https://stackoverflow.com/q/1393582/995714) – phuclv Oct 12 '17 at 15:20
  • 4
    You may have been thinking of `sizeof(&"327")`. – user2357112 Oct 12 '17 at 16:49
  • @Someprogrammerdude. The page you referenced does not contain the 'special' formats for pointers, sizeof(), etc. – user3629249 Oct 13 '17 at 03:47
  • `sizeof` is a compile time operator, similar to `return` as it is not a function – user3629249 Oct 13 '17 at 03:49
  • @user3629249 Yes it does. Scroll down a little bit, look at the large table. You will find the `p` conversion specifier for pointers, and for `size_t` you will find it in the `z` argument-type sub-column. – Some programmer dude Oct 13 '17 at 05:37
  • 1
    @user3629249 `sizeof` is an *operator* but `return` is a *statement*. The `sizeof` operator is *mostly* compile-time, but can't be for e.g. [variable-length arrays](https://en.wikipedia.org/wiki/Variable-length_array). And the `return` *statement* is all run-time, since it can't really be executed at compile-time, especially if it's supposed to return a computed value. – Some programmer dude Oct 13 '17 at 05:47

4 Answers

61

Putting * before a string literal dereferences it: a string literal is an array of characters, and in this context it decays to a pointer to its first element. The statement

printf("%zu \n", sizeof(*"327")); 

is equivalent to

printf("%zu \n", sizeof("327"[0]));

"327"[0] gives the first element of the string literal "327", which is the character '3'. The type of "327", after decay, is char *; dereferencing it gives a value of type char, and sizeof(char) is always 1.

haccks
  • I haven't programmed in C for about 10 years, but when did sizeof('3') become sizeof(int)? A character literal was always one byte. Have they finally defaulted to using wide characters for Unicode support? – dwilliss Oct 12 '17 at 14:05
  • 10
    @dwilliss; In C, `'3'` is an integer character constant. As per the C standard, an integer character constant has type `int`. – haccks Oct 12 '17 at 14:07
  • @haccks so is `'3'` really a different type than `'m'`?! I find it hard to believe so I must be misunderstanding you. – Lorraine Oct 12 '17 at 14:20
  • 3
    @Wilson; No. `'3'` has the same type as `'m'`: both are character constants. But I don't know why the standard used "integer character constant". *"An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer."*--C11:6.4.4.4(p10). – haccks Oct 12 '17 at 14:26
  • 6
    @Wilson You misunderstand. All character constants have always been of type `int` in C. Only in C++ is `'3'` (or `'m'`) of type `char`. Don't believe it? `int main() { printf("%zu\n", sizeof 'm');}` – trent Oct 12 '17 at 14:26
  • 1
    I wonder if this is something that changed (or was clarified by) C99. I also remember compiler flags to make chars the same as wchar_t. But that was in the 80's. – dwilliss Oct 12 '17 at 14:49
  • 5
    @dwilliss [it has always been an `int` in C](https://stackoverflow.com/q/2172943/995714) – phuclv Oct 12 '17 at 15:22
  • 6
    It's not really dereferencing to `'3'` (i.e. a character literal). It is dereferencing a string literal (`char*`) to the first element which will be a `char` and hence a single byte (`*(char*) = (char)`). It just happens that the value of the dereferenced string literal is `(char)'3'` because that is the first character in the string. – Tom Carpenter Oct 12 '17 at 17:17
  • 5
    @dwilliss `sizeof('3')` has always been the same as `sizeof(int)`. Likewise, `sizeof(*"3")` has always been the same as `sizeof(char)`. As illogical as it may sound, a character literal is not of type `char`. In fact, *all* literals in C are at least the size of `int`. To get something of type `char`, you need an expression with something other than a literal in it. – Mark Reed Oct 12 '17 at 22:34
  • 2
    @MarkReed the compound literal `(char){'3'}` has size 1 – M.M Oct 13 '17 at 00:51
  • I would count a typecast as "something other than a literal", but fair enough. :) – Mark Reed Oct 13 '17 at 14:57
  • @MarkReed; Interestingly, `(char)` in `(char){'3'}` is not a typecast! He said "compound literal" :) – haccks Oct 13 '17 at 15:03
  • No, it absolutely is. You can replace the `char` inside the parentheses with `short`,`int`, or `long` and get the corresponding size. – Mark Reed Oct 13 '17 at 15:26
  • 1
    @MarkReed; *"A postfix expression that consists of a parenthesized type name followed by a brace-enclosed list of initializers is a compound literal. It provides an unnamed object whose value is given by the initializer list."* - n1570: 6.5.2.5/3. Try to compile these two separately to see the difference: `printf("%d\n", ++(char){'3'});` and `printf("%d\n", ++(char)('3'));`. You will see the difference between a cast operator and compound literal. – haccks Oct 13 '17 at 15:37
18

The statement:

printf("%lu \n", sizeof(*"327"));

actually prints the size of a char, because * dereferences the pointer to the first character of the string "327". To print the size of a pointer instead, change it to:

char* str = "327";
printf("%zu \n", sizeof(str));

Note that we need to use %zu here, instead of %lu, because we are printing a size_t value.

Cody Gray - on strike
Marievi
4

The string literal is an anonymous, static array of chars, which decays to a pointer to its first character -- that is, a pointer value of type char *.

As a result, an expression like *"abc" is equivalent to *someArrayOfCharName, which in turn is equivalent to *&firstCharInArray, which yields firstCharInArray. And sizeof(firstCharInArray) is sizeof(char), which is 1.

trent
CiaPan
  • 1
    *Almost* correct. Even while the array is read-only (can't be modified), it's not `const`. So it decays to a plain non-const `char *`. This is one of the things that makes C and C++ different. – Some programmer dude Oct 12 '17 at 09:04
  • 1
    Technically it doesn't decay unless used in a decay context . This is why `sizeof "abcd"` is 5. Applying the dereference operator is a decay context of course – M.M Oct 13 '17 at 00:52
0

Good answer by haccks.

Also, the behaviour of your code is undefined, because you have used the wrong format specifier.

So, use %zu instead of %lu: the sizeof operator yields a value of type size_t, and %zu is the conversion specification for size_t.

C11 Standard: §7.21.6.1: Paragraph 9:

If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

Peter Mortensen
msc
  • 6
    While this is true, this should be a comment. There is no UB in case `size_t` is of smaller or equal size as `unsigned long`. – Lundin Oct 12 '17 at 12:27
  • 4
    @Lundin No matter what type `size_t` is, either it doesn't promote, or it promotes to `signed int`/`unsigned int`, so the only way you'll get an `unsigned long` passed to `printf` is if `size_t` already is an `unsigned long`. I don't see how you can say, then, that the behaviour is defined if `size_t` is anything else. –  Oct 12 '17 at 13:05
  • @hvd Fine, but this isn't a real-world problem and it doesn't answer the question. In the real world, either both long and size_t are of the same size and there is no problem. Or long is 32 bit while size_t is larger, in which case you might end up reading part of the size_t but you don't invoke any UB by that. – Lundin Oct 12 '17 at 13:11
  • 1
    @Lundin "but you don't invoke any UB by that" -- This answer already explains why you're wrong. The standard explicitly states the behaviour is undefined, and it *has* to be undefined, because there's no sensible way to define it that works for all calling conventions. Agreed that it doesn't answer the question, but it's entirely accurate and would have been worth posting as a comment. –  Oct 12 '17 at 14:49
  • It *might be* undefined; it's a common implementation choice for `size_t` to be a typedef for `unsigned long` – M.M Oct 13 '17 at 00:53