24

In my code:

 char *str[] = {"forgs", "do", "not", "die"};
 printf("%d %d", sizeof(str), sizeof(str[0]));  

I'm getting the output as 12 2, so my doubts are:

  1. Why is there a difference?
  2. Both str and str[0] are char pointers, right?
Grijesh Chauhan
  • 6
    No, not right at all. `str` is an array, `str[0]` an array element. – Kerrek SB Jul 10 '13 at 07:20
  • the output ? or any syntax?? – Nithish Inpursuit Ofhappiness Jul 10 '13 at 07:21
  • 6
    Are you quite sure of the values 12 and 2 you say you are getting? That would be an unusual architecture. By the way, `%d` is the wrong format to print a `size_t` (the type of `sizeof(…)`). One solution here would be to ensure there is no misunderstanding with `(int)sizeof(str)` and `(int)sizeof(str[0])`. – Pascal Cuoq Jul 10 '13 at 07:24
  • 1
    @NithishInpursuitOfhappiness Read: [What does sizeof(&arr) returns?](http://stackoverflow.com/questions/15177420/what-does-sizeofarr-returns/15177499#15177499) – Grijesh Chauhan Jul 10 '13 at 07:24
  • 2
    The output you say you are getting is unrealistic. It can't be 12 and 2. – AnT stands with Russia Jul 10 '13 at 07:32
  • 9
    On whatever architecture you are, the first value should be 4 times the second one. On a 32 bit machine, you should get `16 4`, on a 64 bit one `32 8`. On a very old one or on an embedded system, you might even get `8 2`, but never `12 2` as the array contains 4 element of the same size – glglgl Jul 10 '13 at 07:33
  • @glglgl There have been architectures where pointers to different types have different sizes. Though the OP probably is not using one of these. – Pascal Cuoq Jul 10 '13 at 07:35
  • 3
    @PascalCuoq Even then this wouldn't apply here: the array should have `n` times the size of a `char *`, whatever this is, with `n` being the number of array elements. – glglgl Jul 10 '13 at 07:37
  • 1
    `sizeof(str)=size of str array = (number of element)*sizeof(char*)` `sizeof(str[0]) = size of first element = sizeof(char*)` – Dayal rai Jul 10 '13 at 07:38
  • @Dayalrai Right. As I said. – glglgl Jul 10 '13 at 07:42
  • @NithishInpursuitOfhappiness One more good question can be how they stored in memory. Please check [help-center](http://stackoverflow.com/posts/17564608/revisions). – Grijesh Chauhan Jul 10 '13 at 08:08
  • Check [this](http://stackoverflow.com/questions/1641957/is-array-name-a-pointer-in-c) too – Suvarna Pattayil Jul 30 '13 at 13:25
  • @glglgl *`There have been architectures where pointers to different types have different sizes`* **?** reference? – Grijesh Chauhan Sep 14 '13 at 06:33
  • @GrijeshChauhan See [this answer](http://stackoverflow.com/a/1539196/296974), referring to the [C FAQ](http://c-faq.com/null/machexamp.html): "Some 64-bit Cray machines represent `int *` in the lower 48 bits of a word; `char *` additionally uses some of the upper 16 bits to indicate a byte address within a word." – glglgl Sep 14 '13 at 06:43
  • BTW, your question should address @PascalCuoq, as you quote from him. – glglgl Sep 14 '13 at 06:48
  • why are 20 people following this question – M.M May 22 '18 at 03:22
  • @PascalCuoq: I guess this could be a 16 bit architecture, but I can't explain how he got `printf` to emit values in base 6. – jxh May 22 '18 at 06:01

4 Answers

66

Although the question has already been answered and accepted, I am adding a more detailed description (which also answers the original question) that I hope will help new users. As far as I could find, this is not explained elsewhere (at least on Stack Overflow), so I am adding it here.

First read: sizeof Operator

6.5.3.4 The sizeof operator, 1125:
When you apply the sizeof operator to an array type, the result is the total number of bytes in the array.

According to this, when sizeof is applied to the name of an array (an actual array object, not memory obtained through malloc and accessed via a pointer), the result is the size in bytes of the whole array rather than the size of an address. This is one of the few exceptions to the rule that an array name is converted (decays) to a pointer to the first element of the array. It works because the operand keeps its array type, and for ordinary fixed-size arrays that size is known at compile time, when the sizeof operator is evaluated.

To understand it better consider the code below:

#include <stdio.h>
int main(void){
 char a1[6],       // One dimensional
      a2[7][6],     // Two dimensional
      a3[5][7][6];  // Three dimensional

 // sizeof yields a size_t, which is printed with %zu (C99)
 printf(" sizeof(a1)   : %zu \n", sizeof(a1));
 printf(" sizeof(a2)   : %zu \n", sizeof(a2));
 printf(" sizeof(a3)   : %zu \n", sizeof(a3));
 printf(" char         : %zu \n", sizeof(char));
 printf(" char[6]      : %zu \n", sizeof(char[6]));
 printf(" char[7][6]   : %zu \n", sizeof(char[7][6]));
 printf(" char[5][7][6]: %zu \n", sizeof(char[5][7][6]));

 return 0;
}

Its output:

 sizeof(a1)   : 6 
 sizeof(a2)   : 42 
 sizeof(a3)   : 210 
 char         : 1 
 char[6]      : 6 
 char[7][6]   : 42 
 char[5][7][6]: 210 

You can check this at Codepad. Notice that the size of char is one byte; if you replace char with int in the program above, every output is multiplied by sizeof(int) on your machine.

Difference between char* str[] and char str[][] and how both are stored in memory

Declaration-1: char *str[] = {"forgs", "do", "not", "die"};

In this declaration str is an array of pointers to char. Each element str[i] points to the first char of the corresponding string in {"forgs", "do", "not", "die"}.
Logically, str is arranged in memory as follows:

Array Variable:                Constant Strings:
---------------                -----------------

         str:                       201   202   203   204  205   206
        +--------+                +-----+-----+-----+-----+-----+-----+
 343    |        |= *(str + 0)    | 'f' | 'o' | 'r' | 'g' | 's' | '\0'|
        | str[0] |-------|        +-----+-----+-----+-----+-----+-----+
        | 201    |       +-----------▲
        +--------+                  502   503  504
        |        |                +-----+-----+-----+
 347    | str[1] |= *(str + 1)    | 'd' | 'o' | '\0'|
        | 502    |-------|        +-----+-----+-----+
        +--------+       +-----------▲
        |        |                  43    44    45    46
 351    | 43     |                +-----+-----+-----+-----+
        | str[2] |= *(str + 2)    | 'n' | 'o' | 't' | '\0'|
        |        |-------|        +-----+-----+-----+-----+
        +--------+       +-----------▲
 355    |        |
        | 9002   |                 9002  9003   9004 9005
        | str[3] |                +-----+-----+-----+-----+
        |        |= *(str + 3)    | 'd' | 'i' | 'e' | '\0'|
        +--------+       |        +-----+-----+-----+-----+
                         +-----------▲


Diagram: str[i] points to the first char of each constant string literal.
The memory addresses shown are made up for illustration.

Note: the elements of str are stored in contiguous memory, while each string literal is stored wherever the compiler places it (not necessarily next to the others).

[ANSWER]

According to Codepad, the following code:

#include <stdio.h>

int main(void){
    char *str[] = {"forgs", "do", "not", "die"};
    // sizeof yields a size_t, so print it with %zu
    printf("sizeof(str): %zu,  sizeof(str[0]): %zu\n", 
            sizeof(str), 
            sizeof(str[0])
    );  
    return 0;
}

Output:

sizeof(str): 16,  sizeof(str[0]): 4
  • In this code str is an array of 4 char pointers, where each char* is 4 bytes in size, so according to the quote above the total size of the array is 4 * sizeof(char*) = 16 bytes.

  • The data type of str is char*[4].

  • str[0] is just a pointer to char, so it is four bytes. The data type of str[i] is char*.

(Note: on some systems an address can be 2 bytes or 8 bytes.)

Regarding output one should also read glglgl's comment to the question:

On whatever architecture you are, the first value should be 4 times the second one. On a 32 bit machine, you should get 16 4, on a 64 bit one 32 8. On a very old one or on an embedded system, you might even get 8 2, but never 12 2 as the array contains 4 element of the same size

Additional points:

  • Because each str[i] is a char* variable (the pointer itself is not constant), it can be assigned the address of a different string, for example: str[i] = "yournewname"; is valid for i = 0 to 3.

One more important point to notice:

  • In the example above str[i] points to a constant string literal, which must not be modified; hence str[i][j] = 'A' is invalid. The C standard makes this undefined behavior, and on systems that place literals in read-only memory it typically fails at run time.
    But if str[i] points to an ordinary char array, then str[i][j] = 'A' can be a valid expression.
    Consider the following code:

      char a[] = "Hello"; // a[] is an ordinary, writable array
      char *str[] = {"forgs", "do", "not", "die"};
      //str[0][4] = 'A'; // error: writes to a string literal (read-only memory)
      str[0] = a;
      str[0][4] = 'A'; // valid because str[0] now points to a writable array;
                       // a becomes "HellA" (index 5 holds the terminating '\0')
    

Check here working code: Codepad

Declaration-2: char str[][6] = {"forgs", "do", "not", "die"};:

Here str is a two-dimensional array of chars (every row has the same size) of size 4 * 6. (Remember that the column count must be given explicitly in the declaration of str; the row count is 4 because there are 4 strings.)
In memory, str looks like the diagram below:

                    str
                    +---201---202---203---204---205---206--+
201                 | +-----+-----+-----+-----+-----+-----+|   
str[0] = *(str + 0)--►| 'f' | 'o' | 'r' | 'g' | 's' | '\0'||
207                 | +-----+-----+-----+-----+-----+-----+|
str[1] = *(str + 1)--►| 'd' | 'o' | '\0'| '\0'| '\0'| '\0'||
213                 | +-----+-----+-----+-----+-----+-----+|
str[2] = *(str + 2)--►| 'n' | 'o' | 't' | '\0'| '\0'| '\0'||
219                 | +-----+-----+-----+-----+-----+-----+|
str[3] = *(str + 3)--►| 'd' | 'i' | 'e' | '\0'| '\0'| '\0'||
                    | +-----+-----+-----+-----+-----+-----+|
                    +--------------------------------------+
  In Diagram:                                 
  str[i] = *(str + i) = points to a complete i-row of size = 6 chars. 
  str[i] is an array of 6 chars.

This arrangement of 2D array in memory is called Row-Major: A multidimensional array in linear memory is organized such that rows are stored one after the other. It is the approach used by the C programming language.

Notice differences in both diagrams.

  • In the second case, the complete two-dimensional char array is allocated in contiguous memory.
  • For any i = 0 to 2, the addresses of str[i] and str[i + 1] differ by 6 bytes (the length of one row).
  • The double boundary line in this diagram means that str represents the complete 6 * 4 = 24 chars.

Now consider code similar to what you posted in your question, but for a 2-dimensional char array (check at Codepad):

#include <stdio.h>

int main(void){
    char str[][6] = {"forgs", "do", "not", "die"};
    // sizeof yields a size_t, so print it with %zu
    printf("sizeof(str): %zu,  sizeof(str[0]): %zu\n", 
            sizeof(str), 
            sizeof(str[0])
    );
    return 0;
}

Output:

sizeof(str): 24,  sizeof(str[0]): 6

Given how sizeof treats arrays, applying it to a 2-D array returns the size of the entire array, which is 24 bytes here.

  • sizeof applied to an array name returns the size of the entire array. So sizeof(str) returns 24, the size of the complete 2-D char array of 24 chars (6 cols * 4 rows).

  • In this declaration the type of str is char[4][6].

  • One more interesting point: str[i] is an array of chars and its type is char[6]. So sizeof(str[0]) is the size of one complete row = 6.

Additional points:

  • In the second declaration str[i][j] is not constant, and its content can be changed, e.g. str[i][j] = 'A' is a valid operation.

  • str[i] is the name of a char array of type char[6]; it is not assignable, so an assignment such as str[i] = "newstring" is illegal (in fact it is a compile-time error).

One more important difference between two declarations:

In Declaration-1: char *str[] = {"forgs", "do", "not", "die"};, the type of &str is char*(*)[4]; it is the address of an array of 4 char pointers.

In Declaration-2: char str[][6] = {"forgs", "do", "not", "die"};, the type of &str is char(*)[4][6]; it is the address of a 2-D char array of 4 rows and 6 cols.

If one wants to read similar description for 1-D array: What does sizeof(&array) return?

Grijesh Chauhan
  • 1
    *... the result is the size of the entire array rather than the size of the pointer represented by the array identifier.* If the identifier represents an array then it doesn't represent any pointer. – P.P Jul 18 '13 at 06:18
  • 2
    Your quote says "...rather the size of pointer represented by array". An array name doesn't represent a pointer. It can conveniently decay into a pointer but that doesn't make it a pointer. – P.P Jul 18 '13 at 06:26
  • You don't really need to quote and explain it yourself. If you want to quote one then look at 6.5.3.4 in C11 which states `...When applied to an operand that has array type, the result is the total number of bytes in the array` and may be the definition of sizeof in the same section as well. – P.P Jul 18 '13 at 07:05
  • 1
    A typo here. The addresses would be (decimal values) `343, 351, 359, 367` instead of `343, 344, 345, 346`. for 32-bit architecture – noufal Sep 04 '13 at 11:41
  • @noufal Thanks corrected the typo you noticed, but I am assuming 4 byte `char*` according to codepade I linked the code – Grijesh Chauhan Sep 04 '13 at 19:14
8

In most cases, an array name decays to the address of its first element, with the type pointer to the element type. So you would expect a bare str to have a value equal to &str[0], with type pointer to pointer to char.

However, this is not the case for sizeof. In this case, the array name maintains its type for sizeof, which would be array of 4 pointer to char.

The return type of sizeof is a size_t. If you have a C99 compiler, you can use %zu in the format string to print the value returned by sizeof.

jxh
  • But that doesn't explain the `12 2`... nevertheless +1 – glglgl Jul 10 '13 at 07:35
  • 3
    @glglgl The “12 2” is probably explained by `size_t` being 64-bit and invoking undefined behavior when passed to `printf` with a `%d` format, as per my comment. – Pascal Cuoq Jul 10 '13 at 07:37
  • @PascalCuoq This might be the case, but then I'd rather expect a `0` somewhere, no matter if little or big endian... But one never knows what strange things the one or other platform makes. – glglgl Jul 10 '13 at 07:39
  • 2
    If your implementation supports it, use `"%zu"` to print a `size_t` value. If it doesn't (Microsoft likely doesn't), use `"%lu"` and convert the value to `unsigned long`. Or, if you're lazy and the values are known to be small, use `"%d"` and convert to `int`. – Keith Thompson Jul 15 '13 at 20:16
  • The unconventional values may also be the result of not including a proper prototype for `printf` in the program, and so the system applied the variable argument macros to something random. – jxh May 22 '18 at 05:39
6

It's 16 4 on my computer, and I can explain this: str is an array of char*, so sizeof(str)==sizeof(char*)*4

I don't know why you get 12 2 though.

Grijesh Chauhan
Manas
3

The two are different. str is an array of char pointers (in your example its type is char*[4]), while str[0] is a single char pointer.

The first sizeof returns the size of the four char pointers that the array contains, and the second returns the size of one char*.
In my tests the results are:

sizeof(str[0]) = 4   // = sizeof(char*)

sizeof(str) = 16  
            = sizeof(str[0]) + sizeof(str[1]) + sizeof(str[2]) + sizeof(str[3])
            = 4 * sizeof(char*)  
            = 4 * 4
            = 16
Grijesh Chauhan
superarce