
The following code appears on page 135 of "C in a Nutshell (2nd Edition)":

#include <stddef.h>              // Definition of the type wchar_t
/* ... */
wchar_t dinner[] = L"chop suey"; // String length: 10;
                                 // array length: 11;
                                 // array size: 11 * sizeof(wchar_t)

In the above example, I would think "chop suey" is the same as 'c', 'h', 'o', 'p', ' ', 's', 'u', 'e', 'y', '\0'. That's 10 elements in the array.

My question is: Why is the "array length" different from the "String length" in this example? Where is this length of 11 coming from? Is there something special about the wchar_t type that is causing this?

Joshua Schlichting
  • `L"chop suey"` is `L'c'`, `L'h'`, `L'o'`, `L'p'`, `L' '`, `L's'`, `L'u'`, `L'e'`, `L'y'`, `L'\0'`. That took some time to type... – DeiDei Aug 10 '19 at 14:38
  • A C string is as long as the number of characters between the beginning of the string and the terminating null character, while the array also stores the `\0`. That accounts for the difference of 1. – Tony Tannous Aug 10 '19 at 14:42
  • @melpomene wasn't aware [char is not guaranteed to be 8 bits](https://stackoverflow.com/questions/881894/is-char-guaranteed-to-be-exactly-8-bit-long). – Tony Tannous Aug 10 '19 at 14:45
  • @Tony But that's the problem. That gets us to 10; I want to know where this 11 is coming from. I don't fully understand how we got to 11 just because the type is larger than char in memory. – Joshua Schlichting Aug 10 '19 at 14:45
  • @Tony Your own link says `wchar_t` is usually a 32-bit type. (Also, that was for C++, where `wchar_t` is a built-in type.) – melpomene Aug 10 '19 at 14:46
  • @melpomene has the correct answer. I went back into the book and found this: "The length of a string is considered to be the number of characters excluding the terminating null character." This was on page 134. This isn't the first time I've encountered confusing typos in this book. Overall, it has still been a good read. – Joshua Schlichting Aug 10 '19 at 14:57

2 Answers


That looks like an off-by-one error. Most likely someone just miscounted the characters.

chop suey is 9 characters (that's the length of the string); the array has size 10 because it needs to store the NUL terminator that marks the end of the string.

melpomene
  • It is important to distinguish between "size in bytes" and "size as number of elements" so as to avoid confusion. – machine_1 Aug 10 '19 at 14:41
  • This is the correct answer. I fell back to page 134 which states: "The length of a string is considered to be the number of characters excluding the terminating null character." Thanks, @melpomene! – Joshua Schlichting Aug 10 '19 at 14:55

The correct values can be checked with the following program:

#include <stdio.h>
#include <wchar.h>

int main(void) 
{
    wchar_t dinner[] = L"chop suey";

    printf( "sizeof( wchar_t ) = %zu\n", sizeof( wchar_t ) );
    printf( "wcslen( dinner ) = %zu, sizeof( dinner ) = %zu\n", wcslen( dinner ), sizeof( dinner ) );

    return 0;
}

On a platform where wchar_t occupies 4 bytes, the program output is

sizeof( wchar_t ) = 4
wcslen( dinner ) = 9, sizeof( dinner ) = 40

You can run the program yourself using your compiler.

The function wcslen counts the number of wchar_t symbols until the terminating zero is encountered. The operator sizeof returns the number of bytes (including the terminating zero) occupied by the array dinner.

So the string length is actually 9; that is, the terminating zero is excluded from the string length. Including the terminating zero, the array holds 10 elements of type wchar_t.

The definition of the type wchar_t is implementation-defined, so its size, and therefore the size of the array in bytes, can differ between platforms.

Vlad from Moscow