
Consider the following code:

char str[] = "Hello\0";

What is the length of the str array, and how many 0s does it end with?

Deduplicator
UmmaGumma
  • @Ashot Martirosyan: Do you need to know about C as well, or are you only interested in the answer for C++? – CB Bailey Jan 17 '11 at 09:36
  • @Charles Bailey I am mainly using C++, but of course I want to know whether there is any difference between C and C++. That's why I added the C++ tag. – UmmaGumma Jan 17 '11 at 09:38
  • @UmmaGumma If you just tag it with C, then people who only know C can answer. If you tag it with both C and C++, you limit the set of people who can reply to only those who understand the subtle differences between the two languages. Tagging something with both languages should only be done if the question really does require that level of knowledge and expertise -- an understanding of the subtle differences between the two languages. – David Schwartz Nov 08 '18 at 20:29

7 Answers


sizeof str is 7 - five bytes for the "Hello" text, plus the explicit NUL terminator, plus the implicit NUL terminator.

strlen(str) is 5 - the five "Hello" bytes only.

The key here is that the implicit NUL terminator is always added - even if the string literal just happens to end with \0. Of course, strlen just stops at the first \0 - it can't tell the difference.

There is one exception to the implicit NUL terminator rule - if you explicitly specify the array size, the string will be truncated to fit:

char str[6] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 6 (with one NUL)
char str[7] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 7 (with two NULs)
char str[8] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 8 (with three NULs per C99 6.7.8.21)

This is, however, rarely useful: it is easy to miscalculate the string length and end up with an unterminated string. It is also forbidden in C++, where a literal that does not fit in the array, terminator included, is an error.

bdonlan
  • You should add that this kind of truncation is only valid in C, not in C++. – fredoverflow Jan 17 '11 at 09:15
  • Your `char [8]` example seems wrong. If the OP had used `char str[8] = { 'H', 'e', 'l', 'l', 'o', '\0', '\0' };` the remaining character's value would _not_ be undefined, it would be zero (so that you can sanely initialize, e.g. `int arr[100] = { 0 }` to be all zeroes). I don't see why it would be any different for `"Hello\0"` than it is for the long form, unless the standard explicitly makes an exception for this case (which would seem very strange to me.) – Chris Lutz Jan 17 '11 at 09:19
  • @Chris, yes, I updated it presumably while you were writing your response :) – bdonlan Jan 17 '11 at 09:50
  • Incidentally, the paragraph in question: "If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration." – bdonlan Jan 17 '11 at 09:50
  • FYI, the null character (also called the null terminator) is abbreviated NUL, so the term "NUL terminator" is a little bit confusing for me. – V-SHY Feb 17 '15 at 03:58
  • @V-SHY Then I suspect you don't understand what words and definitions are. A word is not a literal substitute for its definition. A word is a conceptual unit whose definition can help point to the unitary concept the word represents. That it might be convenient to define the word "NUL" using the word "terminator" doesn't make "NUL terminator" redundant. The same is even true of abbreviations, and that's why there's nothing wrong with saying "PIN number" or "scuba gear". – David Schwartz Oct 28 '15 at 07:25

The length of the array is 7: the NUL character \0 still counts as a character, and the string is still terminated with an implicit \0.

See this link to see a working example

Note that had you declared str as char str[6]= "Hello\0"; the length would be 6 because the implicit NUL is only added if it can fit (which it can't in this example.)

§ 6.7.8/p14
An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

Examples

char str[] = "Hello\0"; /* sizeof == 7, Explicit + Implicit NUL */
char str[5]= "Hello\0"; /* sizeof == 5, str is "Hello" with no NUL (no longer a C-string, just an array of char). The explicit \0 does not fit either, so expect a compiler diagnostic about an over-long initializer */
char str[6]= "Hello\0"; /* sizeof == 6, Explicit NUL only */
char str[7]= "Hello\0"; /* sizeof == 7, Explicit + Implicit NUL */
char str[8]= "Hello\0"; /* sizeof == 8, Explicit + two Implicit NUL */
SiegeX
  • The length of the "string" (as C functions view strings) is 5. The `sizeof` operator returns 7. – Chris Lutz Jan 17 '11 at 09:07
  • @ChrisLutz: The question asked was "What is the length of str _array_ " (my emphasis) so this answer is correct. – CB Bailey Jan 17 '11 at 09:11
  • @Chris Ok, I'll concede that the `C` vernacular has different meanings for *length* and *size* with respect to strings. In my answer I was referring to the latter. – SiegeX Jan 17 '11 at 09:12

I want to mention one situation that may confuse you.

What is the difference between "\0" and ""?

As an array, "\0" is {0, 0} and "" is {0}.

"\0" is still a string literal, so the compiler appends a terminating '\0' after the explicit one. "" is empty, but it also gets the terminating '\0'.

Understanding this will help you understand "\0" more deeply.

chris

Banging my usual drum solo of JUST TRY IT, here's how you can answer questions like that in the future:

$ cat junk.c
#include <stdio.h>

char* string = "Hello\0";

int main(int argc, char** argv)
{
    printf("-->%s<--\n", string);
}
$ gcc -S junk.c
$ cat junk.s

... eliding the unnecessary parts ...

.LC0:
    .string "Hello"
    .string ""

...

.LC1:
    .string "-->%s<--\n"

...

Note here how the string I used for printf is just "-->%s<--\n", while the global string is in two parts: "Hello" and "". The GNU assembler also terminates strings with an implicit NUL character, so the fact that the first string (.LC0) is in those two parts indicates that there are two NULs. The string is thus 7 bytes long.

Generally, if you really want to know what your compiler is doing with a certain hunk of code, isolate it in a dummy example like this and see what it's doing using -S (for GNU -- MSVC has a flag too for assembler output but I don't know it off-hand). You'll learn a lot about how your code works (or fails to work as the case may be) and you'll get an answer quickly that is 100% guaranteed to match the tools and environment you're working in.

JUST MY correct OPINION
  • ... unless the thing we're testing happens to be undefined behavior, in which case the answer might only be 100% guaranteed to match the tools and environment at the moment it's tested. Furthermore, if the thing we're testing is implementation-defined, then to really get the answer, we'd have to test it on all possible implementations. (And we'd also have to *know* it's implementation-defined, but if we already knew that, we wouldn't have had to ask.) Furthermore, to test in this way, we'll need to know the rules for GNU assembler as well as the language we're actually trying to work in. – Rob Kennedy Jan 17 '11 at 15:12

What is the length of the str array, and how many 0s does it end with?

Let's find out:

#include <stdio.h>

int main() {
  char str[] = "Hello\0";
  int length = sizeof str / sizeof str[0];
  // "sizeof array" is the bytes for the whole array (must use a real array, not
  // a pointer), divide by "sizeof array[0]" (sometimes sizeof *array is used)
  // to get the number of items in the array
  printf("array length: %d\n", length);
  printf("last 3 bytes: %02x %02x %02x\n",
         str[length - 3], str[length - 2], str[length - 1]);
  return 0;
}
Fred Nurk
char str[]= "Hello\0";

That would be 7 bytes.

In memory it'd be:

48 65 6C 6C 6F 00 00
H  e  l  l  o  \0 \0

Edit:

  • What does the \0 symbol mean in a C string?
    It marks the end of a string. It is the null character; in memory it is a byte with the value zero. Functions that handle char arrays usually look for this character, because it marks the end of the message. I'll put an example at the end.

  • What is the length of str array? (Answered before the edit part)
    7

  • and with how much 0s it is ending?
    Your array has two elements holding zero: str[5] == str[6] == '\0' == 0

Extra example:
Let's assume you have a function that prints the contents of a text array, and that the array is declared as:

char str[40];

Now, you could change the content of that array (I won't get into details on how to), so that it contains the message: "This is just a printing test". In memory, you should have something like:

54 68 69 73 20 69 73 20 6a 75 73 74 20 61 20 70 72 69 6e 74
69 6e 67 20 74 65 73 74 00 00 00 00 00 00 00 00 00 00 00 00

So you print that char array. And then you want a new message. Let's say just "Hello":

48 65 6c 6c 6f 00 73 20 6a 75 73 74 20 61 20 70 72 69 6e 74
69 6e 67 20 74 65 73 74 00 00 00 00 00 00 00 00 00 00 00 00

Notice the 00 at str[5]. That's how the print function knows how much it actually needs to send, regardless of the actual length of the array and the rest of its contents.

L. Lopez
  • You are not answering the original question "what does the symbol mean". Please expand your answer to address the original question. – Michal Nov 08 '18 at 20:28
  • Other answers already mention that `str` is an array of size 7, including the accepted answer from seven years ago. Why repeat it yet again (without adding anything new)? – melpomene Nov 08 '18 at 20:37
  • @Michal, you do realize the original post has 3 questions, right? – L. Lopez Nov 10 '18 at 18:18
  • @melpomene. I do apologize for that. I expanded the answer and hopefully it clarifies further and adds more, as you seem to want. – L. Lopez Nov 10 '18 at 18:20

'\0' is referred to as the null character (NUL) or null terminator. It is the character whose value is the integer 0 (zero). It is not the same thing as the NULL pointer.

In C it is generally used to mark the end of a string. For example, after char a[] = "Arsenic"; every character is stored in its own array element:

a[0]=A
a[1]=r
a[2]=s
a[3]=e
a[4]=n
a[5]=i
a[6]=c
a[7]=\0

The last element of the array contains '\0', which marks where the string 'a' ends. It counts toward the size of the array but not toward the length of the string.