Are the \ and 0 characters stored in the same location or in different locations at the end of the string?

#include <stdio.h>

main()
{
    char x[]="Hello\0";
    char y[]="Hello12";
    char z[]="Hello\012";
    char w[]="Hello1234";

    printf("%d %d %d %d", sizeof(x), sizeof(y), sizeof(z), sizeof(w));
}

Output:

7 8 7 10

Please explain the output of the code.

Hari
  • Note: better to use a matching format specifier: `printf("%zu ..., sizeof ...)`. – chux - Reinstate Monica Oct 02 '14 at 14:32
  • What do you want me to use? Doesn't sizeof() return an int value? – Hari Oct 02 '14 at 14:35
  • Use `printf("%zu", sizeof(x))`. `sizeof` yields a value of the unsigned integer type `size_t`, which is defined in `<stddef.h>`. In some environments, `size_t` takes up the same space as `int`, so `printf("%d", sizeof(x))` does not cause trouble as long as `sizeof(x) <= INT_MAX`. In other environments, the size difference will immediately cause trouble. See http://stackoverflow.com/questions/2524611/how-to-print-size-t-variable-portably – chux - Reinstate Monica Oct 02 '14 at 14:39
  • Curious: Why did you think `sizeof()` returned `int` versus any other type? – chux - Reinstate Monica Oct 02 '14 at 14:45
  • I was not sure what it returns. I randomly picked the int data type, just to clarify with you :). Thank you. – Hari Oct 02 '14 at 14:48
  • Note: If you get into a situation where the type of the number is not well known or varies among platforms, like `clock_t, time_t`, or the format specifier varies, like with `int64_t`, here is a trick: cast to the widest integer type and use `'j'`. Example: `time_t t; printf("%jd", (intmax_t) t);`. There are still a few issues with this approach, but it does solve many of them. – chux - Reinstate Monica Oct 02 '14 at 15:29

4 Answers

\0 in a C string is a single character, ASCII value 0. All C string literals also include an implicit terminating \0 character, regardless of what else is included in the string (even another \0).

\012 is an octal escape sequence for the single character with ASCII value 10 (Line Feed).

So:

char x[]="Hello\0";      // 5 letters + your \0 + implicit \0 == 7
char y[]="Hello12";      // 7 letters + implicit \0 == 8
char z[]="Hello\012";    // 5 letters + \012 + implicit \0 == 7
char w[]="Hello1234";    // 9 chars + implicit \0 == 10
Paul Roub

No, \0 is an octal escape sequence and takes up one character position:

Array  Contents   Size
x      Hello\0    5 for the characters, one for the explicit \0, one for the implicit null terminator
y      Hello12    7 for the characters, one for the implicit null terminator
z      Hello\012  5 for the characters, one for the \012, one for the implicit null terminator
w      Hello1234  9 for the characters, one for the implicit null terminator
APerson
  • this answer is incorrect. \0 is not octal, it's an ASCII character. – Woodrow Barlow Oct 02 '14 at 13:31
  • @WoodrowBarlow No. C parses it as octal, but since it's in a character string it gets treated as a character. Internally, of course, everything's a number, so there's no difference between the octal `\0` and the ASCII "null" character. – APerson Oct 02 '14 at 13:32
  • @APerson I amend my statement: it doesn't matter if it's octal. it's zero, no matter what base it's parsed as, and that doesn't affect the number of bytes it consumes. there is, in fact, a large difference between \0 (dec:0) and H (dec:72). – Woodrow Barlow Oct 02 '14 at 13:35
  • Compilers treat \ followed by 1, 2 or 3 octal digits (digits 0-7) as an octal escape, so from the compiler's perspective `\0` is the value 0. `\00` and `\000` are equivalent as well. At the end of the day, when the compiler sees `\0` it equates it to the value `0`. But I have to agree that most lexers for C will treat `\0` as an octal number with the single digit '0'. – Michael Petch Oct 02 '14 at 18:12

First, you are using implicit int return-type. Please desist.

Next, string literals are parsed thus:

First convert to characters, then concatenate neighboring strings, finally add an implicit sentinel 0.

char x[]="Hello\0";   // 'H' 'e' 'l' 'l' 'o' 0                 sentinel-0
char y[]="Hello12";   // 'H' 'e' 'l' 'l' 'o' '1'   '2'         sentinel-0
char z[]="Hello\012"; // 'H' 'e' 'l' 'l' 'o' '\012'            sentinel-0
char w[]="Hello1234"; // 'H' 'e' 'l' 'l' 'o' '1'   '2' '3' '4' sentinel-0

The escape-sequences used are both octal:

'\0'  for character 0
'\012'  for character 10
Deduplicator

As others have stated, \0 is a single escape sequence and \012 is also a single escape sequence, so each occupies one character. In addition, all string literals in C automatically have a \0 appended.

Array Index:  0   1   2   3   4   5   6   7   8   9
          x:  H   e   l   l   o  NUL NUL
          y:  H   e   l   l   o   1   2  NUL
          z:  H   e   l   l   o   LF NUL
          w:  H   e   l   l   o   1   2   3   4  NUL

NUL and LF are the names given to the ASCII characters with octal values 0 and 12. See: http://www.asciitable.com/

Jeffery Thomas