36

Can you please help me to understand the output of this simple code:

const char str[10] = "55\01234";
printf("%s", str);

The output is:

55
34
Bernhard Barker
Radoslaw Krasimirow
  • 6
    Just interested in why someone downvoted this ^^ – dhein Apr 23 '15 at 09:33
  • It's also worth noting that it is because you provided a string in double-quotes that it was interpreted this way. Because if you had added it one character at a time, you would have had to choose between `'\\', '0', '1', '2'`, `'\0', '1', '2'`, `'\012'` (which also happen to change the size of the char table). – Tonio Apr 23 '15 at 10:05
  • 1
    That's one of those fun little facts about C that are not well known. The same way if you write an integer starting with a `0`, it is automatically interpreted as octal. `int i = 012;` is the same as `int i = 10;` or `printf("%d\n", 012);` outputs `10`. – Tonio Apr 23 '15 at 10:16
  • 1
    SSDD: the [Java version](http://stackoverflow.com/questions/19108008/what-are-the-java-semantics-of-an-escaped-number-in-a-character-literal-e-g) – Thomas Weller Apr 23 '15 at 11:58
  • @Tonio if it's not widely known then I wonder how in the world people are learning C, since things like the formats of literals are usually covered in the introductory material before you even get to *statements*. – hobbs Apr 24 '15 at 00:30
  • 1
    @hobbs From what I could see, very few C programming courses talk about octals in general... And to be honest, it's not overly useful, why would anyone want to write `012` instead of `10` or `\012` instead of `\n`? And I know you can probably dig up an example where it's useful, but in general? – Tonio Apr 24 '15 at 08:17

3 Answers

42

The character sequence \012 inside the string is interpreted as an octal escape sequence. Octal 012 is 10 in decimal, which is the code of the line feed character in ASCII, the same character that \n produces on most platforms.

From the Wikipedia page:

An octal escape sequence consists of \ followed by one, two, or three octal digits. The octal escape sequence ends when it either contains three octal digits already, or the next character is not an octal digit.

Since your sequence contains three valid octal digits, that's how it's going to be parsed. It doesn't continue with the 3 from 34, since that would be a fourth digit and only three digits are supported.
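The stopping rule quoted above can be sketched as a small function. This is only an illustration of the rule, not how any particular compiler implements it; the name `parse_octal_escape` and its signature are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: s points just past the backslash.
   Reads at most three octal digits, stores their value in *value,
   and returns how many digits were consumed. */
size_t parse_octal_escape(const char *s, unsigned *value) {
    size_t n = 0;
    unsigned v = 0;
    /* Stop after three digits or at the first non-octal character. */
    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        v = v * 8 + (unsigned)(s[n] - '0');
        n++;
    }
    *value = v;
    return n;
}
```

Called on the text "01234" that follows the backslash in the question, it consumes three digits and yields 10, leaving "34" as ordinary characters.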

So you could write your string as "55\n34", which is more clearly what you're seeing and which would be more portable since it's no longer hard-coding the newline but instead letting the compiler generate something suitable.

unwind
  • Can you also add why it takes **12** as the octal number, and not **1** or **123**? – dragi Apr 23 '15 at 11:07
  • @Zaibis What? The page also says "Note that some three-digit octal escape sequences may be too large to fit in a single character; this results in an implementation-defined value for the character actually produced." which might be relevant for `\777` if that's what you're talking about (777 octal is 511 decimal, which is out of range for 8-bit character encodings). – unwind Apr 23 '15 at 11:20
17

\012 is an escape sequence that gives the octal code of a character:

012 (octal) = 10 (decimal) = 0xa (hexadecimal) = LINE FEED (in ASCII)

So your string looks like 55[LINE FEED]34.

The LINE FEED character is interpreted as a newline on many platforms. That is why you see two lines on the terminal.
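One way to check this is to print the string's bytes as numbers. The sketch below assumes an ASCII execution character set; `dump_bytes` is a hypothetical helper name, not something from the question.

```c
#include <stdio.h>
#include <string.h>

/* Prints each byte of s as a decimal code, one per line,
   and returns the number of bytes printed. */
size_t dump_bytes(const char *s) {
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++)
        printf("%zu: %d\n", i, (unsigned char)s[i]);
    return n;
}
```

With ASCII, `dump_bytes("55\01234")` prints the codes 53, 53, 10, 51, 52: five bytes, the third being the line feed.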

Kruti Patel
myaut
  • Under which specification that is given? – dhein Apr 23 '15 at 09:33
  • 2
    I never stumbled over this before, but it is in C99 (ISO/IEC 9899), if you want to add it: see 6.4.4.4 Character constants, point 3; point 9 restricts the values. – dhein Apr 23 '15 at 09:37
  • @Zaibis: thanks for the notes. I'm not sure quoting standard will make my answer more clear to OP. I'd prefer to keep wikipedia link. – myaut Apr 23 '15 at 09:41
  • I'm fine with that too. But the wiki link isn't 100% accurate in saying `\nnn` will take `n` as octal values (as point 9 states). I posted that as my own answer, not expecting it to become the accepted one. – dhein Apr 23 '15 at 09:44
  • Does it specify when an escape sequence must end? I remember once mistakenly thinking `\x` escape sequences to have a fixed length of 2, only to get an error "invalid escape sequence" in Visual Studio for, say, `"A\xBCD"`. Just reproduced it, VS2005 says `error C2022: '3021' : too big for character` But for the question's sequence, it says `warning C4125: decimal digit terminates octal escape sequence`, meaning it could be more standardized than hex sequences. – Medinoc Apr 23 '15 at 09:45
  • 1
    @Medinoc: this is, as I stated, what is incorrect on the wiki. Point 9 states (as you can see in my answer) that the escape sequence is limited to a value fitting into the range of unsigned char, which would limit hex to `\xFF` and octal to `\377`. – dhein Apr 23 '15 at 09:47
  • 1
    @Medinoc: octal sequences can have 1, 2 or 3 digits according to the grammar: http://c0x.coding-guidelines.com/6.4.4.4.html – myaut Apr 23 '15 at 09:50
  • @Medinoc Detail: Hexadecimal sequences have no length limit. Octal escape sequences are limited up to 3. If the value exceeds the `unsigned char` range the result is an implementation-defined value. – chux - Reinstate Monica Apr 23 '15 at 15:54
  • @Zaibis "the escape sequence is limited to be a value fitting into range of unsigned char. what would be for hex limited to \xFF and for octal: \377" is not so. "Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence." §6.4.4.4 7 which is max 3 for octal and unlimited for hexadecimal. The result of values outside the `unsigned char` range is implementation-defined, but the escape sequence itself is not so limited. – chux - Reinstate Monica Apr 23 '15 at 15:57
6

\012 is a newline escape sequence, as others have already stated. (As chux correctly commented, that may differ if the character set in use isn't ASCII. Either way, in this notation it is an octal value.)

The standard specifies this. C99 (ISO/IEC 9899) says:

6.4.4.4 Character constants

[...]

3 The single-quote ', the double-quote ", the question-mark ?, the backslash \, and arbitrary integer values are representable according to the following table of escape sequences:

single quote '           \'
double quote "           \"
question mark ?          \?
backslash \              \\
octal character          \octal digits
hexadecimal character    \x hexadecimal digits

And the range it gets bound to:

Constraints

9 The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the type unsigned char for an integer character constant, or the unsigned type corresponding to wchar_t for a wide character constant.
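A small sketch of what this range means in practice, assuming an 8-bit `unsigned char`. The function names are hypothetical. It also uses adjacent string literal concatenation to end an escape sequence early, since escapes are converted before the literals are joined.

```c
#include <string.h>

/* "\377" (octal 377 = 255) is the largest escape value that fits
   an 8-bit unsigned char. */
unsigned max_octal_escape(void) {
    return (unsigned char)"\377"[0];
}

/* A hex escape takes as many hex digits as follow it, so "A\xBCD"
   would be read as the single escape \xBCD. Splitting the literal
   ends the escape after "BC": the result is 'A', 0xBC, 'D'. */
size_t split_hex_length(void) {
    return strlen("A\xBC" "D");
}

/* The same trick gives a real NUL followed by digits: the string
   ends at the NUL, so strlen sees only "55". */
size_t nul_then_digits_length(void) {
    return strlen("55\0" "1234");
}
```

Under that assumption, `max_octal_escape()` returns 255, `split_hex_length()` returns 3, and `nul_then_digits_length()` returns 2.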

dhein