#include <string.h>
#include <stdio.h>

int main(void)
{
    char str[10] = "testonetwo";
    printf("str [%s]\n", str);
    return (0);
}

I tried printing the string str and expected undefined behaviour, but printf prints str normally.

  • Sometimes undefined behavior looks similar to working properly. – Stephen Newell Dec 09 '22 at 12:34
  • I like the concept of "expecting undefined behaviour" :-). – Jabberwocky Dec 09 '22 at 12:35
  • `char str[10] = "testonetwo";` is **invalid** and won't work in C++. Also see [Why is the phrase: "undefined behavior means the compiler can do anything it wants" true?](https://stackoverflow.com/questions/49032551/why-is-the-phrase-undefined-behavior-means-the-compiler-can-do-anything-it-wan) – Jason Dec 09 '22 at 12:35
  • @StephenNewell But it works properly every time. – Youness Asserare Dec 09 '22 at 12:42
  • If I had a _wild_ guess at what may cause it to run fairly consistently: Your `char str[10]` is really, in essence, a `char*`. It points to the string literal `"testonetwo"` which _is_ null terminated, and stored in the data/text section of your executable. When `printf` starts to read this string, it may be hitting the literal's null terminator, despite it being outside of the bounds of what `str[10]` suggested. This is undefined behavior still (and more wonk will show up down the line), but overall I would guess that your code would stay consistent with just what you've shown. – Rogue Dec 09 '22 at 12:48
  • @Rogue: `str` does not point to a string literal. What often happens in practice, in the absence of optimization, is that `str` is created on the stack and initialized with the ten bytes “testonetwo”, and the following byte happens to be a null byte because this is a simple program that has not cluttered the stack with additional data. – Eric Postpischil Dec 09 '22 at 12:53
  • @JasonLiam True, but this question is about C where the initialization is perfectly valid, see https://stackoverflow.com/questions/13490805/why-does-gcc-allow-char-array-initialization-with-string-literal-larger-than-arr – nielsen Dec 09 '22 at 12:54
  • @Rogue You're misunderstanding things. `char str[10]="testonetwo";` is equivalent to `char str[10]={'t','e','s','t','o' ,'n','e','t','w','o'};` -- an on-the-stack (not any section, assuming the declaration is at block scope) array (≠ pointer) without the nul terminator. – Petr Skocik Dec 09 '22 at 12:54
  • @nielsen: The question was originally tagged with C++ as well as C. – Eric Postpischil Dec 09 '22 at 12:55
  • This is kind of the same thing as blowing up your own car on purpose then ask why one tire landed on your lawn and not in your neighbour's garden. First of all, who cares. Then as for why, well... probably there were padding bytes inserted after your 10 byte long array and maybe they had value zero? Or maybe we are writing out of bounds into the stack frame, killing canaries left and right. Who knows. Now what are we to do with that information? Store a secret message there? – Lundin Dec 09 '22 at 13:01
  • @YounessAsserare I'm not angry? Although some thousand beginners before you already created similar "spectacular" programs and insisted on knowing why their programs didn't misbehave when they invoked undefined behavior. [What is undefined behavior and how does it work?](https://software.codidact.com/posts/277486) – Lundin Dec 09 '22 at 13:41
  • @YounessAsserare One of the hardest lessons to learn in programming — many programmers never learn this — is that there is a huge difference between a program that works, versus one that works *for the right reasons*. It is possible for a program to (seem to) work hundreds of times, even though it contains glaring bugs. Thus you can not use "it works" as any kind of proof that "this program is correct". – Steve Summit Dec 09 '22 at 14:10
  • *how does printf know the end of a string when the null terminator is not part of the string?* It turns out this is a meaningless question. In C there is *no such thing* as a "string when the null terminator is not part of the string". All strings contain, by definition, a null terminator. An array of characters without a null terminator is just an array of characters, it is not a string, and therefore `printf` cannot print it reliably. – Steve Summit Dec 09 '22 at 14:14

1 Answer

how does printf know the end of a string when the null terminator is not part of the string?

printf does not know where the array str ends. If the call printf("str [%s]\n", str) is implemented with an actual call to a printf implementation (rather than optimized by the compiler to some other code), then str is converted to a pointer to its first element, and only this pointer is passed to printf. printf then examines memory byte by byte. For the first ten bytes, it sees the elements of str. Then it accesses memory outside of str. What happens then is usually one of:

  • There are additional non-null bytes in memory, and printf writes them too, until it finds a null character.
  • There is a null byte in memory immediately after str, and printf prints only the bytes in str.
  • The memory printf tries to access is not mapped, and a segmentation fault or other exception occurs.

If the compiler did optimize the call into other code, other behaviors may occur. The C standard does not impose any requirements on what may happen.

Eric Postpischil