7

The C programming book says that my string must be null terminated to print it using printf, but still the following program prints the string despite it being non-null terminated!

#include <stdio.h>
#include <stdlib.h>

int main() {
    int i;
    char str[10];
    for(i = 0; i < 10; i++) {
        str[i] = (char)(i+97);
    }

    printf("%s", str);
}

I am using the Code::Blocks IDE.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Nikunj Banka
  • 11,117
  • 16
  • 74
  • 112
  • 6
    You just happened to get lucky, there was a `0` somewhere right after that array. You've just exploited something seasoned programmers call "undefined behavior". Don't count on being able to do that everywhere, most programs would crash. – Chris Eberle Dec 01 '12 at 04:38
  • I have run the loop 1000 times but it always runs the same without any issues . Should I consider changing my compiler from codeblocks ? thanks . – Nikunj Banka Dec 01 '12 at 05:10
  • 4
    undefined doesn't mean non-repeatable. Try compiling with different flags, or compiling on a different architecture. – Chris Eberle Dec 01 '12 at 05:17
  • 4
    This is just like drunk driving. If you do that, you will get away most of the times. You are just endangering yourself and others. – Jens Gustedt Dec 01 '12 at 08:12
  • http://stackoverflow.com/questions/3767284/using-printf-with-a-non-null-terminated-string – Ciro Santilli OurBigBook.com May 12 '17 at 16:36

5 Answers

10

It's undefined behavior to read beyond the bounds of an array. You were actually unlucky it didn't crash. If you run it enough times or call it in a function, it may (or may not) crash.

You should always terminate the string, or use a width specifier:

printf("%.10s", str);
iabdalkader
  • 17,009
  • 4
  • 47
  • 74
5

Whatever is after the 10th element of str happens to be a null. That null is outside the defined bounds of the array, but C doesn't have array bounds checking. It's just luck in your case that that's how it worked out.

Andy Lester
  • 91,102
  • 13
  • 100
  • 152
  • thanks but now I have run the same code in a loop 1000 times and each time it works without any issues . So now shall I consider this a matter of concern with the compiler itself ?(that it is automatically placing a null character ) – Nikunj Banka Dec 01 '12 at 04:59
  • 3
    Any time you find yourself wondering if the compiler is broken, you aren't looking hard enough at your own code or your own understanding of the problem. As a novice, *always* assume that the problem is with you. – Andy Lester Dec 01 '12 at 05:02
  • #include #include #include int main(){ int i ; for(i = 0 ; i < 1000 ; i++ ) { char str[10] ; for(i = 0 ; i < 10 ; i++ ) { str[i] = (char)(i+97) ; } printf("%s",str) ; } return 0 ; } – Nikunj Banka Dec 01 '12 at 05:03
  • 3
    @user189535: So now declare another array on the next line after `char str[10]`. Call it `char str2[10]`. Fill it with 'B' or whatever you want. Now print str and see what happens. – indiv Dec 01 '12 at 05:08
  • yeah that caught the problem ! Now it is printing a garbage value too! how did you figure that out ? – Nikunj Banka Dec 01 '12 at 05:15
  • 1
    @NikunjBanka: Well, technically it was luck that it printed garbage. I meant to say to declare an array on the line *before* str[10]. Depends on your machine architecture. But anyway, it has to do with how local variables come one after the other in memory. So when `printf` goes beyond your array boundary into la-la land, it ends up printing another variable in memory. But this is outside the scope of C. The language C just says the behavior is undefined if you go off into la-la land. – indiv Dec 01 '12 at 05:28
1

According to the C standard, the printf function prints the characters of the string until it finds a null character. If there is none inside the array, it keeps reading past the end of the array, and what happens then is undefined.

I have tested your code. And after printing "abcdefghij", it prints some garbage value.

Debobroto Das
  • 834
  • 6
  • 16
  • I am using codeblocks IDE and I do not get any garbage value . Which compiler are you using . Should I consider changing my compiler ? thanks . – Nikunj Banka Dec 01 '12 at 05:02
  • 3
    **Your code is invalid. Your code's behavior is undefined.** The C spec said that reading beyond the end of the array is undefined behavior, meaning that the compiler can make anything happen. The compiler is not broken. Your code is. – Andy Lester Dec 01 '12 at 05:10
1

If you do other things before that call, your stack area will contain leftover data instead of untouched memory. Imagine this:

#include <stdio.h>
#include <string.h>

/* Leaves 500 'X' bytes behind on the stack. */
void use_stack(void) {
    char str[500];
    memset(str, 'X', sizeof(str));
    printf("Filled memory from %p to %p.\n", (void *)str, (void *)(str + sizeof str));
}

void print_stuff(void) {
    int i;
    char str[16]; // Changed that so that 10..15 contain X
    for(i = 0; i < 10; i++) {
        str[i] = (char)(i+97);
    }

    // Deliberately unterminated, so this is still undefined behaviour.
    printf("%s<END>", str); // Have a line break before <END>? Then it comes from i.
    printf("&str: %p\n", (void *)&str);
    printf("&i: %p\n", (void *)&i);
    // Here you see that i follows str as &i is &str + 16 (0x10 in hex)
}

int main(void) {
    use_stack();
    print_stuff();
    return 0;
}

Here your stack area will be full of Xs, and printf() will see them.

In your situation and your environment, the stack is coincidentally "clean" on program start.

This may or may not happen. If the compiler places the variable i immediately after the array, your data will nevertheless be NUL-terminated: on a little-endian machine the first byte of i holds its value (10 after the loop, which gets printed as well and shows up as a line break), and the second byte is a NUL byte. Even if this is the case, your code invokes UB (undefined behaviour).

Can you check (by piping the program output into hexdump or a similar tool) whether your output contains a 0A byte? If so, my guess is correct. I just tested it, and with my compiler (GCC) this is what happens.

As I said, this is nothing you should rely on. If you see a line break before <END>, my guess was right. And if you look at the pointers that are now printed, you can compare their addresses in memory.

glglgl
  • 89,107
  • 13
  • 149
  • 217
  • I don't know what a hexdump is but the code you gave runs without crashing or printing any garbage values . I am running the code on code blocks IDE on windows 7 . – Nikunj Banka Dec 01 '12 at 06:19
  • @NikunjBanka Yeah, it was because of the said fact that `i` follows `str` on the stack and because `str[]` is fully used. If you do `str[15]`, the elements 10..14 of `str[]` will be filled with `X`. – glglgl Dec 01 '12 at 08:10
0

Because in a debug (unoptimized) build, *(str+10) and the rest of the unused stack space happen to contain 0, so the string looks like it is 0-terminated. Nothing guarantees this.

clang -O0 t.c -o t # Compile without optimization (debug-style build)
./t

Output:

abcdefghij

And:

clang -O2 t.c -o t # Compile with optimization
./t

Output:

abcdefghij2÷d=
Haocheng
  • 1,323
  • 1
  • 9
  • 14
  • But the `NUL` character is expected to sit behind the array. So it doesn't matter if `str[]` is initialized to `\0` (where do you think that happens?), the oddity lies beyond the bounds of the array. – glglgl Dec 01 '12 at 06:05
  • I mean `*(str+10)`, not the whole array in the bounds. – Haocheng Dec 01 '12 at 13:11