
I have the following code:

#include <cstdlib>   // for system()
#include <iostream>

using std::cout;

const int ARRAY_SIZE = 15;

union one4all
{
    int int_value;
    long long_value;
    double double_value;
    char char_value[ARRAY_SIZE];
};

int main()
{
    one4all x;
    for (int i = 0; i < ARRAY_SIZE; ++i)
    {
        x.char_value[i] = static_cast<char>('a' + i);  // 'a' is 97 in ASCII
    }

    cout << x.char_value << '\n';

    system("pause");
}

This code printed gibberish after the last character. After some research, I learned that the reason is that the output routine cannot find a terminating character, so it keeps on printing.

https://i.stack.imgur.com/7MFvS.png

So, I updated my code to this:

#include <cstdlib>   // for system()
#include <iostream>

using std::cout;

const int ARRAY_SIZE = 15;

union one4all
{
    int int_value;
    long long_value;
    double double_value;
    char char_value[ARRAY_SIZE];
};

int main()
{
    one4all x;
    for (int i = 0; i < ARRAY_SIZE; ++i)
    {
        x.char_value[i] = static_cast<char>('a' + i);  // 'a' is 97 in ASCII
    }

    x.char_value[ARRAY_SIZE - 1] = '\0';

    cout << x.char_value << '\n';

    system("pause");
}

This worked perfectly because the array now has a terminating character at the end (it replaces the last letter, so 14 letters are printed).

https://i.stack.imgur.com/WVsBB.png

This raised questions:

  • Why did it keep printing after the 15th character? I know that it could not detect any terminating character, but hey! it knew the size of the character array?
  • If for some reason it cannot detect the size of the array, why did it not bombard my screen with more gibberish characters and what motivated it to stop printing the gibberish characters?
  • Programming Rage
    • Put a `NUL` (`'\0'`) character after the last character you want to see. – πάντα ῥεῖ Sep 04 '20 at 19:46
    • 1
      That is not the question I asked. I know that already. – Programming Rage Sep 04 '20 at 19:47
    • 2
      *"what motivated it to stop printing the gibberish characters"* - passing a char array without terminating null character to `std::cout` invokes *undefined behavior* - whatever is printed is printed (in practice most likely until it stumbles upon a `0` in memory – UnholySheep Sep 04 '20 at 19:47
    • _"but hey! it knew the size of the character array?"_ That doesn't matter. The compiler just does what you tell it to do. – πάντα ῥεῖ Sep 04 '20 at 19:48
    • _"and what motivated it to stop printing the gibberish characters?"_ it found a `NUL` byte by chance. – πάντα ῥεῖ Sep 04 '20 at 19:49
    • @πάνταῥεῖ you did not understand the question. I am asking why did the compiler not print more gibberish characters? If it found a NULL byte by chance, then on different runs of the program it should print different numbers of gibberish characters. Why is it the same number of gibberish characters on every run of the program? – Programming Rage Sep 04 '20 at 19:54
    • Someone marked this question as duplicate, without even reading the answers on that post. On that post the answers are mainly telling to use `\0` at the end. I am asking why? – Programming Rage Sep 04 '20 at 19:55
    • @ProgrammingRage _"Why is it the same number of gibberish characters on every run of the program?"_ Because the memory is initialized in the same way for every run. – πάντα ῥεῖ Sep 04 '20 at 19:56
    • @πάνταῥεῖ right, but I am not initializing a NULL byte anywhere in memory – Programming Rage Sep 04 '20 at 19:56
    • 1
      @πάντα ῥεῖ That's not a good duplicate, as it focuses on cases where the array will automatically have NUL padding when the assigned content is shorter than the array size. This question is more about decay to `const char*` and the expectations of `cout`, which doesn't even feature in the other question you linked as a duplicate. – Tony Delroy Sep 04 '20 at 20:07
    • @ProgrammingRage There's a lot more going on in the program than just what you wrote. Think about `std::cout`: where do you think that comes from? – πάντα ῥεῖ Sep 04 '20 at 20:07
    • @πάνταῥεῖ I know that there is a lot going on behind the scenes. And `cout` comes from the header file `#include <iostream>`. How does that remotely answer my question? – Programming Rage Sep 04 '20 at 20:12
    • @ProgrammingRage `std::cout` is an **initialized** global variable. – πάντα ῥεῖ Sep 04 '20 at 20:14

    1 Answer

    Why did it keep printing after the 15th character? I know that it could not detect any terminating character, but hey! it knew the size of the character array?

    "It" might, if you mean the compiler generally, but you called the streaming function operator<<(std::ostream&, const char*), which is only matched because the array has decayed to a pointer, and the length is lost in that conversion. That's just the way the Standard Library function has been written.

    If it tried to be smarter, it's not clear what would work best anyway: say the Standard Library provided...

    template <size_t N>
    std::ostream& operator<<(std::ostream&, const char(&)[N]);
    

    ...so knowledge of the array size would be available - should it always stream a number of characters determined by the character array size, or stop early if it hits a NUL?

    It would still be a pain elsewhere: if you pass a char[] argument to a non-template function, it will tend to decay to a char* parameter, so the called function no longer has the array size available when it tries to stream the value.
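The loss of size information can be seen in a small sketch. Both function names below are hypothetical and exist only to contrast the two parameter styles.

```cpp
#include <cstddef>

// The array is bound by reference, so N is deduced and the size survives.
template <std::size_t N>
std::size_t size_seen_by_reference(const char (&)[N])
{
    return N;
}

// A plain pointer parameter: the array decays at the call site,
// and only the size of the pointer itself is known here.
std::size_t size_seen_by_pointer(const char* p)
{
    return sizeof p;
}
```

For a `char buf[15]`, the first function returns 15, while the second returns `sizeof(char*)` regardless of how large the array was.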

    In short, the idea of a streaming operator that uses knowledge of the array size just doesn't hold up, because text buffers and pointers to them are passed around a program.

    Tony Delroy
    • Is it not common sense to use the knowledge of array size to be a terminating factor? Why would I ever want to print something beyond the size of the character array? – Programming Rage Sep 04 '20 at 19:59
    • There are times when you might, for example - there's a common but not-Standard-compliant practice of putting a `char var_length_field[0]` field at the end of a `struct`, so if the code using that struct is careful it can put it in a larger memory area and access content just past its end using e.g. `var_length_field[n]` for however many extra bytes the program logic guarantees it. – Tony Delroy Sep 04 '20 at 20:01
    • More generally though, it's so much more common to have variable-length strings - even in a fixed length array buffer - that it's just expected you'll terminate them. If you don't want that, use `cout.write(x.char_value, sizeof x.char_value);` – Tony Delroy Sep 04 '20 at 20:03
    • Wait, how is it accessing memory outside of its designated memory area? If that is possible, what is the point of having the `new` keyword, since we could also use this approach to create dynamic arrays if they can access content just past their end? – Programming Rage Sep 04 '20 at 20:06
    • 1
      You can do things like `char* p = new char[200]; auto* p_ms = new(p) MyStruct{arg1, arg2}; p_ms->var_length_field[2] = 'c';` safely as long as the system's minimum alignment for pointers returned by `new` is at least as large as that needed by `MyStruct` (otherwise there are other techniques), and if `sizeof MyStruct` is no more than 197 bytes, such that `var_length_field[2]` isn't outside the 200 allocated bytes. – Tony Delroy Sep 04 '20 at 20:12
    • 1
      @ProgrammingRage [Array Decay](https://stackoverflow.com/questions/1461432/what-is-array-to-pointer-decay). You know the size of the array, but the function doing the printing... doesn't. That said, there are some really cool template tricks to make the printing function infer the size, but `<<` doesn't use any of them. It just sees a `char *` and goes looking for the terminator. – user4581301 Sep 04 '20 at 20:12
    • 1
      In other words, if you start doing things like that, your code has to orchestrate the allocation and usage very carefully - taking full responsibility to stay within the bounds of properly allocated memory. – Tony Delroy Sep 04 '20 at 20:13
    • Aah, better not go there! – Programming Rage Sep 04 '20 at 20:14