Playing around with templates, I ran into an interesting phenomenon with respect to arrays and how their size is defined, something I thought was not allowed in C++.

I used a global variable to define the size of an array inside main(), and it worked for some reason (see code below).

1) Why does this even compile? I thought only constexpr may be used for array size

2) Suppose the above is valid, it still does not explain why it works even when sz = 8 which is clearly less than the size of the character string

3) Why are we getting that weird '@' and '?'. I tried different combination of strings, for example, only number characters ("123456789") and it did not appear.

Appreciate any help. Thanks.

Here is my code

#include <iostream>
#include <cstring>


using namespace std;


int sz = 8;

int main()
{
    const char temp[sz] = "123456abc"; //9 characters + 1 null?

    cout << temp << endl;

    return 0;
}

Output:

123456ab� @

User 10482
  • Which compiler are you using? It is likely an extension allowing it. – Retired Ninja May 31 '19 at 17:03
  • I am using the online compiler here: https://www.onlinegdb.com/online_c++_compiler It doesn't say which version. – User 10482 May 31 '19 at 17:09
  • gdb is the debugger for gcc. gcc allows non-constant array sizes (Search term: Variable Length Arrays) by compiler extension. – user4581301 May 31 '19 at 17:13
  • Looks like the Variable length array support has a side effect that kills the error gcc normally spits out for trying to overfill an array. I have to confirm with documentation to see whether you should expect an error at all. Seems like a good place for one, but I can't remember if an error for this is required. – user4581301 May 31 '19 at 17:18
  • Since Variable Length Array is non-standard behaviour, exactly what happens is either documented by GCC (which may be borrowed from the C standard where variable length arrays are legal. Optional, but legal) or not documented at all. What it looks like happened is the array took the first 8 characters and stopped before the null terminator. Printing an unterminated string often results in behavior like this – user4581301 May 31 '19 at 17:22
  • Is it a fail-safe built in the compiler? like printing '@' upon detecting NON-null terminated c type string. However, it does not print the same with number characters as I said in question 3. – User 10482 May 31 '19 at 17:27
  • @User10482 No, printing a character array that isn't null terminated is undefined behavior, anything can happen. It's likely that the two last characters are just what happens to be in memory at that location at that time. – François Andrieux May 31 '19 at 17:29
  • @FrançoisAndrieux It was uncannily consistent with results for multiple test cases. Put in alphabetical characters and the weird symbols appear, just number characters and it went away. No hint of randomness. – User 10482 May 31 '19 at 17:34
  • @User10482 Undefined doesn't mean random. Your observations may be consistent but the behavior is definitely not. It just means there aren't enough variations in the conditions of your observations. Try with a different compiler, on a different platform or even just with different flags. Also try in the context of a larger project, changes in apparently unrelated portions of your code can change how your UB manifests itself. – François Andrieux May 31 '19 at 17:36
  • Add some compiler flags to that online compiler. These should help `-Wall -Wextra -Werror -pedantic -pedantic-errors` and should give you `main.cpp:12:23: error: ISO C++ forbids variable length array ‘temp’ [-Wvla]` – Ted Lyngmo May 31 '19 at 17:39
  • @TedLyngmo Yep. That gave an error. I should refrain from using it. Makes me wonder what other non-standard stuff slipped through when I was coding there. Thanks everyone. Appreciate the help. – User 10482 May 31 '19 at 18:06
  • @User10482 You can disable non-standard extensions by compiling with `-std=c++14` rather than the default `-std=gnu++14`. Replace `14` with `11` or `17` depending on what language standard you use. Then your VLA will fail to compile (and possibly more stuff). I highly recommend doing this. – Jesper Juhl May 31 '19 at 19:12
  • @User10482 That's the biggest problem with C++ in my opinion. You can't assume your code is correct because it compiles and does what you want. You need to *know* every line you wrote is correct and there's a lot to know in C++. This isn't limited to using extensions, UB is easy to come by. – François Andrieux May 31 '19 at 19:12

1 Answer

1) Why does this even compile? I thought only constexpr may be used for array size

The asker is correct that Variable Length Arrays (VLAs) are not Standard C++. The g++ compiler supports VLAs as an extension. How can this be? A compiler developer can add pretty much anything they want so long as the behaviour required by the Standard is met. It is up to the developer to document the behaviour.

2) Suppose the above is valid, it still does not explain why it works even when sz = 8 which is clearly less than the size of the character string

Normally g++ emits an error if an array is initialized with more values than it can hold. For an array with a constant bound the C++ Standard requires this: a string literal initializer with more characters (counting the null terminator) than the array has elements makes the program ill-formed, so the compiler must issue a diagnostic.

In this case it appears that the VLA extension has a side effect that eliminates the error g++ normally spits out for trying to overfill an array. This makes a lot of sense since the compiler doesn't know the size of the array, in the general case, at compile time and cannot perform the test. No test, no error.

None of this is covered by the C++ Standard because VLA is not supported.

A quick look through the C standard, which does permit VLAs, didn't turn up any guidance for this case. In C, the rules for initializing VLAs are pretty simple: you can't. The compiler doesn't know how big the array will be at compile time, so it can't initialize it. There may be an exception for string literals, but I haven't found one.

I also have not found a good description of the behaviour in GCC documentation.

clang produces the error I expect based on my read of the C standard: error: variable-sized object may not be initialized

Addendum: Probably should have checked Stack Overflow first and saved myself a lot of time: see "Initializing variable length array".

3) Why are we getting that weird '@' and '?'. I tried different combination of strings, for example, only number characters ("123456789") and it did not appear.

What appears to be happening, and since none of this is standard I have no quotes to back it up, is the string literal is copied into the VLA up to the size of the VLA. The portions of the literal past the end of the VLA are silently discarded. This includes the null terminator, and the behaviour is undefined when printing an unterminated char array.

Solution:

Stick to Standardized behaviour where possible. The major compilers have options to warn you of non-Standard code. Use -pedantic with compilers using the gcc options and /permissive- with recent versions of Visual Studio. When forced to step outside the Standard, consult the compiler documentation or the implementers themselves for the sticky-or-undocumented bits.

If you don't get good answers, try to find another path.

user4581301