I am a bit confused by the following code:

#include <iostream>

const char* f()
{
    const char* arr[]={"test"};
    return arr[0];
}

int main()
{
    auto x = f();
    std::cout << x;
}

In my opinion, this code should be UB (undefined behaviour). We return a pointer to a C-style array element inside a local scope. Things should go wrong. However, none of the compilers I tested with complain (I used -Wall -Wextra -pedantic on both g++ and clang). valgrind does not complain either.

Is the code above valid or is it UB as one would think?

PS: running it seems to produce the "correct" result, i.e. displaying "test", but that's not an indication of correctness.

psmears
vsoftco
  • 10
    FWIW the reason it "works" in practice is that the constant-string "test" is being stored in the executable's static-data area, and thus the string remains valid even after the function returns. (Whether or not it's guaranteed to work by the language spec is another issue, of course) – Jeremy Friesner Apr 03 '18 at 16:25
  • 2
    No harm in asking these things, and this question is well-written. +1. – Bathsheba Apr 03 '18 at 16:26
  • 2
    @JesperJuhl I know what UB is, and the fact that the moon may explode because of it. I'm asking if the code is really UB. And it looks like it's not. And it looks like quite a few people out there believe the code is UB... So I think the question is useful. – vsoftco Apr 03 '18 at 16:33
  • 1
    @Bathsheba It looks like we're still learning here :) Indeed the code **is not UB**. – vsoftco Apr 03 '18 at 16:46
  • 2
    @vsoftco: Didn't think it was, but it takes an expert like Barry to point out why. – Bathsheba Apr 03 '18 at 16:49
  • @Bathsheba Indeed – vsoftco Apr 03 '18 at 16:50
  • 1
    @TypeIA Yes, I saw it. I've probably gotten too used to auto-judging questions as soon as I see a pointer returned from a function :/ – Algirdas Preidžius Apr 03 '18 at 16:51
  • @AlgirdasPreidžius No worries, I realize that the title may be a bit off and read like a do-my-homework kind of question, but it really does describe the problem. If you're convinced the answer is correct I think it's a good idea to delete your comment, so readers can find the answer quicker. – vsoftco Apr 03 '18 at 16:52
  • @Bathsheba I think you can just remove the comment now so we don't pollute the question – vsoftco Apr 03 '18 at 17:12
  • It's important to realize that, even if no available compiler issues a diagnostic and all available compilers generate machine code that does what one expected the program to do, that _still_ doesn't mean the program doesn't have UB. Most of the arguments about compilers being "overzealous" with optimizing based on UB concern constructs that were _always_ UB since 1989, but compilers have only recently become sophisticated enough to notice. And in several cases where the standard says something is UB, it's because precise diagnostics would involve solving the Halting Problem. – zwol Apr 03 '18 at 17:47
  • @zwol Sure, the question here is whether this is UB or not. I'd have guessed it's UB, but it's not, as far as I can see from the answer and comments. – vsoftco Apr 03 '18 at 17:49
  • @vsoftco Right, in this case it happens not to be. I'm just saying that the test you performed is inconclusive. – zwol Apr 03 '18 at 18:00
  • Why use char* in C++? Use std::string instead. It manages its own memory. – dmg Apr 04 '18 at 03:55
  • You never return a pointer to the local array. You return a copy of a value which was stored in a local array; there's nothing wrong with that. – M.M Apr 04 '18 at 04:00

2 Answers

79

No, it's not UB.

This:

const char* f()
{
    const char* arr[]={"test"};
    return arr[0];
}

can be rewritten as the functionally equivalent:

const char* f()
{
    const char* arr0 = "test";
    return arr0;
}

So we're just returning a local pointer to a string literal. String literals have static storage duration ([lex.string]), so nothing dangles. The function really is the same as:

const char* f()
{
    return "test";
}
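
A minimal sketch to check this, assuming only the standard library. The names f_array, f_pointer, f_literal and check_variants are made up for illustration. Each variant returns a pointer to a string literal, so the characters can still be read after the call returns. The sketch compares contents only; whether the three returned pointers compare equal is left to the implementation.

```cpp
#include <cassert>
#include <cstring>

// Variant from the question: the pointer is stored in a local array first.
const char* f_array()
{
    const char* arr[] = {"test"};
    return arr[0];   // copies the pointer value out of the local array
}

// Same thing with a scalar local pointer instead of a one-element array.
const char* f_pointer()
{
    const char* arr0 = "test";
    return arr0;
}

// Same thing with no local variable at all.
const char* f_literal()
{
    return "test";
}

// Hypothetical helper: all three results still read as "test" after the call.
void check_variants()
{
    assert(std::strcmp(f_array(), "test") == 0);
    assert(std::strcmp(f_pointer(), "test") == 0);
    assert(std::strcmp(f_literal(), "test") == 0);
}
```

Calling check_variants() from main should pass silently in a build where assertions are enabled.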

If you did something like this:

const char* f() {
    const char arr[] = "test"; // local array of char, not array of char const*
    return arr;
}

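Now that is UB as soon as the caller reads through the result: we're returning a dangling pointer, because arr itself is destroyed when the function returns.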
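
If a real array is wanted rather than a pointer to a literal, one possible fix (a sketch, not the only option) is to give the array static storage duration as well. The names f_static and check_static below are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// The array is declared static, so it has static storage duration
// and is still alive after f_static returns.
const char* f_static()
{
    static const char arr[] = "test"; // static array of char, initialized once
    return arr;                       // decays to a pointer to arr[0]
}

// Hypothetical helper: the returned pointer can be read after the call.
void check_static()
{
    const char* p = f_static();
    assert(std::strcmp(p, "test") == 0);
    // Every call returns the address of the same static array.
    assert(f_static() == p);
}
```

The trade-off is that every caller now shares the same array, which is harmless here because it is const.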

Bathsheba
Barry
  • 1
    "String literals have static storage duration" - I know all the compilers do this; but is it really in the standards that string literals MUST have static storage duration? – UKMonkey Apr 03 '18 at 16:37
  • @UKMonkey Yep, added the reference. – Barry Apr 03 '18 at 16:37
  • Not to be pedantic, but the same thing should be true for C, right? Initially that's how I bumped into the issue. – vsoftco Apr 03 '18 at 16:41
  • @vsoftco As far as I'm aware. I'm not 100% sure what C does with its literals, I'm a C++ dog all the way - but that seems like a reasonable guess at least. – Barry Apr 03 '18 at 17:00
  • @Barry Thanks Barry, I can just search on a C standard now. I'm quite convinced things are OK now. – vsoftco Apr 03 '18 at 17:07
  • 3
    Not to be pedantic, but that's not really "equivalent". Functionally, in this context, sure, but an array is an array is an array. ;) – Lightness Races in Orbit Apr 03 '18 at 17:08
  • @LightnessRacesinOrbit :)))) Yes I totally agree. – vsoftco Apr 03 '18 at 17:11
  • Although the quality of this answer is above my pay grade, I've made an edit to soften the "equivalent". I was going to slip in a "broadly" or "functionally" but I think a higher power should do that. – Bathsheba Apr 03 '18 at 17:15
  • Well, it does result in exactly the same instructions generated and exactly the same data in memory. So seems quite equivalent to me. – jpa Apr 03 '18 at 19:05
  • @jpa In this context, the question is rather whether the standard says it's equivalent, not what some particular compiler did with it :) – BartoszKP Apr 03 '18 at 19:18
  • @BartoszKP Yes, and as far as I can tell, the C standard leaves quite little space for any differences here. sizeof(arr) must equal sizeof(arr0) by the standard. – jpa Apr 03 '18 at 19:21
  • 4
    @LightnessRacesinOrbit, In both cases, there is a local pointer to `const char`, in both cases the local pointer is initialized to point to a literal string, and in both cases its value is returned. The only difference is, in the original example, the local pointer is a member of a one element array, and in the alternate version, it's a scalar. So what if there is some sense in which those are not "equivalent"? What matters here is not the exact type of the local variable. What matters is whether or not the string literal will remain valid after the function returns. – Solomon Slow Apr 03 '18 at 21:44
  • 2
    C does also guarantee that string literals have static storage duration, see http://port70.net/~nsz/c/c11/n1570.html#6.4.5p6 . – zwol Apr 04 '18 at 02:18
3

The array arr has automatic storage duration and ceases to exist at the end of its scope. The string literal "test", however, is an array of const char with static storage duration, and arr[0] merely holds a pointer to it. Temporarily storing this pointer in the local array arr before returning it doesn't change that: the returned pointer always refers to static storage.

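Note that if the function returned a C++ std::string instead of a C-style const char *, the characters would be copied into the returned object, and their lifetime would then be tied to that object. If the caller only uses the result as a temporary, any pointer obtained from it (for example via c_str()) is valid only until the end of the full-expression, per the C++ rules for temporaries.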
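
A small sketch of that last point, assuming a hypothetical function g that returns std::string by value (g and check_string_lifetime are not from the question):

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical variant of f that returns an owning string by value.
std::string g()
{
    return "test"; // the characters are copied into the std::string
}

void check_string_lifetime()
{
    // Fine: s is a named object and owns the characters until it goes out of scope.
    std::string s = g();
    const char* p = s.c_str();
    assert(std::strcmp(p, "test") == 0);

    // Fine: the temporary returned by g() lives until the end of this full-expression.
    assert(std::strlen(g().c_str()) == 4);

    // Not fine, so left commented out: the temporary is destroyed at the ';',
    // and q would dangle on the next line.
    // const char* q = g().c_str();
}
```

Storing the result in a named std::string, as in the first case, is the simplest way to keep the characters alive for as long as they are needed.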