
I have run into something very confusing. I created a local char array in a function and returned the array name, but the returned value is null?

#include <iostream>
using namespace std;

char* get_string(){
    char local[] ="hello world\n";
    cout<<"1"<<(int)local<<endl;//shows a reasonable value
    return local;
}

int main(){
    char* p = get_string();
    cout<<"2"<<(int) p<<endl;//shows 0
    return 0;
}

I know it is not good to return a local variable, because once the function returns, the part of the stack that the local variable occupied can be reused by other function calls. Still, I expected this to return the address of the first element of the array, not null. I'm very confused; any help would be appreciated.

I use the 32-bit version of Qt; the compiler is MSVC2015 (I am a beginner when it comes to compilers and am not even sure that MSVC is the compiler's name).

--updated: I think this question is not a duplicate of "Returning an array using C". I know it is not valid to use automatic/local storage outside its scope; my question is why the return value becomes 0 despite this inappropriate use.

--ok, thank you, everyone. I think I found the answer. I looked at the assembly code of the function char* get_string(), and the last part of it is this:

0x44bce7     mov $0x0,%eax
0x44bcec     leave
0x44bced     ret

I think this is implementation defined and hard-coded in the compiler: if I return the address of a local variable, %eax or %rax is set to 0.

Jonathan Leffler
wangz
  • @wangz The code has undefined behavior because the array is not alive after exiting the function. – Vlad from Moscow Sep 12 '19 at 14:11
  • Let this be a lesson that "undefined behaviour" means *anything can happen*. It does not mean "oh probably the stack bla bla this and that etc." – M.M Sep 12 '19 at 14:13
  • Undefined behavior is like that - undefined and hard to predict. If I had to guess, I'd say compiler sees that you are returning an address of local variable and substitutes it with nullptr to allow the calling code to catch this error. – SergeyA Sep 12 '19 at 14:14
  • First of all pointer may not fit in `int`, second there is nothing to help with, your expectation is simply wrong. – Slava Sep 12 '19 at 14:22
  • @VladfromMoscow is returning address of local variable UB? Could you point out the standard rule for me? My UB sanitiser doesn't catch that. – eerorika Sep 12 '19 at 14:23
  • @eerorika I was wrong. I thought he outputs the string itself. The problem is that he uses casting to int. ;) – Vlad from Moscow Sep 12 '19 at 14:24
  • @VladfromMoscow That's a problem (makes the program ill-formed), but the result reproduces even when casting to `void*` (in GCC; not reproduced in Clang; I didn't test MSVC). – eerorika Sep 12 '19 at 14:26
  • Possible duplicate of [Returning an array using C](https://stackoverflow.com/questions/11656532/returning-an-array-using-c) or [how to return a local array in c++](https://stackoverflow.com/questions/7769998/how-to-return-local-array-in-c) – dandan78 Sep 12 '19 at 14:34
  • @eerorika Returning the address is not UB. Using the result is UB. – n. m. could be an AI Sep 12 '19 at 14:56
  • Your compiler is likely gcc, not msvc – n. m. could be an AI Sep 12 '19 at 15:07
  • @n.m. Using is implementation defined. Except indirecting through the pointer, which is UB. – eerorika Sep 12 '19 at 15:32
  • @eerorika Since implementations don't bother to document their implementation-defined behaviour, the distinction is not that meaningful. – n. m. could be an AI Sep 12 '19 at 16:02
  • @n.m. Distinction is still meaningful, since UB has way wider ramifications. Implementation may assume that there is no UB, and optimise accordingly. Implementation may not assume that there is no implementation defined behaviour. Implementation defined sans the definition is same as unspecified behaviour as far as I can tell. – eerorika Sep 12 '19 at 16:05
  • @eerorika An implementation may define any implementation-defined behaviour as filling all variables with random bit patterns. Since they don't bother to document it, I might just as well assume they actually do. Unspecified behaviour actually specifies a fixed set of possible behaviours ("it is unspecified whether X or Y") so it is not at all the same. – n. m. could be an AI Sep 12 '19 at 16:13
  • @n.m. Sure. That's a good idea. But there should be no need to plan for your escape path in case of nasal demons. – eerorika Sep 12 '19 at 16:14
  • @n.'pronouns'm. You're wrong; most implementations have a document that regulates certain things. Most of what is considered implementation defined goes into the ABI spec, and some might actually be defined by standards that apply to the OS/platform, etc. – Swift - Friday Pie Dec 16 '19 at 06:34
  • @Swift Can you find how and where "any use of an invalid pointer other than to perform indirection or deallocate", which is an implementation-defined behaviour, is actually defined by gcc or any other popular implementation? – n. m. could be an AI Dec 16 '19 at 09:45
  • @n. 'pronouns' Firstly, that's UB, not implementation-defined behavior. Secondly, the ABI specifications I mentioned are separate from the compiler and are architecture- or OS-wide (usually regardless of compiler, though in the past we had cases where different compilers created incompatible binaries). Thirdly, gcc is not a single compiler, it's a collection. There are several hundred implementations, for each platform and architecture. Care to specify which? – Swift - Friday Pie Dec 22 '19 at 09:53
  • @Swift-FridayPie I have quoted a sentence from the standard verbatim. You can see the full paragraph in the accepted answer. If you think it is wrong, you should author a defect report. Regarding gcc based implementations, select any one you like. – n. m. could be an AI Dec 22 '19 at 12:59

1 Answer

The C++ standard says (quoting the latest draft):

[basic.stc]

When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.

p contains an invalid pointer value, and printing the value of the pointer is included in "any other use", and thus the behaviour is implementation defined. In the observed case, the behaviour was to output 0.

Note for readers: the example code never indirects through the invalid pointer, so its behaviour is not undefined.
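
For contrast, here is a minimal sketch of two variants in which the returned value never becomes an invalid pointer value, because the storage it refers to is still alive after the call. The names get_static_string and get_string_copy are hypothetical and do not appear in the question; this illustrates the lifetime rule quoted above and is not a statement about what any particular compiler does with the original code.

```cpp
#include <string>

// Hypothetical variant of get_string(): the array has static storage
// duration, so it outlives the call and the returned pointer stays valid.
const char* get_static_string() {
    static const char text[] = "hello world\n";
    return text;
}

// Hypothetical variant returning by value: the caller receives its own
// std::string object, so no pointer to a dead local array is involved.
std::string get_string_copy() {
    return "hello world\n";
}
```

With either variant the caller can both print the result and read the characters, since no storage duration has ended by the time the value is used.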


P.S. Converting a pointer to int is not correct. int isn't guaranteed to be large enough to represent all pointer values, and on most 64-bit systems it isn't. The standard only specifies the behaviour for conversion to a sufficiently large integer type. For this case I would suggest converting to void* instead.
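
As a rough sketch of the void* suggestion: the helper below (pointer_to_text is a hypothetical name, not from the question) formats a pointer value through a stream without narrowing it to int. The exact text produced for a pointer is implementation-defined, so no particular output is claimed here.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a pointer value as text without
// converting it to int first.
std::string pointer_to_text(const char* p) {
    std::ostringstream out;
    // Casting to const void* selects the pointer overload of operator<<,
    // rather than the const char* overload that would print the characters.
    out << static_cast<const void*>(p);
    return out.str();
}
```

Writing `cout << static_cast<const void*>(p) << endl;` directly in main would have the same effect; the helper only makes the cast easy to reuse.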

eerorika
  • Thank you. I inspected the assembly code and saw that %eax is set to 0 before the function returns, and I think this is implementation defined. As for your point that int may not be big enough to hold a pointer: in this case it is OK, since I know I will compile to 32-bit code, but using void* and printing it as hex is better. – wangz Sep 12 '19 at 23:43
  • @wangz In case you want to convert pointer to integer, use `std::intptr_t`. – eerorika Sep 12 '19 at 23:53
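
A small sketch of the std::intptr_t suggestion, assuming a platform that provides this optional type (mainstream ones do). The helper names to_integer and to_pointer are hypothetical and only illustrate the round trip from a valid pointer to an integer and back.

```cpp
#include <cstdint>

// Hypothetical helper: converts an object pointer to an integer type
// that is wide enough to hold it.
std::intptr_t to_integer(const void* p) {
    return reinterpret_cast<std::intptr_t>(p);
}

// Hypothetical helper: converts the integer back to a pointer.
const void* to_pointer(std::intptr_t value) {
    return reinterpret_cast<const void*>(value);
}
```

For a pointer to a live object, converting to std::intptr_t and back yields a pointer that compares equal to the original, which a conversion through a 32-bit int cannot promise on a 64-bit target.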