What is the default calling convention that the clang compiler uses? I noticed that when I return a pointer to a local variable, the reference is not lost:

#include <stdio.h>

char *retx(void) {
    char buf[4] = "buf";
    return buf;
}

int main(void) {
    char *p1 = retx();
    puts(p1);
    return 0;
}
Peter Cordes
Yuri Albuquerque
  • 6
    Undefined behaviour sometimes happens to work. The calling convention depends on the target. (e.g. x86-64 System V, Windows x64, i386 System V for 32-bit code on Linux, AArch64's standard convention, PowerPC's standard convention, etc. etc.) – Peter Cordes Jun 22 '19 at 20:04
  • I have tested with clang several times and every time I displayed the string, strange that behavior, anyway grateful. – Yuri Albuquerque Jun 22 '19 at 20:09
  • 1
    it also happens on other compilers. Doesn't mean you can use it.. clang probably warns you about this anyway – Jean-François Fabre Jun 22 '19 at 20:10
  • 1
    what target? what compiler options? – old_timer Jun 22 '19 at 20:11
  • 3
    @YuriAlbuquerque: the "decision" / "luck" of whether it "works" or not is made at compile time, not runtime. Compiling / running the same source multiple times with the same compiler tells you nothing. – Peter Cordes Jun 22 '19 at 20:11
  • 2
    To test it, put another subroutine call (with some local vars) between `char *p1 = retx();` and `puts(p1);` and see if the local variable space will be overwritten, because the local var was saved on the stack. Easy check. – zx485 Jun 22 '19 at 20:17
  • @zx485 I did this, put several variables to be allocated on the stack, even so the reference was not lost. – Yuri Albuquerque Jun 22 '19 at 20:20
  • Anyway, now that I know the behavior is undefined, I see it is a matter of luck, thank you all for your attention. – Yuri Albuquerque Jun 22 '19 at 20:25

2 Answers

4

This is Undefined Behaviour. It might happen to work, or it might not, depending on what the compiler happened to choose when compiling for some specific target. It's literally undefined, not "guaranteed to break"; that's the entire point. Compilers can completely ignore the possibility of UB when generating code, rather than spending extra instructions to make sure UB breaks. (If you want that, compile with -fsanitize=undefined, although UBSan only checks for some kinds of UB; for this particular bug, AddressSanitizer (-fsanitize=address) is the tool that can detect use of stack memory after return.)

Understanding exactly what happened requires looking at the asm, not just trying to run it.

warning: address of stack memory associated with local variable 'buf' returned [-Wreturn-stack-address]
      return buf;
             ^~~

Clang prints this warning even without -Wall enabled, precisely because using the returned pointer is not legal C, regardless of what asm calling convention you're targeting.


Clang uses the C calling convention of the target it's compiling for (see Footnote 1). Different OSes on the same ISA can have different conventions, although outside of x86 most ISAs only have one major calling convention. x86 has been around so long that its original calling conventions (stack args with no register args) became inefficient, so various 32-bit conventions evolved. And Microsoft chose a different 64-bit convention from everyone else. So there's x86-64 System V, Windows x64, i386 System V for 32-bit x86, AArch64's standard convention, PowerPC's standard convention, and so on.


From the comments: "I have tested with clang several times and every time I displayed the string"

The "decision" / "luck" of whether it "works" or not is made at compile time, not runtime. Compiling / running the same source multiple times with the same compiler tells you nothing.

Look at the generated asm to find out where char buf[4] ends up.


My guess: maybe you're on Windows x64. Happening to work is more plausible there than with most calling conventions, where you'd expect buf to end up below the stack pointer in main, so the call to puts, and puts itself, would be very likely to overwrite it.

If you're on Windows x64 compiling with optimization disabled, retx()'s local char buf[4] might be placed in the shadow space it owns. The caller then calls puts() with the same stack alignment, so retx's shadow space becomes puts's shadow space.

And if puts happens not to write its shadow space, then the data in memory that retx stored is still there. e.g. maybe puts is a wrapper function that in turn calls another function, without initializing a bunch of locals for itself first. But not a tailcall, so it allocates new shadow space.

(But that's not what clang 8.0 does in practice with optimization disabled. Using __attribute__((ms_abi)) to get Windows x64 code-gen from Linux clang, it looks like buf will be placed below RSP and get stepped on there: https://godbolt.org/z/2VszYg)

But happening to work is also possible in stack-args conventions where padding is left to align the stack pointer by 16 before a call (e.g. modern i386 System V on Linux for 32-bit x86). puts() has an arg but retx() doesn't, so maybe buf ended up in memory that the caller "allocates" as padding before pushing a pointer arg for puts.

Of course that would be unsafe because the data would be temporarily below the stack pointer, in a calling convention with no red-zone. (Only a few ABIs / calling conventions have red zones: memory below the stack pointer that's guaranteed not to be clobbered asynchronously by signal handlers, exception handlers, or debuggers calling functions in the target process.)


I wondered if enabling optimization would make it inline and happen to work. But no, I tested that for Windows x64: https://godbolt.org/z/k3xGe4. clang and MSVC both optimize away any stores of "buf\0" into memory. Instead they just pass puts a pointer to some uninitialized stack memory.

Code that breaks only when optimization is enabled almost always contains UB.


Footnote 1: Except for x86-64 System V, where clang relies on an extra undocumented "feature" of the calling convention: narrow integer types passed as function args in registers are assumed to be sign- or zero-extended to 32 bits, according to their type. gcc and clang both do this when calling, but ICC does not, so calling clang-compiled functions from ICC-compiled code can cause breakage. See "Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?"

Peter Cordes
  • By the way I'm using linux and the target is for x86-64 architecture – Yuri Albuquerque Jun 22 '19 at 20:40
  • 1
    @YuriAlbuquerque: Just random luck that `call puts` doesn't happen to overwrite `buf` on the stack 12 to 9 bytes below its return address. Maybe a wrapper function does one push and then reserves more padding. https://godbolt.org/z/94l245 – Peter Cordes Jun 22 '19 at 20:50
  • 2
    `-pedantic`: returning a local address is legal in C, as the pointer is still legal when the return statement assigns the return value. The UB comes from trying to use the return value in any way. It is possible to write a correct program that returns a local pointer, not that it would be useful in any way. – Antti Haapala -- Слава Україні Jun 23 '19 at 02:50
1

Annex L of the C11 Draft N1570 recognizes some situations (i.e. "non-critical Undefined Behavior") where the Standard imposes no particular behavioral requirements but implementations that define __STDC_ANALYZABLE__ with a non-zero value should offer some guarantees, and other situations ("critical Undefined Behavior") where it would be common for implementations not to guarantee anything. Attempts to access objects past their lifetime would fall into the latter category.

While nothing would prevent an implementation from offering behavioral guarantees beyond what the Standard requires, even for Critical Undefined Behavior, and some tasks would require that implementations do so (e.g. many embedded systems tasks require that programs dereference pointers to addresses whose targets do not satisfy the definition for "objects"), accessing automatic variables past their lifetime is a behavior about which few implementations would offer any guarantees beyond perhaps guaranteeing that reading an arbitrary RAM address will have no side-effects beyond yielding an Unspecified value.

Even implementations that guaranteed how automatic objects will be laid out on the stack seldom guaranteed that the storage that held them wouldn't be overwritten between the time a function returned and the next action by the caller. Unless interrupts were disabled, interrupt handling could overwrite any storage that had been used by automatic objects that were no longer in a live stack frame.

While many implementations can be configured to offer useful guarantees about the behavior of actions for which the Standard imposes no requirements, I can't think of any implementations that can be configured to offer sufficient guarantees to make the above code usable.

supercat