On the use and abuse of alloca

I posted some benchmarks at the bottom of a previous question. clang's alloca clearly performs better at -O3. What gives? Is clang cutting any corners? Also, since clang is a modern compiler, does its alloca implementation have any safety checks or other interesting properties?

Hassan Syed
    This is just a wild guess, but: if clang maps `alloca` to the LLVM `alloca` instruction, the `mem2reg` pass (widely used, as it takes care of building well-formed SSA) can promote it to LLVM-level registers, which get all the optimizations LLVM provides for normal variables. – delnan Apr 28 '11 at 14:54

1 Answer

delnan's guess is correct. But he didn't take into account that the test is very bad: clang can optimize the actual alloca operation out of alloca_test entirely.

alloca_test contains only the LLVM IR alloca instruction, not a call to an alloca() function:

%11 = call i32 @_Z18random_string_sizev()
%12 = alloca i8, i32 %11

Compare with malloc_test:

%11 = call i32 @_Z18random_string_sizev()
%12 = call i8* @malloc(i32 %11)

Even at -O1, no alloca remains in alloca_test:

define void @_Z11alloca_testv() nounwind {
; <label>:0
  %1 = tail call i32 @_Z18random_vector_sizev()
  %2 = icmp sgt i32 %1, 0
  br i1 %2, label %.lr.ph, label %._crit_edge

.lr.ph:                                           ; preds = %.lr.ph, %0
  %i.01 = phi i32 [ %4, %.lr.ph ], [ 0, %0 ]
  %3 = tail call i32 @_Z18random_string_sizev()
  %4 = add nsw i32 %i.01, 1
  %exitcond = icmp eq i32 %4, %1
  br i1 %exitcond, label %._crit_edge, label %.lr.ph

._crit_edge:                                      ; preds = %.lr.ph, %0
  ret void
}

In malloc_test, by contrast, the malloc call is still there:

%6 = tail call i32 @_Z18random_string_sizev()
%7 = tail call i8* @malloc(i32 %6)

I should also add that g++ -O3 (tested with 4.1 and 4.5.2) does not optimize away the stack-size change (the main effect of alloca).
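
A test that actually uses the buffer should keep the allocation alive in both variants, so the comparison measures allocation rather than dead code. Below is a minimal sketch, assuming a glibc-style `<alloca.h>` is available. The names `touch`, `alloca_once` and `malloc_once` are hypothetical and are not taken from the original benchmark.

```cpp
#include <alloca.h>   // non-standard header; assumed available (glibc, macOS)
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: writes to the buffer and reads two bytes back through
// a volatile pointer, so the buffer has an observable use.
static unsigned touch(char *buf, std::size_t n) {
    std::memset(buf, 'x', n);
    volatile char *p = buf;  // volatile accesses must actually be performed
    return static_cast<unsigned>(p[0]) + static_cast<unsigned>(p[n - 1]);
}

// Stack allocation; the memory is released when this function returns.
unsigned alloca_once(std::size_t n) {
    assert(n > 0);
    char *buf = static_cast<char *>(alloca(n));
    return touch(buf, n);
}

// Heap allocation with the same observable use of the buffer.
unsigned malloc_once(std::size_t n) {
    assert(n > 0);
    char *buf = static_cast<char *>(std::malloc(n));
    if (!buf) return 0;  // allocation failure: nothing to touch
    unsigned sum = touch(buf, n);
    std::free(buf);
    return sum;
}
```

Calling both functions in a loop with the same random sizes and accumulating the returned values would give the optimizer much less room to delete the work. Whether the alloca then survives in the generated IR still has to be checked per compiler and optimization level.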

osgx
  • Well, semantically the effect is the same, no? Whether the alloca semantics are inlined or an LLVM IL call is used, the memory is still created on a stack frame? – Hassan Syed Apr 28 '11 at 16:42
  • I don't know whether LLVM allows alloca to be "called" (how could a called function change the stack size of its caller?), but the alloca IR operation is part of the basic IR instruction set. http://llvm.org/docs/LangRef.html#i_alloca – osgx Apr 28 '11 at 16:49