1

Consider such a C program

int fn(int n){
    int sum = 0;
    for(int i = 1; i <= n; i++)
        sum += i*i;
    return sum;
}

int main(){
    int n;
    scanf("%d",&n);
    printf("%d\n",fn(n));
    return 0;
}

In this program, I use a loop to calculate the faulhaber's formula.

In math, we know that for any k-power sum, we have a formula to calculate it in O(1) time.

However, compilers can find this is a faulhaber's formula and calculate it very efficently. The above program can be translated into IR with O2 like this

define dso_local i32 @fn(i32 %0) local_unnamed_addr #0 {
  %2 = icmp slt i32 %0, 1
  br i1 %2, label %22, label %3

3:                                                ; preds = %1
  %4 = add nsw i32 %0, -1
  %5 = zext i32 %4 to i33
  %6 = add nsw i32 %0, -2
  %7 = zext i32 %6 to i33
  %8 = mul i33 %5, %7
  %9 = add nsw i32 %0, -3
  %10 = zext i32 %9 to i33
  %11 = mul i33 %8, %10
  %12 = lshr i33 %11, 1
  %13 = trunc i33 %12 to i32
  %14 = mul i32 %13, 1431655766
  %15 = lshr i33 %8, 1
  %16 = trunc i33 %15 to i32
  %17 = mul i32 %16, 5
  %18 = add i32 %14, %17
  %19 = shl i32 %0, 2
  %20 = add i32 %18, %19
  %21 = add i32 %20, -3
  br label %22

22:                                               ; preds = %3, %1
  %23 = phi i32 [ 0, %1 ], [ %21, %3 ]
  ret i32 %23
}

The translated program costs nearly no time to calculate the sum, even though I use i*i*i or something like this, compiler can also do that.

I am interesting to the optimizations, but I don't know how Clang/gcc implement the above problem.

hnyls2002
  • 103
  • 8

0 Answers0