Applying FFT on multiplication of two very large number without using recursion

Question

I have recently learned the FFT algorithm.

I applied it to the problem of fast multiplication of very large natural number following this pseudocode,

Let A be array of length m, w be primitive m-th root of unity.
Goal: produce DFT F(A): evaluation of A at 1, w, w^2,...,w^{m-1}.

FFT(A, m, w)
{
  if (m==1) return vector (a_0)
  else {
    A_even = (a_0, a_2, ..., a_{m-2})
    A_odd  = (a_1, a_3, ..., a_{m-1})
    F_even = FFT(A_even, m/2, w^2)    //w^2 is a primitive m/2-th root of unity
    F_odd = FFT(A_odd, m/2, w^2)
    F = new vector of length m
    x = 1
    for (j=0; j < m/2; ++j) {
      F[j] = F_even[j] + x*F_odd[j]
      F[j+m/2] = F_even[j] - x*F_odd[j]
      x = x * w
  }
  return F
}

It works great but I found a better code that does the same job without recursion and also runs much faster.

I have tried to figure out how it works line by line, however, I failed.

I would really appreciate if you can explain me in detail what is happening those first two for loop (not the math part)

Below is the new code

typedef complex<double> base;

void fft(vector<base> &a, bool invert)
{
    int n = a.size();

    for (int i = 1, j = 0; i < n; i++){
        int bit = n >> 1;
        for (; j >= bit; bit >>= 1) j -= bit;
        j += bit;

        if (i < j) swap(a[i], a[j]);
    }
    for (int len = 2; len <= n; len <<= 1){

        double ang = 2 * M_PI / len * (invert ? -1 : 1);
        base wlen(cos(ang), sin(ang));

        for (int i = 0; i < n; i += len){
            base w(1);
            for (int j = 0; j < len / 2; j++){
                base u = a[i + j], v = a[i + j + len / 2] * w;
                a[i + j] = u + v;
                a[i + j + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
    if (invert)
    {
        for (int i = 0; i < n; i++)
            a[i] /= n;
    }
}

if you want speed use NTT ... see [Modular arithmetics and NTT (finite field DFT) optimizations](https://stackoverflow.com/q/18577076/2521214) its also recursive but works on integers instead of on floats ... that is way faster and also more accurate as there is no rounding, conversions nor goniometrics involved. Also the speed difference between recursive and iteration approach might be just caused due to wrongly passing parameters in the recursive FFT ... in place is faster ... — Spektre, Jan 23 '19 at 08:05

score 0 · Accepted Answer · answered Jan 23 '19 at 04:27

Cooley–Tukey FFT implementation has beed described hundreds of times.

Wiki page part with non-recursive method.

The first loop is bit reversal part - code repacks source array, swapping element at i-th index with index of reversed bits of i (so for length=8 index 6=110b is swapped with index 3=011b, and index 5=101b remains at the same place).

This reordering allows to treat array in-place, making calculations on pairs, separated by 1,2,4,8... indexes (len/2 step here) with corresponding trigonometric coefficients.

P.S. Your answer contains onlinejudge tag, so such compact implementation is quite good for you purposes. But for real work it is worth to use some highly-optimized library like fftw etc

Applying FFT on multiplication of two very large number without using recursion

1 Answers1