
I'm looking for a single-precision rfft to speed up computation; scipy.fftpack.rfft supports float32, but it returns a real array that packs the real and imaginary components along the same axis, requiring a post-processing step. I implemented the function below to obtain the standard complex array, but NumPy's rfft ends up being faster for 2D inputs (though slower for 1D). Memory is also a concern: float64 runs out of memory on my inputs.

Does scipy or another library have a single-precision rfft implementation that returns the standard complex array? (Or, failing that, can the code below be made faster?)
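
For context, scipy.fftpack.rfft packs an even length-n transform as [y(0), Re(y(1)), Im(y(1)), ..., Re(y(n/2))] along the transformed axis: the DC and Nyquist bins are purely real, so their imaginary parts are omitted. A minimal illustration of the shapes involved:

import numpy as np
from scipy.fftpack import rfft as srfft

t = np.arange(8, dtype='float32')
print(srfft(t).shape)        # (8,)  -- n packed real values
print(np.fft.rfft(t).shape)  # (5,)  -- n//2 + 1 complex values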


import numpy as np
from numpy.fft import rfft
from scipy.fftpack import rfft as srfft

def rfft_sp(x):  # assumes len(x) is even
    # unpack scipy.fftpack's real-packed output into a standard complex array
    xf = np.zeros((len(x)//2 + 1, x.shape[1]), dtype='complex64')
    h = srfft(x, axis=0)     # packed: [y0, Re y1, Im y1, ..., Re y_{n/2}]
    xf[0] = h[0]             # DC bin (purely real)
    xf[1:] = h[1::2]         # real parts of bins 1..n/2
    xf[:1].imag = 0          # already zero from np.zeros; kept for clarity
    xf[-1:].imag = 0         # Nyquist bin is purely real
    xf[1:-1].imag = h[2::2]  # imaginary parts of bins 1..n/2 - 1
    return xf
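
As a sanity check, the unpacked result should match numpy.fft.rfft to float32 precision on a small even-length input:

xt = np.random.randn(6, 3).astype('float32')
assert np.allclose(rfft_sp(xt), np.fft.rfft(xt, axis=0), atol=1e-5)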

x = np.random.randn(500, 100000).astype('float32')

%timeit rfft_sp(x)
%timeit rfft(x, axis=0)
>>> 565 ms ± 15.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> 517 ms ± 22.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
– OverLordGoldDragon
  • What kind of performance (increase) are you expecting? – anon01 Feb 22 '21 at 08:32
  • 10%, 100%, 1000%? – anon01 Feb 22 '21 at 08:50
  • The numbers look quite comparable for `%timeit rfft_sp(x); %timeit rfft(x, axis=0)` – anon01 Feb 22 '21 at 08:52
  • Why do you insist on single precision? – the.real.gruycho Feb 22 '21 at 10:23
  • Changing precision is not the first thing to address if you want better performance. I'd try: 1) a performance-optimized package (see https://stackoverflow.com/questions/6365623/improving-fft-performance-in-python/30109600), 2) the MKL vs. OpenBLAS backend, 3) JIT or hardware acceleration (see https://stackoverflow.com/questions/55014239/how-to-do-100000-times-2d-fft-in-a-faster-way-using-python), 4) optimization within the package itself. All of these may get you orders of magnitude better performance than changing the precision. – anon01 Feb 22 '21 at 20:54
  • @anon01 It won't solve memory problems; I could've been clearer on the memory requirement, edited. – OverLordGoldDragon Feb 23 '21 at 05:04
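
Following up on the pyFFTW route mentioned in the comments, a minimal sketch (assuming pyfftw is installed; FFTW plans natively in single precision, so a float32 input should come back as complex64 without a repacking step):

import numpy as np
import pyfftw.interfaces.numpy_fft as pfft
from pyfftw.interfaces import cache

cache.enable()  # cache FFTW plans so repeated calls skip planning

x = np.random.randn(500, 100000).astype('float32')
xf = pfft.rfft(x, axis=0, threads=4)  # numpy.fft-style drop-in interface
print(xf.dtype)  # expected: complex64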

2 Answers


On the machine I tested on, using scipy.fft.rfft and casting to complex64 is faster than your implementation:

import numpy as np
from numpy.fft import rfft
from scipy.fft import rfft as srfft
from scipy.fftpack import rfft as srfft2

def rfft_sp(x):  # OP's unpacking function, reproduced for comparison; assumes len(x) is even
    xf = np.zeros((len(x)//2 + 1, x.shape[1]), dtype='complex64')
    h = srfft2(x, axis=0)            
    xf[0] = h[0]
    xf[1:] = h[1::2]
    xf[:1].imag = 0
    xf[-1:].imag = 0
    xf[1:-1].imag = h[2::2]
    return xf

def rfft_cast(x):
    # scipy.fft.rfft already returns complex64 for float32 input,
    # so this cast is effectively a copy
    h = srfft(x, axis=0)
    return h.astype('complex64')


x = np.random.randn(500, 100000).astype('float32')

%timeit rfft(x, axis=0)
%timeit rfft_sp(x)
%timeit rfft_cast(x)

produces:

1.81 s ± 144 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.89 s ± 7.58 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.24 s ± 9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
– the.real.gruycho

scipy.fft works with single precision: a float32 input produces a standard complex64 output directly, with no packed layout and no upcast to double precision.
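
A minimal check of that behavior (the `workers` argument, available in scipy >= 1.4, adds multithreading):

import numpy as np
from scipy.fft import rfft

x = np.random.randn(500, 100000).astype('float32')
xf = rfft(x, axis=0, workers=-1)  # workers=-1 uses all available threads
print(xf.dtype)  # complex64 -- no upcast to double precision
print(xf.shape)  # (251, 100000) -- standard one-sided layout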

– OverLordGoldDragon