I know that, generally speaking, FFT-and-multiply is usually faster than a direct convolution when the arrays are relatively large. However, I'm convolving a very long signal (say 10 million points) with a very short response (say 1 thousand points). In this case `fftconvolve` doesn't seem to make much sense, since it forces an FFT of the second array up to the same size as the first array. Is it faster to just do a direct convolution in this case?

- Is there a reason you can't just time both approaches, e.g. with `timeit`? – Danica Feb 22 '13 at 06:57
- I didn't know this function. I'll try it. I'd also like to know the underlying theory, though. – LWZ Feb 22 '13 at 07:22
- Note: As of v0.19, `convolve` automatically chooses `fftconvolve` or the direct method based on an estimate of which will be faster. – Dr Xorile Feb 28 '20 at 00:08
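The v0.19 behavior mentioned in the comment above can be checked directly: `scipy.signal.choose_conv_method` reports which method the size-based estimate would pick, and `convolve` with `method='auto'` applies that choice. A minimal sketch (the array sizes here are illustrative, not the 10-million-point case):

```python
import numpy as np
from scipy.signal import convolve, choose_conv_method

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)  # long signal
h = rng.standard_normal(1_000)    # short response

# Ask scipy which method its internal cost estimate would pick
method = choose_conv_method(x, h, mode='full')
print(method)  # 'direct' or 'fft', depending on the sizes

# Or simply let convolve decide (method='auto' is the default since v0.19)
y = convolve(x, h, mode='full', method='auto')
```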
3 Answers
Take a look at the comparison I did here:
http://scipy-cookbook.readthedocs.io/items/ApplyFIRFilter.html
Your case might be near the transition between using a plain convolution and using the FFT-based convolution, so your best bet (as suggested by @Dougal in a comment) is to time it yourself.
(Note that I didn't do overlap-add or overlap-save in that comparison.)
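Timing it yourself, as suggested above, is straightforward with `timeit`; a minimal sketch (sizes are illustrative and smaller than the question's, so the direct method finishes quickly):

```python
import timeit

import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
x = rng.standard_normal(2**16)  # long signal
h = rng.standard_normal(2**8)   # short response

# Time the direct and FFT-based convolutions on the same inputs
t_direct = timeit.timeit(lambda: np.convolve(x, h), number=3)
t_fft = timeit.timeit(lambda: fftconvolve(x, h), number=3)
print(f"direct: {t_direct:.4f} s, fft: {t_fft:.4f} s")
```

Which one wins depends strongly on the two lengths, so it is worth rerunning with your actual sizes.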

- The link doesn't seem to point to what you meant (though I could find it on the page anyway). – luca Jul 12 '16 at 12:34
- @luca Thanks. That page was originally on the old scipy wiki, which is gone now. I've updated the link. – Warren Weckesser Jul 12 '16 at 14:25
Thank you for your help. I did the test myself: I convolved two arrays, of size 2^20 and 2^4, and this is the result:
numpy.convolve: 110 ms
scipy.signal.convolve: 1.0 s
scipy.signal.fftconvolve: 2.5 s
So we have a winner: numpy.convolve is much faster than the others. I still don't know why, though.
Then I tried two longer arrays, of size 2^22 and 2^10. The result is:
numpy.convolve: 6.7 s
scipy.signal.convolve: 221 s
scipy.signal.fftconvolve: MemoryError
The difference just gets bigger.

- Thanks for the feedback. Maybe [related](https://github.com/scipy/scipy/issues/8154). – mins Apr 11 '21 at 16:08
FFT-based fast convolution via the overlap-add or overlap-save algorithms can be done in limited memory by using an FFT that is only a small multiple (such as 2x) of the impulse-response length. It breaks the one long FFT into properly overlapped, shorter, zero-padded FFTs.
Even with the overlap overhead, O(N log N) beats O(M·N) in efficiency for large enough N and M.
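The overlap-add scheme described above can be sketched in a few lines of numpy; this is a minimal illustration, not a tuned implementation, and the block-size heuristic is an assumption (scipy now ships a production version as `scipy.signal.oaconvolve`):

```python
import numpy as np

def overlap_add_convolve(x, h, block_size=None):
    """Linear convolution of a long signal x with a short kernel h
    using overlap-add. Output length is len(x) + len(h) - 1."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    M = len(h)
    if block_size is None:
        block_size = 4 * M  # heuristic: a small multiple of the kernel length
    nfft = 1 << (block_size + M - 1).bit_length()  # next power of two
    L = nfft - M + 1              # samples of x consumed per block
    H = np.fft.rfft(h, nfft)      # kernel spectrum, computed once
    y = np.zeros(len(x) + M - 1)
    for start in range(0, len(x), L):
        seg = x[start:start + L]  # zero-padded to nfft by rfft
        conv = np.fft.irfft(np.fft.rfft(seg, nfft) * H, nfft)
        end = min(start + nfft, len(y))
        y[start:end] += conv[:end - start]  # overlap the block tails
    return y
```

Each block's FFT is only about 2x the kernel length here, so memory use stays bounded regardless of how long the signal is.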

- Thanks for your answer! Do you mean that even with the `fftconvolve` function, it will automatically break the long FFT down into short FFTs and I do not need to worry about it? – LWZ Feb 22 '13 at 08:56
- @LWZ: scipy's fftconvolve does not do that, no. hotpaw, do you have a reference/implementation for that method? – endolith Aug 13 '13 at 18:06