I know that the FFT cannot work with NaN values. We either interpolate to get rid of the NaNs or replace them with zeros. Yet I am wondering why the FFT cannot work with NaN. The FFT basically sums the time series after multiplying it by different harmonics, and most libraries have functions that perform a summation while skipping NaN values. Thanks
-
The opposite. Most arithmetic libraries propagate NaNs rather than eliminate them. An orthogonal transform will propagate a single one to the entire result. – hotpaw2 Jan 17 '18 at 16:16
-
What hotpaw2 said. But also: you describe a naive implementation of the DFT, which is not what the FFT is. The FFT is an efficient way to compute the DFT, and does quite a different thing from what you describe. – Cris Luengo Jan 17 '18 at 22:15
1 Answer
Recent versions of languages and compilers (C, C++) and subsequently compiled libraries (such as FFTW) that perform floating-point computations rely on the IEC 60559 or IEEE 754 standards for floating-point arithmetic, using types like float or double, unless flags like GCC's -ffast-math are activated. These standards describe the expected behavior of float and double: most SIMD instructions are expected to comply with these standards (see this course and Checking if a double (or float) is NaN in C++). Therefore, libraries (BLAS, LAPACK) using SIMD instructions likely behave in the same way.
A sentence about NaN in "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg:
Although the IEEE standard defines the basic floating-point operations to return a NaN if any operand is a NaN, this might not always be the best definition for compound operation...
A few words about the operations on NaN in IEEE 754, in 1985:
Every operation involving a signaling NaN or invalid operation (7.1) shall, if no trap occurs and if a floating-point result is to be delivered, deliver a quiet NaN as its result.
Hence, for a Discrete Fourier Transform, a single NaN in the input likely contaminates the entire output array. There might be exceptions to this behavior due to rules like this one proposed in a draft:
A complex or imaginary value with at least one infinite part is regarded as an infinity (even if its other part is a NaN).
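To make the contamination concrete, here is a minimal sketch using a direct O(N²) DFT written in plain Python. It is not an FFT and not how FFTW works internally; the function name `naive_dft` and the sample values are made up for illustration. Since every output bin is a sum over all input samples, a single NaN reaches all of them:

```python
import cmath
import math


def naive_dft(x):
    """Direct O(N^2) DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N).

    Hypothetical helper for illustration only; real libraries use an FFT.
    """
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


clean = [1.0, 2.0, 3.0, 4.0]
dirty = [1.0, 2.0, math.nan, 4.0]  # a single missing sample

clean_spectrum = naive_dft(clean)
dirty_spectrum = naive_dft(dirty)

# cmath.isnan is true if the real or the imaginary part is NaN.
contaminated_bins = sum(cmath.isnan(v) for v in dirty_spectrum)
print(contaminated_bins, "of", len(dirty_spectrum), "bins contain NaN")
```

An FFT reorganizes these sums into butterflies, but each output still depends on every input, so the same propagation is to be expected.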
There are reasons for not overwriting NaN with a default value:
- NaN signals an error, or a special case which must be handled with care by the developer of the application. Getting a NaN often signals that something went wrong due to wrong initialization or memory management. Investigating, understanding, preventing and handling these errors often makes the code safer and more reliable than overwriting the input or the output with a default value.
- Skipping NaN values can be sensible, but it requires some sort of rationale. More specifically, the average of a discretized signal (i.e. frequency 0 of the DFT) featuring NaN values can be computed by ignoring NaN values, but the number of non-NaN values must be counted along the way to recover a non-biased estimate of the average. Depending on your signal, using a linear interpolation can make sense. But if the signal is periodic and if the period is known, other solutions such as trigonometric interpolation could make more sense depending on the physics of your problem. Hence, different users of the library could implement different ways to handle NaN...
- The names of the libraries sometimes speak for themselves: "Fastest Fourier Transform in the West". Indeed, the wall clock time remains an important factor for efficient applications. Handling NaN values by doing something different from the SIMD instructions would likely require more tests and might slightly slow down the computation, even if there is no NaN in the array, and even if the SIMD instructions are called in the end.
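The second point above can be sketched as follows, in plain Python. The helper names `nan_mean` and `fill_nan_linear`, the sample signal, and the choice to fill leading or trailing NaNs with the nearest valid value are all assumptions made for this example, not part of any library:

```python
import math


def nan_mean(x):
    """Mean over valid samples only; returns (mean, number_of_valid_samples)."""
    valid = [v for v in x if not math.isnan(v)]
    if not valid:
        return math.nan, 0
    return sum(valid) / len(valid), len(valid)


def fill_nan_linear(x):
    """Fill NaN runs by linear interpolation between valid neighbours.

    Assumption: NaNs at either end are replaced by the nearest valid value.
    """
    valid_idx = [i for i, v in enumerate(x) if not math.isnan(v)]
    if not valid_idx:
        raise ValueError("no valid samples to interpolate from")
    filled = list(x)
    for i in range(valid_idx[0]):                    # leading NaNs
        filled[i] = x[valid_idx[0]]
    for i in range(valid_idx[-1] + 1, len(x)):       # trailing NaNs
        filled[i] = x[valid_idx[-1]]
    for left, right in zip(valid_idx, valid_idx[1:]):
        for i in range(left + 1, right):             # interior gaps
            t = (i - left) / (right - left)
            filled[i] = x[left] + t * (x[right] - x[left])
    return filled


signal = [2.0, math.nan, 4.0, 6.0, math.nan, math.nan, 12.0]

mean_valid, n_valid = nan_mean(signal)
# Replacing NaN by zero and dividing by the full length biases the mean.
mean_zero_filled = sum(0.0 if math.isnan(v) else v for v in signal) / len(signal)
filled_signal = fill_nan_linear(signal)

print(mean_valid, n_valid, mean_zero_filled)
print(filled_signal)
```

Dividing by the number of valid samples rather than the full length is what keeps the average unbiased; zero filling pulls it toward zero. Whether linear interpolation is appropriate still depends on the signal.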
Additional sources on NaN:

-
So according to the IEEE standard, it is more careful to choose not to skip NaN and to propagate them through the calculations. I found that in Python there are two versions of some of the basic mathematical functions, one to deal with ordinary numbers and the other to deal with arrays containing NaN values. For example, there is np.sum() and np.nansum(). I hope that in the future most mathematical functions are written in two versions, one to propagate NaN and the other to skip it. The user should be responsible for his/her choice. Thanks – Kernel Jan 18 '18 at 03:54
-
My data contain a lot of NaN values, so I think that I need to write my own FFT function that deals with those NaNs. Thanks again for all this information. – Kernel Jan 18 '18 at 03:55