I have a dataset (an array) and I need to find the periodicity in it. How should I proceed? Somebody said I can use FFT but I am not sure how will it give me the periodicity. Your help is appreciated!
5 Answers
For this task it's best to use the autocorrelation.
The FFT is the wrong tool to use for finding the periodicity.
Consider, for example, a case where your waveform is made by adding together two simple sine waves, one with a period of 2 seconds (0.5 Hz), and the other with 3 seconds (0.333 Hz). This waveform will have a periodicity of 6 seconds (i.e., 2*3), but the Fourier spectrum will only show two peaks at .5 Hz, and .333 Hz.

- 67,082
- 10
- 127
- 137
-
1How would you implement the task with autocorrelation? Could you give a sketch? – easytarget May 16 '14 at 21:41
-
3@MusséRedi: The idea is very simple: just take the autocorrelation and find the peak (that's not at 0). So then the only question is, how to do the autocorrelation and find the peak of the result. How you do this will depend on exactly what tools you are using; though you can do it all from scratch, most people would use some data analysis package. That is, my sketch here wouldn't be helpful, so instead, I'd suggest, pick an approach you like, try it, and if something doesn't work, ask a question with some specifics. – tom10 May 16 '14 at 22:12
-
I tried the method of the first answer to http://stackoverflow.com/questions/643699/how-can-i-use-numpy-correlate-to-do-autocorrelation to autocorrelate my data. This gives a descending sequence of numbers. And the peak value does not give any information on the periodicity. When testing on a sine function I get a descending oscillation. How should I find the periodicity? – easytarget May 16 '14 at 22:57
-
1@MusséRedi: I should have been more clear.. I meant create a new question on StackOverflow. Then you can show your code, figures, etc, with all the important details, and get a full answer as well. (For the sine wave though, the period will be the time of the first peak. It probably decays because of the details of your implementation, and isn't particularly meaningful, but that as it shifts the waveform more and more, there's less overlap. But really, just post a new question on SO.) – tom10 May 16 '14 at 23:22
-
I posted my question http://stackoverflow.com/questions/23706524/how-to-find-periodicity-with-autocorrelation-using-python could you have a look? – easytarget May 17 '14 at 00:57
Periodicity is not well defined term. For example, such data:
1, 10, 1, 10, 1, 11, 1, 10, 1, 10, 1, 11, 1, 10, 1, 10, 1, 11
you may treat as one with not exact but strong periodicity of 2, and as exact periodicity of 6.
For exact periodicity you may simply try to find given data as substring of data repeated twice.
For non exact periodicity of real, noisy signal time domain and frequency domain methods may be used.
Time domain one is self correlation. It is like a substring search above: searched for a shift value on which data have maximum self similarity.
For simple signals counting threshold transitions may be enough.
Frequecy domain methods include one using FFT/FHT: search for a maximum in fequency spectre which gives 1/T of periodicity.
Another method is using Cepstrum.

- 3,798
- 17
- 23
This new paper hasn't had a great deal of attention, spectral clustering
Amariei, C., Tomita, M., & Murray, D. B. (2014). Quantifying periodicity in omics data. Frontiers in cell and developmental biology.
Implemented in an R package available at oscillat.iab.keio.ac.jp. I'm not affiliated with the authors, but put the code up at GitHub here for easier access (main script here).
Uses a DFT and groups rows into major spectral powers, nice to use in my experience. Obviously for genomics it's designed to be robust (noted in the code it's computationally heavy), so may depend on the application.

- 5,226
- 5
- 36
- 66
I found a paper that combines an FFT-based periodogram with autocorrelation to provide more accurate information on the periodicity of a signal. I think that this method could be worth looking into:

- 1,667
- 19
- 30
-
The link seems dead, but I found something similar, using SVD to find the periodicity of the data: http://pre.aps.org/abstract/PRE/v59/i4/p4013_1 – Magsol Jun 11 '13 at 19:49
-
I just checked the link, and it seems to open up the PDF file for me. There may be other links you can try if you search for the title of the link on Google Scholar. – bnsmith Jul 19 '13 at 18:22
You could use FFT because it will convert your data set from a value-space to a frequency-space.
This means that you will end up having a set of frequencies that composed will produce the initial input that you want to analyze. Then you can easily recognize which are the major contribuitions that are generated by specific frequencies and so you will understand how many periodicities there are and which are the most influential ones..
take a look here: http://local.wasp.uwa.edu.au/~pbourke/miscellaneous/dft/

- 131,802
- 30
- 241
- 343