How to understand the result of Discrete Fourier Transform under period finding?

Question

I am learning how to use Discrete Fourier Transform(DFT) to find the period about a^x mod(N), in which x is a positive integer, a is any prime number, and N is the product of two prime factors p and q.

For example, the period of 2^x mod(15) is 4,

>>> for x in range(8):
...     print(2**x % 15)
...

Output: 1 2 4 8 1 2 4 8
                ^-- the next period

and the result of DFT is as following,

(cited from O'Reilly Programming Quantum Computers chapter 12)

There are 4 spikes with 4-unit spacing, and I think the latter 4 means that period is 4.

But, when N is 35 and period is 12

>>> for x in range(16):
...     print(2**x % 35)
...

Output: 1 2 4 8 16 32 29 23 11 22 9 18 1 2 4 8
                                       ^-- the next period

In this case, there are 8 spikes greater than 100, whose locations are 0, 5, 6, 11, 32, 53, 58, 59, respectively.

Does the location sequence imply the magic number 12? And how to understand "12 evenly spaced spikes" from the righthand graph?

(cited from O'Reilly Programming Quantum Computers chapter 12)

Spektre · Answer 1 · 2021-05-19T07:21:45.620

see How to compute Discrete Fourier Transform? and all the sublinks especially How do I obtain the frequencies of each value in an FFT?.

As you can see i-th element of DFT result (counting from 0 to n-1 including) represent Niquist frequency

f(i) = i * fsampling / n

And DFT result uses only those sinusoidal frequencies. So if your signal does have different one (even slightly different frequency or shape) aliasing occurs.

Aliased sinusoid creates 2 frequencies in DFT output one higher and one lower frequency.

Any sharp edge is translated to many frequencies (usually continuous spectrum like your last example)

The f(0) is no frequency and represents DC offset.

On top of all this if the input of your DFT is real domain then the DFT result is symmetric meaning you can use only first half of the result as the second is just mirror image (not including the f(0)) which makes sense as you can not represent bigger than fsampling/2 frequency in real domain data.

Conclusion:

You can not obtain frequency of signal used by DFT as there is infinite number of ways how such signal can be computed. DFT is reconstructing the signal using sinwaves and your signal is definately no sinwave so the results will not match what you think.

Matching niquist frequencies to yours is done by correctly chosing the n for DFT however without knowing the frequency ahead you can not do this ...

It may be possible to compute the singular sinwave frequency from its 2 aliases however your signal is no sinwave so that is not applicable for your case anyway.

I would use different approaches to determine frequency of integer numeric signal:

compute histogram of signal

so count how many of each number there is
test possible frequencies

You can brute force all possible periods of signal and test if consequent periods are the same however for big data is this not optimal...

We can use histogram to speed this up. So if you look at the counts cnt(ix) from histogram for periodic signal of frequency f and period T in data of size n then the period of signal should be a common divider of all the counts
```
T = n/f
k*f = GCD(all non zero cnt[i])
```
where k divides the GCD result. However in case n is not exact multiple of T or the signal has noise or slight deviations in it this will not work. However we can at least estimate the GCD and test all frequencies around which will be still faster than brute force.

So for each count (not accounting for noise) it should comply this:
```
cnt(ix) = ~ n/(f*k)
k = { 1,2,3,4,...,n/f} 
```
so:
```
f = ~ n/(cnt(ix)*k)
```
so if you got signal like this:
```
1,1,1,2,2,2,2,3,3,1,1,1,2,2,2,2,3,3,1
```
then histogram would be cnt[]={0,7,8,4,0,0,0,0,...} and n=19 so computing f in periods per n for each used element leads to:
```
f(ix) = n/(cnt(ix)*k) 
f(1)  = 19/(7*k) = ~ 2.714/k
f(2)  = 19/(8*k) = ~ 2.375/k
f(3)  = 19/(4*k) = ~ 4.750/k
```
Now the real frequency should be a common divider (CD) of results so taking biggest and smallest counts rounded up and down (ignoring noise) leads to these options:
```
f = CD(2,4) = 2
f = CD(3,4) = none
f = CD(2,5) = none
f = CD(3,5) = none
```
so now test frequency (luckily its just one valid in this case) 2 periods per 19 samples meaning T = ~ 9.5 so test rounded up and down ...
```
signal(t+ 0)=1,1,1,2,2,2,2,3,3,1,1,1,2,2,2,2,3,3,1
signal(t+ 9)=1,1,1,2,2,2,2,3,3,1 // check 9 elements
signal(t+10)=1,1,2,2,2,2,3,3,1,?   // check 10 elements
```
As you can see signal(t...t+9)==signal(t+9...t+9+9) meaning the period is T=9.

How to understand the result of Discrete Fourier Transform under period finding?

1 Answers1