1

I've got this code, but it keeps returning random frequencies from 0 to about 1050. Can you please help me understand why this is happening?

My data length is 1024, sample rate is 8192, and data is a short array filled with input data from the mic.


float *iSignal = new float[2048];
float *oSignal = new float[2048];
int pitch = 0;

for(x=0;x<=1024;x++) {
    iSignal[x] = data[x];
}

fft(iSignal,oSignal,1024); //Input data, output data, length of input and output data

for(int y=0;y< 2048;y+=2) {
    if((pow(oSignal[y],2)+pow(oSignal[y+1],2))>(pow(oSignal[pitch],2)+pow(oSignal[(pitch)+1],2))) {
        pitch = y;
    }
}

double pitchF = pitch / (8192.0/1024);
printf("Pitch: %f\n",pitchF);

Thanks,

Niall.

Edit: Changed the code, but it's still returning random frequencies.

Niall
  • What are you feeding the mic with? As I said, unless it's a very pure tone, you're bound to get random results. – avakar Aug 29 '09 at 13:29
  • You could optimize your code slightly by caching the value of pow(oSignal[pitch], 2) + pow(oSignal[pitch + 1], 2); as written, it is recomputed on every iteration even when pitch has not changed. – moala Aug 29 '09 at 13:32
  • My voice, and some guitar noises. Could you suggest a way to make it not return random results? Would windowing work? – Niall Aug 29 '09 at 13:33
  • First, what is the purpose of your code? – moala Aug 29 '09 at 13:34
  • To detect a user's singing pitch. – Niall Aug 29 '09 at 13:37
  • There is another technique for that which uses a sliding window: you pick a window, offset it somewhat and compare it to the signal. If they are very similar, then the offset corresponds to the period of the signal. You try several (actually a lot of) offsets and pick the one that results in the greatest similarity. – avakar Aug 29 '09 at 13:43
  • @Niall: you are trying to achieve "pitch detection". I think the algorithm for that is not as simple as that. Check literature on that keyword. – moala Aug 29 '09 at 13:54

4 Answers

7

Assuming oSignal is filled with complex numbers such that the real and imaginary parts alternate, it might help to change

for(int y=0;y< 8191;y++)

to

for(int y=0;y< 8191;y+=2)
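To illustrate the stride-of-two idea, here is a minimal sketch of a peak search over an interleaved spectrum. It assumes the output really is laid out as [re0, im0, re1, im1, ...], which the question does not confirm. The name peakBin is hypothetical, and skipping bin 0 and the mirrored upper half are choices that only make sense for real-valued input. The best power is kept in a variable, so it is not recomputed on every pass.

```cpp
#include <cstddef>
#include <vector>

// Returns the bin index (not the array index) with the largest squared
// magnitude. Assumes `spectrum` is interleaved: [re0, im0, re1, im1, ...].
// Bin 0 (DC) is skipped, and only the lower half of the bins is searched,
// because the upper half mirrors the lower half for real-valued input.
std::size_t peakBin(const std::vector<float>& spectrum) {
    const std::size_t numBins = spectrum.size() / 2;
    std::size_t best = 0;
    float bestPower = -1.0f;  // cached power of the current best bin
    for (std::size_t k = 1; k < numBins / 2; ++k) {
        const float re = spectrum[2 * k];
        const float im = spectrum[2 * k + 1];
        const float power = re * re + im * im;
        if (power > bestPower) {
            bestPower = power;
            best = k;
        }
    }
    return best;
}
```

The function returns 0 when there is nothing to search, so a caller should treat 0 as "no peak found".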

Edit: I didn't even notice that you're passing only 1024 samples. You must pass as many time-domain samples as there will be frequency-domain samples, in your case 4096.

Edit: One more thing: you're obviously trying to find the base frequency of something. Unless that something is a computer generated tone or a human whistle (both of which are very pure tones), you might be disappointed by the result. The simple method you posted barely works for flute.

Edit: For voice and guitar you're out of luck. I wrote a program some time ago that displays the frequency domain, try it out, you'll see the problem. There are also sources available, if you're interested.

Final edit: You might want to read the Wikipedia article on pitch detection. Concentrate on time-domain approaches.

avakar
  • To second what avakar said, detecting the pitch of an audio recording accurately enough for musical purposes (i.e. to within 1/100 of a semitone) is essentially impossible with FFT, because the frequency resolution obtained is proportional to the FFT window size. Auto-correlation is a more appropriate technique for this purpose. – MusiGenesis Aug 29 '09 at 14:18
1

It seems iSignal[1025]..iSignal[8191] contain random data. You could try to set it to 0. But why do you pass 8192 to fft() if your data length is 1024 (or is it 1025)?

Also, you lose some precision in the integer division. Change it to double pitchF = pitch / (8192.0/1024);
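On the conversion itself: for an N-point FFT at sample rate fs, bin k is centred on k * fs / N Hz, so the bin index is multiplied by fs / N rather than divided by it. The sketch below spells that out. It assumes fft() produces 1024 complex bins for 1024 samples and that pitch in the question is an index into an interleaved [re, im, ...] array, so it is halved first. Both function names are hypothetical.

```cpp
#include <cstddef>

// Converts an FFT bin index to its centre frequency in Hz.
double binToHz(std::size_t bin, double sampleRate, std::size_t fftSize) {
    return static_cast<double>(bin) * sampleRate / static_cast<double>(fftSize);
}

// Same conversion, starting from an index into an interleaved
// [re0, im0, re1, im1, ...] array, where bin k lives at index 2 * k.
double interleavedIndexToHz(std::size_t index, double sampleRate,
                            std::size_t fftSize) {
    return binToHz(index / 2, sampleRate, fftSize);
}
```

With the question's numbers the bin spacing is 8192 / 1024 = 8 Hz, so bin 55 (array index 110) corresponds to 440 Hz.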

Does your fft function expect real or complex input data? In case it expects complex data, you have to set every other entry of iSignal to 0.
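If fft() does turn out to expect interleaved complex input (an assumption; the question does not say), the real samples go into the even slots and the imaginary slots are zeroed. A minimal sketch, with toInterleavedComplex as a hypothetical helper name:

```cpp
#include <cstddef>
#include <vector>

// Packs `count` real 16-bit samples into an interleaved complex buffer:
// [s0, 0, s1, 0, ...]. Every imaginary part is explicitly zero, so no
// slot is left uninitialised.
std::vector<float> toInterleavedComplex(const short* samples, std::size_t count) {
    std::vector<float> out(2 * count, 0.0f);
    for (std::size_t i = 0; i < count; ++i) {
        out[2 * i] = static_cast<float>(samples[i]);
    }
    return out;
}
```

Using std::vector here also sidesteps the uninitialised memory that a bare new float[2048] would contain.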

Henrik
0

"random frequencies from 0 to about 1050" - doesn't the typical audio signal consist of a combination of frequencies? Since your sample rate is 8192 Hz, your FFT can detect up to 8192/2 = 4096 Hz. I would expect that you'd see a combination of many frequencies, but I wouldn't call them "random".

Why are you surprised? What did I miss?

duffymo
0

Two things:

  • Are you sure you're using your fft function correctly? You treat the output as a complex array organized as [R_1 I_1 R_2 I_2 ...], but you treat the input array as if it were organized as [R_1 R_2 R_3 ... R_1024 I_1 I_2 ...] and, as Henrik says, then leave the imaginary parts uninitialized.
  • Your peak detection is extremely primitive, though it should do for simple input (like a single guitar string). For use with a human voice, you almost certainly want a more sophisticated approach.

Have you tried putting a known simple (i.e. pure sine) signal as input?
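One way to run that check without a microphone is to synthesise the tone. The sketch below generates a sine wave and measures the power of a single bin with a direct (slow) DFT sum, which can serve as a reference to compare against whatever fft() returns. The names makeSine and dftPower are hypothetical, and the direct sum stands in for the question's fft(), whose interface is not known.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Generates `count` samples of a unit-amplitude sine at `freqHz`.
std::vector<float> makeSine(double freqHz, double sampleRate, std::size_t count) {
    const double pi = 3.14159265358979323846;
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = static_cast<float>(std::sin(2.0 * pi * freqHz * i / sampleRate));
    }
    return out;
}

// Squared magnitude of one DFT bin, computed directly from the definition.
// O(N) per bin, so it is only meant for spot checks, not for full spectra.
double dftPower(const std::vector<float>& x, std::size_t bin) {
    const double pi = 3.14159265358979323846;
    double re = 0.0;
    double im = 0.0;
    for (std::size_t n = 0; n < x.size(); ++n) {
        const double angle = 2.0 * pi * bin * n / x.size();
        re += x[n] * std::cos(angle);
        im -= x[n] * std::sin(angle);
    }
    return re * re + im * im;
}
```

With 1024 samples at 8192 Hz, a 440 Hz tone falls exactly on bin 55, so dftPower(makeSine(440.0, 8192.0, 1024), 55) should dwarf the neighbouring bins. If the real fft() disagrees on such an input, the problem is in how it is being called rather than in the audio.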

dmckee --- ex-moderator kitten