I have been trying to implement an autocorrelation algorithm on the iPhone using the vDSP libraries supplied by Apple (Accelerate framework).
So far I have created an audio unit following Apple's aurioTouch example, but I want to use the Accelerate framework to perform the autocorrelation instead of the older implementation in the aurioTouch example code.
The RemoteIO audio unit is routed through my render callback method like so:
static OSStatus InputRenderCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags,
                                    const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber,
                                    UInt32 inNumberFrames, AudioBufferList *ioData)
{
    AudioGraphController *controller = (AudioGraphController *)inRefCon;

    // Remove the DC component from each buffer
    for (UInt32 i = 0; i < ioData->mNumberBuffers; ++i)
        controller.dcFilter[i].InplaceFilter((SInt32 *)(ioData->mBuffers[i].mData), inNumberFrames, 1);

    // Pull the microphone samples from the input bus (bus 1) of the RemoteIO unit
    OSStatus result = AudioUnitRender(controller.inputUnit, ioActionFlags, inTimeStamp, 1, inNumberFrames, ioData);
    if (result) { printf("InputRenderCallback: error %d\n", (int)result); return result; }

    // Hand the captured frames off to the autocorrelation code
    controller.manager->ProcessAudioData(inNumberFrames, ioData);
    return noErr;
}
The input data from the microphone is sent to a ProcessAudioData method, which performs the autocorrelation according to the C++ snippet in this post: Using the Apple FFT and Accelerate Framework.
However, I have some trouble understanding the information in the displayData array.
When I try to access the information, all I get is NaN; the only time I get an idea of the information is when I cast the displayData array like so:
SInt16* buf = (SInt16 *)displayData;
To compute the autocorrelation I follow these steps (a sketch of the code is shown after the list):

- Split the real input (ioData->mBuffers[0].mData) into even and odd parts.
- Perform an FFT (forward).
- Take the absolute square of the values generated by the FFT.
- Take the IFFT (backward/inverse).
- Convert the complex split back to real.
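For reference, a minimal sketch of these steps with vDSP might look like the code below. It assumes the samples have already been converted to float, that n is a power of two, and that fftSetup was created once elsewhere with vDSP_create_fftsetup(log2n, kFFTRadix2); the function and variable names are placeholders, not my exact code:

#include <Accelerate/Accelerate.h>
#include <stdlib.h>

// Sketch of the steps above; fftSetup is assumed to come from
// vDSP_create_fftsetup(log2n, kFFTRadix2), and n == 1 << log2n.
void AutocorrelateSketch(FFTSetup fftSetup, const float *input, float *output,
                         vDSP_Length n, vDSP_Length log2n)
{
    DSPSplitComplex split;
    split.realp = (float *)malloc(n / 2 * sizeof(float));
    split.imagp = (float *)malloc(n / 2 * sizeof(float));

    // 1. Split the real input into even (realp) and odd (imagp) samples.
    vDSP_ctoz((const DSPComplex *)input, 2, &split, 1, n / 2);

    // 2. Forward FFT, in place, in Apple's packed real format
    //    (realp[0] = DC term, imagp[0] = Nyquist term).
    vDSP_fft_zrip(fftSetup, &split, 1, log2n, kFFTDirection_Forward);

    // 3. Absolute square |X(k)|^2; index 0 holds the packed DC and Nyquist
    //    terms, so it is handled separately.
    split.realp[0] *= split.realp[0];
    split.imagp[0] *= split.imagp[0];
    for (vDSP_Length k = 1; k < n / 2; ++k) {
        split.realp[k] = split.realp[k] * split.realp[k] + split.imagp[k] * split.imagp[k];
        split.imagp[k] = 0.0f;
    }

    // 4. Inverse FFT gives the (circular) autocorrelation.
    vDSP_fft_zrip(fftSetup, &split, 1, log2n, kFFTDirection_Inverse);

    // 5. Convert the split-complex result back to a real array.
    //    Note: vDSP FFTs are unnormalized, so the values carry a constant
    //    scale factor that depends on n; the peak positions are unaffected.
    vDSP_ztoc(&split, 1, (DSPComplex *)output, 2, n / 2);

    free(split.realp);
    free(split.imagp);
}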
Could someone give me some pointers/advice on how to interpret the information in the displayData array? Also, when I examine displayData like this, the values all seem to be the same, although they do vary depending on the mic input.
The input from the microphone is expected to be a signal containing some echoes of the original signal; the goal of the autocorrelation is to determine the lag at which the autocorrelation peaks, so that I can determine the offset of the echo relative to the original signal.
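My plan for that last step is to search the autocorrelation for its maximum away from lag 0, roughly like this (again a sketch; vDSP_maxvi is the actual vDSP call, the rest are placeholder names):

#include <Accelerate/Accelerate.h>

// Sketch: find the lag of the largest autocorrelation peak, skipping the
// region around lag 0 (which is always the global maximum, the signal energy).
vDSP_Length FindEchoLag(const float *autocorr, vDSP_Length n, vDSP_Length minLag)
{
    float peak = 0.0f;
    vDSP_Length index = 0;
    // Search from minLag onward so the zero-lag peak is excluded.
    vDSP_maxvi(autocorr + minLag, 1, &peak, &index, n - minLag);
    return index + minLag;  // lag in samples; divide by the sample rate for seconds
}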
Should I create an echoed version of the signal first (with some offset) and use that to multiply the values of the FFT?
I appreciate any input, and also any pointers to material that explains this more clearly, since I am fairly new to vDSP techniques, especially on the iPhone. I do have mathematical experience with convolution and Fourier transforms, but Apple's in-place packing has me guessing where to find the information I expect to get from this calculation.