What would be the best way to get Hz frequency value from audio stream(music) on iOS? What are the best and easiest frameworks provided by Apple to do that. Thanks in advance.
-
You need to be more specific - what sort of input are you looking at ? Speech ? Music ? A single instrument playing a single note ? A pure tone ? – Paul R Jul 27 '12 at 13:54
-
OK - so what kind of frequency information do you hope to extract ? Just a short term power spectrum, or something more sophisticated than that ? – Paul R Jul 27 '12 at 17:56
-
I need just Hz average value of every short music segment. Segment length is smaller than 0.2 s. – Olga Dalton Jul 27 '12 at 18:04
-
5There is no single "Hz value" - a complex sound like music contains energy at many different frequencies, and this distribution of energy versus frequency changes continuously. – Paul R Jul 27 '12 at 18:55
2 Answers
Here is some code I use to perform FFT in iOS using Accelerate Framework, which makes it quite fast.
//keep all internal stuff inside this struct
typedef struct FFTHelperRef {
FFTSetup fftSetup; // Accelerate opaque type that contains setup information for a given FFT transform.
COMPLEX_SPLIT complexA; // Accelerate type for complex number
Float32 *outFFTData; // Your fft output data
Float32 *invertedCheckData; // This thing is to verify correctness of output. Compare it with input.
} FFTHelperRef;
//first - initialize your FFTHelperRef with this function.
FFTHelperRef * FFTHelperCreate(long numberOfSamples) {
FFTHelperRef *helperRef = (FFTHelperRef*) malloc(sizeof(FFTHelperRef));
vDSP_Length log2n = log2f(numberOfSamples);
helperRef->fftSetup = vDSP_create_fftsetup(log2n, FFT_RADIX2);
int nOver2 = numberOfSamples/2;
helperRef->complexA.realp = (Float32*) malloc(nOver2*sizeof(Float32) );
helperRef->complexA.imagp = (Float32*) malloc(nOver2*sizeof(Float32) );
helperRef->outFFTData = (Float32 *) malloc(nOver2*sizeof(Float32) );
memset(helperRef->outFFTData, 0, nOver2*sizeof(Float32) );
helperRef->invertedCheckData = (Float32*) malloc(numberOfSamples*sizeof(Float32) );
return helperRef;
}
//pass initialized FFTHelperRef, data and data size here. Return FFT data with numSamples/2 size.
Float32 * computeFFT(FFTHelperRef *fftHelperRef, Float32 *timeDomainData, long numSamples) {
vDSP_Length log2n = log2f(numSamples);
Float32 mFFTNormFactor = 1.0/(2*numSamples);
//Convert float array of reals samples to COMPLEX_SPLIT array A
vDSP_ctoz((COMPLEX*)timeDomainData, 2, &(fftHelperRef->complexA), 1, numSamples/2);
//Perform FFT using fftSetup and A
//Results are returned in A
vDSP_fft_zrip(fftHelperRef->fftSetup, &(fftHelperRef->complexA), 1, log2n, FFT_FORWARD);
//scale fft
vDSP_vsmul(fftHelperRef->complexA.realp, 1, &mFFTNormFactor, fftHelperRef->complexA.realp, 1, numSamples/2);
vDSP_vsmul(fftHelperRef->complexA.imagp, 1, &mFFTNormFactor, fftHelperRef->complexA.imagp, 1, numSamples/2);
vDSP_zvmags(&(fftHelperRef->complexA), 1, fftHelperRef->outFFTData, 1, numSamples/2);
//to check everything =============================
vDSP_fft_zrip(fftHelperRef->fftSetup, &(fftHelperRef->complexA), 1, log2n, FFT_INVERSE);
vDSP_ztoc( &(fftHelperRef->complexA), 1, (COMPLEX *) fftHelperRef->invertedCheckData , 2, numSamples/2);
//=================================================
return fftHelperRef->outFFTData;
}
Use it like this:
Initialize it: FFTHelperCreate(TimeDomainDataLenght);
Pass Float32 time domain data, get frequency domain data on return: Float32 *fftData = computeFFT(fftHelper, buffer, frameSize);
Now you have an array where indexes=frequencies, values=magnitude (squared magnitudes?). According to Nyquist theorem your maximum possible frequency in that array is half of your sample rate. That is if your sample rate = 44100, maximum frequency you can encode is 22050 Hz.
So go find that Nyquist max frequency for your sample rate: const Float32 NyquistMaxFreq = SAMPLE_RATE/2.0;
Finding Hz is easy: Float32 hz = ((Float32)someIndex / (Float32)fftDataSize) * NyquistMaxFreq; (fftDataSize = frameSize/2.0)
This works for me. If I generate specific frequency in Audacity and play it - this code detects the right one (the strongest one, you also need to find max in fftData to do this).
(there's still a little mismatch in about 1-2%. not sure why this happens. If someone can explain me why - that would be much appreciated.)
EDIT:
That mismatch happens because pieces I use to FFT are too small. Using larger chunks of time domain data (16384 frames) solves the problem. This questions explains it: Unable to get correct frequency value on iphone
EDIT: Here is the example project: https://github.com/krafter/DetectingAudioFrequency
-
Amazing... On my iPhone 5 it peaks at 19K Hz on this site: http://www.audionotch.com/app/tune/. – Morkrom Jan 06 '15 at 22:51
-
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/136235/discussion-between-dreambegin-and-krafter). – tryKuldeepTanwar Feb 21 '17 at 11:07
-
Its taking 2-3 second to update the label. How i can get result very fast? – Kishore Suthar May 08 '18 at 11:27
-
1@suthar You can use smaller values in accumulatorDataLenght. Keep in mind that the smaller the value is the less accurate the frequency is. – krafter May 08 '18 at 12:11
Apple does not provide a framework for frequency or pitch estimation. However, the iOS Accelerate framework does include routines for FFT and autocorrelation which can be used as components of more sophisticated frequency and pitch recognition or estimation algorithms.
There is no way that is both easy and best, except possibly for a single long continuous constant frequency pure sinusoidal tone in almost zero noise, where an interpolated magnitude peak of a long windowed FFT might be suitable. For voice and music, that simple method will very often not work at all. But a search for pitch detection or estimation methods will turn up lots of research papers on more suitable algorithms.

- 70,107
- 14
- 90
- 153