Calculate FFT from SampleGrabber using DirectShow .NET

Question

I'm developing a project using DirectShow .NET. I'm trying to integrate a library called "WPF Sound Visualization Library" which creates a spectrum analyzer visual.

In order for the visual to work I need to implement these 2 methods in my player:

GetFFTData(float[] fftDataBuffer) - Assigns current FFT data to a buffer.

Remarks: The FFT data in the buffer should consist only of the real number intensity values. This means that if your FFT algorithm returns complex numbers (as many do), you'd run an algorithm similar to: for(int i = 0; i < complexNumbers.Length / 2; i++) fftResult[i] = Math.Sqrt(complexNumbers[i].Real * complexNumbers[i].Real + complexNumbers[i].Imaginary * complexNumbers[i].Imaginary);
GetFFTFrequencyIndex(int frequency) - Gets the index in the FFT data buffer for a given frequency.

Edit: I already added the SampleGrabber and integrated it's callback with the GetFFTData (which is still not tested). But how do integrate the GetFFTFrequencyIndex method?

    protected int SampleCB(double SampleTime, IMediaSample pSample)
    {
        IntPtr pointer = IntPtr.Zero;
        pSample.GetPointer(out pointer);
        sampleDataBytes = new byte[pSample.GetSize()];
        Marshal.Copy(pointer, sampleDataBytes, 0, sampleDataBytes.Length);
        var sampleTime = SampleTime;
        var actualDataLength = pSample.GetActualDataLength();

        /* Release unmanaged resources */
        Marshal.ReleaseComObject(pSample);
        pSample = null;

        return (int)HResults.S_OK;
    }

    #region ISpectrumPlayer
    byte[] sampleDataBytes = null;

    public bool GetFFTData(float[] fftDataBuffer)
    {
        if (sampleDataBytes != null)
        {
            var sampleData = Utils.GetInt16Array(sampleDataBytes);
            double[] pRealIn = new double[sampleData.Length];

            for (var i = 0; i <= sampleData.Length - 1; i++)
                pRealIn[i] = sampleData[i];

            var pImagIn = new double[sampleDataBytes.Length];
            var pRealOut = new double[sampleDataBytes.Length];
            var pImagOut = new double[sampleDataBytes.Length];

            FFTUtils.Compute((uint) pRealIn.Length, pRealIn, pImagIn, pRealOut, pImagOut, false);

            fftDataBuffer = new float[sampleDataBytes.Length];

            for (int i = 0; i < pRealOut.Length; i++)
                fftDataBuffer[i] = (float) Math.Sqrt(pRealOut[i] * pRealOut[i] + pImagOut[i] * pImagOut[i]);
        }

        return true;
    }

    public int GetFFTFrequencyIndex(int frequency)
    {
        throw new NotImplementedException();
    }
    #endregion

I ודקג this class with methods that can help:

public class FFTUtils
{
    public const Double DDC_PI = 3.14159265358979323846;

    /// <summary>
    /// Verifies a number is a power of two
    /// </summary>
    /// <param name="x">Number to check</param>
    /// <returns>true if number is a power two (i.e.:1,2,4,8,16,...)</returns>
    public static Boolean IsPowerOfTwo(UInt32 x)
    {
        return ((x != 0) && (x & (x - 1)) == 0);
    }

    /// <summary>
    /// Get Next power of number.
    /// </summary>
    /// <param name="x">Number to check</param>
    /// <returns>A power of two number</returns>
    public static UInt32 NextPowerOfTwo(UInt32 x)
    {
        x = x - 1;
        x = x | (x >> 1);
        x = x | (x >> 2);
        x = x | (x >> 4);
        x = x | (x >> 8);
        x = x | (x >> 16);
        return x + 1;
    }

    /// <summary>
    /// Get Number of bits needed for a power of two
    /// </summary>
    /// <param name="PowerOfTwo">Power of two number</param>
    /// <returns>Number of bits</returns>
    public static UInt32 NumberOfBitsNeeded(UInt32 PowerOfTwo)
    {
        if (PowerOfTwo > 0)
        {
            for (UInt32 i = 0, mask = 1; ; i++, mask <<= 1)
            {
                if ((PowerOfTwo & mask) != 0)
                    return i;
            }
        }
        return 0; // error
    }

    /// <summary>
    /// Reverse bits
    /// </summary>
    /// <param name="index">Bits</param>
    /// <param name="NumBits">Number of bits to reverse</param>
    /// <returns>Reverse Bits</returns>
    public static UInt32 ReverseBits(UInt32 index, UInt32 NumBits)
    {
        UInt32 i, rev;

        for (i = rev = 0; i < NumBits; i++)
        {
            rev = (rev << 1) | (index & 1);
            index >>= 1;
        }

        return rev;
    }

    /// <summary>
    /// Return index to frequency based on number of samples
    /// </summary>
    /// <param name="Index">sample index</param>
    /// <param name="NumSamples">number of samples</param>
    /// <returns>Frequency index range</returns>
    public static Double IndexToFrequency(UInt32 Index, UInt32 NumSamples)
    {
        if (Index >= NumSamples)
            return 0.0;
        else if (Index <= NumSamples / 2)
            return (double)Index / (double)NumSamples;

        return -(double)(NumSamples - Index) / (double)NumSamples;
    }

    /// <summary>
    /// Compute FFT
    /// </summary>
    /// <param name="NumSamples">NumSamples Number of samples (must be power two)</param>
    /// <param name="pRealIn">Real samples</param>
    /// <param name="pImagIn">Imaginary (optional, may be null)</param>
    /// <param name="pRealOut">Real coefficient output</param>
    /// <param name="pImagOut">Imaginary coefficient output</param>
    /// <param name="bInverseTransform">bInverseTransform when true, compute Inverse FFT</param>
    public static void Compute(UInt32 NumSamples, Double[] pRealIn, Double[] pImagIn,
                        Double[] pRealOut, Double[] pImagOut, Boolean bInverseTransform)
    {
        UInt32 NumBits;    /* Number of bits needed to store indices */
        UInt32 i, j, k, n;
        UInt32 BlockSize, BlockEnd;

        double angle_numerator = 2.0 * DDC_PI;
        double tr, ti;     /* temp real, temp imaginary */

        if (pRealIn == null || pRealOut == null || pImagOut == null)
        {
            // error
            throw new ArgumentNullException("Null argument");
        }
        if (!IsPowerOfTwo(NumSamples))
        {
            // error
            throw new ArgumentException("Number of samples must be power of 2");
        }
        if (pRealIn.Length < NumSamples || (pImagIn != null && pImagIn.Length < NumSamples) ||
             pRealOut.Length < NumSamples || pImagOut.Length < NumSamples)
        {
            // error
            throw new ArgumentException("Invalid Array argument detected");
        }

        if (bInverseTransform)
            angle_numerator = -angle_numerator;

        NumBits = NumberOfBitsNeeded(NumSamples);

        /*
        **   Do simultaneous data copy and bit-reversal ordering into outputs...
        */
        for (i = 0; i < NumSamples; i++)
        {
            j = ReverseBits(i, NumBits);
            pRealOut[j] = pRealIn[i];
            pImagOut[j] = (double)((pImagIn == null) ? 0.0 : pImagIn[i]);
        }

        /*
        **   Do the FFT itself...
        */
        BlockEnd = 1;
        for (BlockSize = 2; BlockSize <= NumSamples; BlockSize <<= 1)
        {
            double delta_angle = angle_numerator / (double)BlockSize;
            double sm2 = Math.Sin(-2 * delta_angle);
            double sm1 = Math.Sin(-delta_angle);
            double cm2 = Math.Cos(-2 * delta_angle);
            double cm1 = Math.Cos(-delta_angle);
            double w = 2 * cm1;
            double ar0, ar1, ar2;
            double ai0, ai1, ai2;

            for (i = 0; i < NumSamples; i += BlockSize)
            {
                ar2 = cm2;
                ar1 = cm1;

                ai2 = sm2;
                ai1 = sm1;

                for (j = i, n = 0; n < BlockEnd; j++, n++)
                {
                    ar0 = w * ar1 - ar2;
                    ar2 = ar1;
                    ar1 = ar0;

                    ai0 = w * ai1 - ai2;
                    ai2 = ai1;
                    ai1 = ai0;

                    k = j + BlockEnd;
                    tr = ar0 * pRealOut[k] - ai0 * pImagOut[k];
                    ti = ar0 * pImagOut[k] + ai0 * pRealOut[k];

                    pRealOut[k] = (pRealOut[j] - tr);
                    pImagOut[k] = (pImagOut[j] - ti);

                    pRealOut[j] += (tr);
                    pImagOut[j] += (ti);
                }
            }

            BlockEnd = BlockSize;
        }

        /*
        **   Need to normalize if inverse transform...
        */
        if (bInverseTransform)
        {
            double denom = (double)(NumSamples);

            for (i = 0; i < NumSamples; i++)
            {
                pRealOut[i] /= denom;
                pImagOut[i] /= denom;
            }
        }
    }

    /// <summary>
    /// Calculate normal (power spectrum)
    /// </summary>
    /// <param name="NumSamples">Number of sample</param>
    /// <param name="pReal">Real coefficient buffer</param>
    /// <param name="pImag">Imaginary coefficient buffer</param>
    /// <param name="pAmpl">Working buffer to hold amplitude Xps(m) = | X(m)^2 | = Xreal(m)^2  + Ximag(m)^2</param>
    public static void Norm(UInt32 NumSamples, Double[] pReal, Double[] pImag, Double[] pAmpl)
    {
        if (pReal == null || pImag == null || pAmpl == null)
        {
            // error
            throw new ArgumentNullException("pReal,pImag,pAmpl");
        }
        if (pReal.Length < NumSamples || pImag.Length < NumSamples || pAmpl.Length < NumSamples)
        {
            // error
            throw new ArgumentException("Invalid Array argument detected");
        }

        // Calculate amplitude values in the buffer provided
        for (UInt32 i = 0; i < NumSamples; i++)
        {
            pAmpl[i] = pReal[i] * pReal[i] + pImag[i] * pImag[i];
        }
    }

    /// <summary>
    /// Find Peak frequency in Hz
    /// </summary>
    /// <param name="NumSamples">Number of samples</param>
    /// <param name="pAmpl">Current amplitude</param>
    /// <param name="samplingRate">Sampling rate in samples/second (Hz)</param>
    /// <param name="index">Frequency index</param>
    /// <returns>Peak frequency in Hz</returns>
    public static Double PeakFrequency(UInt32 NumSamples, Double[] pAmpl, Double samplingRate, ref UInt32 index)
    {
        UInt32 N = NumSamples >> 1;   // number of positive frequencies. (numSamples/2)

        if (pAmpl == null)
        {
            // error
            throw new ArgumentNullException("pAmpl");
        }
        if (pAmpl.Length < NumSamples)
        {
            // error
            throw new ArgumentException("Invalid Array argument detected");
        }

        double maxAmpl = -1.0;
        double peakFreq = -1.0;
        index = 0;

        for (UInt32 i = 0; i < N; i++)
        {
            if (pAmpl[i] > maxAmpl)
            {
                maxAmpl = (double)pAmpl[i];
                index = i;
                peakFreq = (double)(i);
            }
        }

        return samplingRate * peakFreq / (double)(NumSamples);
    }
}

Thank you so much!

The basic approach is to type "C# FFT" in google and pick one from the top 5 links. — H H, Aug 02 '14 at 12:19
I looked for FFT c# but I still don't know which parameters I should provide from the SampleCB. I'm having a hard time understanding the algorithm. — Shynet, Aug 02 '14 at 12:56
Thank you for your help. I'm not asking on explanation on how this algorithm works, I just want to know the right parameters to supply. Thats why I used the visualization library in the first place - so I won't get into how it works, at least not yet. I appreciate your help and effort - and believe me, I'm always trying to understand what kind of code I put in but this time I don't have much time to learn FFT from the inside. Just use an already built libraries to provide the functions I need. — Shynet, Aug 02 '14 at 13:23
I have added a helper class that contains FFT methods to work with. I'm guessing I need to use the Compute method and IndexToFrequency, but how? — Shynet, Aug 02 '14 at 16:05

NoOne · Answer 1 · 2014-08-02T18:03:11.007

0

If I remember well, the algorithm takes a real (e.g. int[n]) signal or a complex one (e.g. int[n][2]) and returns a complex FFT result.

So, it seems quite easy:

You take your input values (that you can plot as value-to-time in a chart, e.g. the left speaker audio values) and you feed them in the pRealIn parameter. In pImagIn you put zeros (as many as in pRealIn). In bInverseTransform you put false (of course).

Then you will take the result back in pRealOut & pImagOut. The result buffers should logically be of the same size as the input buffers. You must take those two output buffers and combine them in pair like so (for each element of the OUT arrays):

fftDataBuffer[k] = Math.Sqrt(pRealOut[k] * pRealOut[k] + pImagOut[k] * pImagOut[k]); // Do this from 1 to n

FFT result is an array of complex values (x = real part and y = imaginary part - you can depict it on a Cartesian system as a vector). You want the vector's size/amplitude that why you do the above.

That is for GetFFTData.

I see that you have a function called IndexToFrequency. So that may work. All you have to do is call this method for every index of your buffers. That is:

for(int i=0; i<n; i++) freq[i] = IndexToFrequency(i, n);

Keep those values stored and then in your GetFFTFrequencyIndex(int frequency) you find the closest match of the input parameter (frequency) to the elements in freq[n] and return its index.

I think that will be enough.

Important: Make sure that your buffers have a power-of-two size (NextPowerOfTwo seems to be designed to help you do just that).

If your data happens to be smaller some times, you can pad the values with zeros at the end (a.k.a. append zeros to your input buffers). Also: To get a better resolution you can again pad with zeros your data. This will increase the 'smoothness' of your result which may be desirable.

Where did you find this code, if I may ask (just curious :) )?

So, that's all it! :)

edited Aug 02 '14 at 18:03

answered Aug 02 '14 at 17:28

NoOne

3,851
1
40
47

The library is this: [Multimedia-PeakMeter-Control](http://www.codeproject.com/Articles/26357/Multimedia-PeakMeter-Control). Thank you for your help. If I understood you correctly I should use the sampleData I already have with the Compute method as is? or it should pass more filtering? – Shynet Aug 02 '14 at 18:29
No problem. Yes, if your data are audio samples (that seems to be the case) that's what you should do. :) – NoOne Aug 02 '14 at 18:36
Another thing - I see that the compute method only takes Double[] array and not byte[] array, that does not make sense as I only get an array of bytes. I cannot call this method as it will not except byte array. – Shynet Aug 02 '14 at 18:37
As far as I understand, your `sampleData` is an integer array in reality (audio samples are integers larger than 255). I don't know how easy it will be to copy the data in a int array. Of course the number of bits may vary from audio to audio and you will probably need to take this into account as well. Initially I would try to define `sampleData` as `var sampleData = new Int16[pSample.GetSize()/2];` Most audio streams use 16-bit values as far as I remember. Then you can convert them to `double`. This conversion may slow down your code though. – NoOne Aug 02 '14 at 18:49
If you manage to make it work, perhaps later you could make the input arrays to be `Int16[]` and the output arrays to be `float[]` (I don't think you need double precision). That MAY be a little faster. Also, I now see in the code that you can set `pImagIn` to null instead of zeroing it out (the lib creates the zeros itself internally). – NoOne Aug 02 '14 at 19:03
So I should pass an int16 array (converted to double) to the function? Does the IMediaSample interface has any method to pull which kind of stream is it. I mean which kind of interger the audio stream is using? – Shynet Aug 02 '14 at 19:06
I don't know much about IMediaSample, but there must be a way to get it somehow. What you want to get is called "audio bit depth". You may search the web for this. I have found a quote saying : " The media type information contains the resolution, bit-depth etc." http://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/f1820e51-cab7-415c-94e7-787e9b1d9910/how-to-control-the-deletion-of-an-imediasample – NoOne Aug 02 '14 at 19:16
Thank you so much for your help. I won't be home this weeke but I will work on this next weekend. Thank you so much for your help! stay tuned... – Shynet Aug 02 '14 at 23:48
@Shynet it might be easier to enforce the audioformat for the samplegrabber to the format you want. That way conversion will be done (automatically) in directshow. There is a [MEDIASUBTYPE_IEEE_FLOAT](http://msdn.microsoft.com/en-us/library/windows/desktop/dd317599(v=vs.85).aspx), but I have never used it. – wimh Aug 03 '14 at 16:24
@NoOne I have edited the question with an updated method. But still haven't understood how to implement the GetFFTFrequencyIndex. Can you please help me out with the new edited code? – Shynet Aug 07 '14 at 18:51
@Wimmel Thank you for your help, I will try it when the Visualizer will work with a certain media type and then try to it sort it out. – Shynet Aug 07 '14 at 18:52
So, does anyone have an idea? – Shynet Aug 09 '14 at 19:21
@Shynet Hi. Sorry for responding so late. Your `IndexToFrequency()` does the exact opposite. I believe all you have to do is build a map (look up table) with it, so that you can return the opposite value from `GetFFTFrequencyIndex()`. Also note that, you will probably get a symmetry on the FFT result. You only need the one half of it. – NoOne Aug 18 '14 at 17:32

Calculate FFT from SampleGrabber using DirectShow .NET

1 Answers1