
I'm working on a chord recognition project. I'm using a journal paper as a reference, but I still have only a limited grasp of DSP. According to the paper, the first step is to split the signal from the WAV file into a number of frames. In my case, each frame must be 65 ms long, which is 2866 samples per frame.

I have searched for how to split a signal into frames, but the explanations I found were not clear enough for me to understand. So far, this is some of my code in the WavProcessing class:

    public void SetFileName(String fileNameWithPath) // called first in the form, to get the FileStream
    {
        _fileNameWithPath = fileNameWithPath;
        strm = File.OpenRead(_fileNameWithPath);
    }

    // Returns the duration in seconds. bitRate is the bit depth (bits per sample), e.g. 16.
    // Assumes a canonical 44-byte WAV header; files with extra chunks have a larger header.
    // wavSize is currently unused.
    public double getLengthTime(uint wavSize, uint sampleRate, int bitRate, int channels)
    {
        // Divide in floating point; integer division would truncate the result to whole seconds.
        wavTimeLength = (strm.Length - 44) / (double)(sampleRate * (bitRate / 8) * channels);
        return wavTimeLength;
    }

    public int getNumberOfFrames() // return number of frames: total length divided by the frame interval (in my case, 3000 ms / 65 ms = 46 frames)
    {
        numOfFrames = (int)(wavTimeLength * 1000 / _sampleFrameTime);
        return numOfFrames;
    }

    public int getSamplePerFrame(UInt32 sampleRate, int sampleFrameTime) // return the samples-per-frame value (in my case, 2866)
    {
        _sampleRate = sampleRate;
        _sampleFrameTime = sampleFrameTime;

        // sampleFrameTime is in milliseconds, so convert to seconds before multiplying.
        sFr = (int)(sampleRate * (sampleFrameTime / 1000.0));

        return sFr;
    }

I still don't understand how to split the signal into 65 ms frames in C#. Do I need to read the FileStream, break it into frames, and store them in an array? Or is there another approach?

Matthieu Brucher
Norman Pratama
  • Typically you split signals into buckets whose sizes are powers of 2, so 128, 256, 512, 1024 ... because most FFT algorithms need this. I have never dealt with chord recognition, but I am pretty sure you need some kind of FFT or DCT to recognize pitches. Most code you'll find (and there is a lot out there) will be plain C. Using unsafe code regions lets you copy and paste it. You'll find a lot at www.musicdsp.org. As an ebook, I'd recommend http://dspguide.com/. It's brilliant! – guitarflow Feb 13 '12 at 00:57
  • This might also be helpful http://stackoverflow.com/questions/4033083/guitar-chord-recognition-algorithm – guitarflow Feb 13 '12 at 01:00
  • Thank you for the information, I will look into it. I will actually use the Enhanced Pitch Class Profile for the chord recognition part. You said to split signals into power-of-2 sizes; is there a code example? Thanks. – Norman Pratama Feb 13 '12 at 01:09
  • There are tons of code examples!! But you have to understand what you are doing!! I am not aware of EPCP, just heard of it. Where is your information about 65 ms from? – guitarflow Feb 13 '12 at 01:30
  • More complete FFT libraries no longer require the length to be a power of 2; lengths built from other small factors will also work. Furthermore, you can sometimes zero-pad a frame of data up to a more suitable FFT length, depending on what you need to do with the FFT result. – hotpaw2 Feb 13 '12 at 02:36

1 Answer


With NAudio, you would read one frame like this:

using (var reader = new AudioFileReader("myfile.wav"))
{
    float[] sampleBuffer = new float[2866];
    int samplesRead = reader.Read(sampleBuffer, 0, sampleBuffer.Length);
}

As others have commented, the number of samples you read should be a power of 2 if the FFT you plan to pass it into requires that. Also, if the file is stereo, the left and right samples are interleaved in the buffer, so you will need to account for that (for example by de-interleaving or mixing down to mono) before running the FFT.
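
To go from a single read to a full set of frames, something like the following could work. This is only a minimal sketch in plain C#, not part of NAudio: it assumes the samples have already been read into a `float[]`, and the class `FrameSplitter` and its methods `ToMono`, `NextPowerOfTwo` and `Split` are hypothetical names made up for illustration. It mixes interleaved channels down to mono, cuts the signal into consecutive non-overlapping frames, and zero-pads each frame up to the next power of 2 (one of the options mentioned in the comments; whether padding suits your analysis depends on what you do with the FFT result).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not an NAudio type.
public static class FrameSplitter
{
    // Averages interleaved channels (L R L R ...) into a single mono signal.
    public static float[] ToMono(float[] interleaved, int channels)
    {
        if (channels < 1) throw new ArgumentOutOfRangeException(nameof(channels));
        int length = interleaved.Length / channels;
        var mono = new float[length];
        for (int i = 0; i < length; i++)
        {
            float sum = 0f;
            for (int c = 0; c < channels; c++)
                sum += interleaved[i * channels + c];
            mono[i] = sum / channels;
        }
        return mono;
    }

    // Smallest power of two that is >= n.
    public static int NextPowerOfTwo(int n)
    {
        if (n < 1 || n > (1 << 30)) throw new ArgumentOutOfRangeException(nameof(n));
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Cuts the signal into consecutive, non-overlapping frames of samplesPerFrame samples.
    // Each frame is copied into a buffer of length NextPowerOfTwo(samplesPerFrame),
    // so the remaining tail of the buffer stays zero. A trailing partial frame is dropped.
    public static List<float[]> Split(float[] mono, int samplesPerFrame)
    {
        int paddedLength = NextPowerOfTwo(samplesPerFrame);
        int frameCount = mono.Length / samplesPerFrame;
        var frames = new List<float[]>(frameCount);
        for (int f = 0; f < frameCount; f++)
        {
            var frame = new float[paddedLength];
            Array.Copy(mono, f * samplesPerFrame, frame, 0, samplesPerFrame);
            frames.Add(frame);
        }
        return frames;
    }
}
```

For the numbers in the question, assuming a 44.1 kHz sample rate (which is what 2866 samples per 65 ms implies), a 3 second mono signal has 132300 samples, so `FrameSplitter.Split(mono, 2866)` would give 46 frames, each padded to 4096 samples. For a stereo file you would call `FrameSplitter.ToMono(samples, 2)` first.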

Mark Heath