I am building a voice authentication system. For that, I am using C# speech recognition, which lets me save the captured audio; I convert it and store it as a WAV file.

I have another WAV file in which I have stored a recording of my voice.

I then use FFT, as mentioned here, to compare the two WAV files, with the cross-correlation code from here.
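For reference, the kind of score being compared can be sketched as below. This is only a minimal illustration under my own assumptions, not the linked code: it computes the peak of the normalized cross-correlation directly in the time domain (O(n*m)) instead of via FFT, and the names `Correlation` and `PeakNormalized` are hypothetical.

```csharp
using System;

public static class Correlation
{
    // Hypothetical helper: peak of the normalized cross-correlation of a and b
    // over all lags, computed directly rather than via FFT.
    // The result lies in [-1, 1] (Cauchy-Schwarz bound on each lag's sum).
    public static double PeakNormalized(double[] a, double[] b)
    {
        double energyA = 0, energyB = 0;
        foreach (var x in a) energyA += x * x;
        foreach (var x in b) energyB += x * x;

        double norm = Math.Sqrt(energyA * energyB);
        if (norm == 0) return 0; // at least one signal is silent or empty

        double best = double.NegativeInfinity;
        // lag shifts b relative to a; every overlap of at least one sample is tried
        for (int lag = -(b.Length - 1); lag < a.Length; lag++)
        {
            double sum = 0;
            for (int i = 0; i < b.Length; i++)
            {
                int j = i + lag;
                if (j >= 0 && j < a.Length) sum += a[j] * b[i];
            }
            best = Math.Max(best, sum);
        }
        return best / norm;
    }
}
```

A call such as `Correlation.PeakNormalized(left1, left2)` would take the sample arrays of the two files. A time-shifted copy of the same signal still scores at the top of the range, because the peak is taken over all lags.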

My openWav code is below:

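        // Requires: using System.IO;
        public static void openWav(string filename, out double[] left, out double[] right)
        {
            var numArray = File.ReadAllBytes(filename);
            // Byte 22 of the header holds the channel count (1 = mono, 2 = stereo).
            int num1 = numArray[22];
            int index1;
            int index2;
            int num2;
            // Walk the chunks after the 12-byte RIFF header until the "data" chunk
            // (ASCII bytes 100, 97, 116, 97) is found.
            for (index1 = 12;
                numArray[index1] != 100 || numArray[index1 + 1] != 97 ||
                (numArray[index1 + 2] != 116 || numArray[index1 + 3] != 97);
                index1 = index2 + (4 + num2))
            {
                // Read this chunk's little-endian 32-bit size so it can be skipped.
                index2 = index1 + 4;
                num2 = numArray[index2] + numArray[index2 + 1] * 256 + numArray[index2 + 2] * 65536 +
                       numArray[index2 + 3] * 16777216;
            }
            // Samples start after the 4-byte chunk ID and the 4-byte chunk size.
            var index3 = index1 + 8;
            // Assumes 16-bit samples: 2 bytes per sample per channel.
            var length = (numArray.Length - index3) / 2;
            if (num1 == 2)
                length /= 2;
            left = new double[length];
            right = num1 != 2 ? null : new double[length];
            var index4 = 0;
            // Bound the loop by the sample count so a trailing odd byte cannot overrun the arrays.
            while (index4 < length)
            {
                left[index4] = bytesToDouble(numArray[index3], numArray[index3 + 1]);
                index3 += 2;
                if (num1 == 2)
                {
                    right[index4] = bytesToDouble(numArray[index3], numArray[index3 + 1]);
                    index3 += 2;
                }
                ++index4;
            }
        }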
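The `bytesToDouble` helper called by openWav is not shown above. A minimal sketch of what such a helper might look like, assuming 16-bit little-endian PCM samples scaled to [-1.0, 1.0) (the wrapper class name `WavSample` is hypothetical, and the real helper may differ):

```csharp
using System;

public static class WavSample
{
    // Hypothetical helper: combine two little-endian bytes into a signed
    // 16-bit sample and scale it to the range [-1.0, 1.0).
    public static double bytesToDouble(byte firstByte, byte secondByte)
    {
        // secondByte is the high byte; the cast to short restores the sign.
        short sample = (short)((secondByte << 8) | firstByte);
        return sample / 32768.0;
    }
}
```

If the file uses a different bit depth (8-bit, 24-bit, or floating-point samples), this conversion and the 2-bytes-per-sample assumption in openWav would both need to change.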

It runs without error, but the result always comes out between 0.6 and 0.8, even when the recording is not my voice.

Can anyone suggest what I am doing wrong, or whether there is another way to do this in C#?

Neel
  • I am pretty sure that cross correlation is not the correct approach for voice authentication. – Daniel Hilgarth Nov 05 '17 at 18:35
  • Then what is your suggestion? @DanielHilgarth – Neel Nov 05 '17 at 18:36
  • I have no idea, this is way out of my area of expertise. The thing is: Cross correlation is used to find one sample in another sample. Some variation is ok, but when you do authentication, you speak so it is different every time. – Daniel Hilgarth Nov 05 '17 at 18:42

0 Answers