Matching two audio files

Question

I want to record a dog bark, save the file, and compare it with several files containing different types of bark (warning bark, crying bark, etc.).

How could I do that comparison in order to get a match? What is the process to follow in this type of app?

Thank you for the tips.

The best bet, I think, would be spectral analysis. Use an FFT (Fast Fourier Transform) to get the spectrum of the bark and compare the spectra. You might be able to define some filters to help the analysis. Have fun. — cliff2310, Jul 28 '12 at 22:53
Thank you @cliff2310. Is there another way to achieve it? Using FFT would mean a lot of time studying and implementing because it's quite complex. — pindleskin, Jul 28 '12 at 23:00
FFT is only the tip of the iceberg here. I have implemented audio fingerprinting that uses FFT in just one part of the process, and it ONLY matches two audibly identical pieces of sound. Your problem is far greater than that. In any case, you'll probably end up using some server-side solution, where your Android device will only GET the audio and send it to the server, which will do the comparison. Investigate Shazam, playkontrol or SoundHound... — Daniel Mošmondor, Jul 28 '12 at 23:21
BTW, you want that 'bark' to be recognizable for different dogs or just one dog? — Daniel Mošmondor, Jul 28 '12 at 23:25
Daniel is correct, there is a lot more work to do. There are lots of FFT routines out there, so you don't have to code that. But what you are trying to do is not easy! If it were, all computers would have voice input. I would follow Daniel's suggestions. — cliff2310, Jul 29 '12 at 00:57
Daniel, it is only for just one single dog. Thank you everyone for the help — pindleskin, Jul 29 '12 at 12:01
@pindleskin did you find anything for matching two audio files? — PriyankaChauhan, Oct 05 '16 at 08:09

Bjorn Roche · Accepted Answer · 2012-07-30T02:10:29.300

There is no simple answer to your problem. However, for starters, you might look into how audio fingerprinting works. This paper, written by the creators of Shazam, is an excellent start:

http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf

I'm not sure how well that approach would work for dog barking, but there are some concepts there that might prove useful.

Another thing to look into is how the FFT works. Here's a tutorial with code that I wrote for pitch tracking, which is one way to use the FFT. You are looking more at how the tone and pitch interact with the formant structure of a given dog. So parameters you'll want to derive might include the fundamental pitch (which, alone, might be enough to distinguish whining from other kinds of barks) and the ratio of the fundamental pitch to higher harmonics, which would help identify how aggressive the bark is (I'm guessing a bit here):

http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
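To make the two parameters above concrete, here is a minimal sketch in plain Python (standard library only). It is not the code from the tutorial: the function names, the 100 to 2000 Hz search range, and the choice of the 2nd to 4th harmonics are all assumptions made for illustration, and taking the strongest bin as the fundamental is a crude estimate that will need refinement on real recordings.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(samples):
    """Hann-window the frame and return magnitudes for bins 0 .. n/2 - 1."""
    n = len(samples)
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    return [abs(c) for c in fft(windowed)[: n // 2]]


def bark_features(samples, sample_rate, min_hz=100.0, max_hz=2000.0):
    """Return (fundamental_hz, harmonic_ratio) for one frame of audio.

    min_hz and max_hz are assumed search limits, not measured values.
    """
    mags = magnitude_spectrum(samples)
    hz_per_bin = sample_rate / len(samples)
    lo = max(1, int(min_hz / hz_per_bin))
    hi = min(len(mags) - 1, int(max_hz / hz_per_bin))
    # Crude pitch estimate: the strongest bin inside the search range.
    peak = max(range(lo, hi + 1), key=lambda k: mags[k])
    if mags[peak] == 0.0:
        raise ValueError("frame is silent in the search range")
    # Magnitude at the 2nd to 4th harmonics relative to the fundamental.
    harmonics = sum(mags[h * peak] for h in (2, 3, 4) if h * peak < len(mags))
    return peak * hz_per_bin, harmonics / mags[peak]
```

You would call `bark_features(frame, sample_rate)` on a power-of-two-sized frame taken from the loudest part of each recording, then store the resulting pair per file for later comparison.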

Finally, you might want to do some research into basic speech recognition and speech processing, as there will be some overlap. Wikipedia will probably be enough to get you started.

EDIT: oh, also, once you've identified some parameters to use for comparison, you'll need a way to compare the parameters of a new recording against those of the sounds in your database. I don't think the techniques in the Shazam article will work. One thing you could try is logistic regression. There are other options, but this is probably the simplest.
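As a rough illustration of the logistic regression idea, here is a small sketch in plain Python that learns to separate two bark types from two features per recording. The feature values below are made up for the example (pitch in kHz and a harmonic ratio), and the learning rate and epoch count are arbitrary assumptions; real data would need more recordings and probably feature scaling.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Probability that feature vector x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Hypothetical training data: (fundamental pitch in kHz, harmonic ratio).
# Label 1 = whine, label 0 = warning bark. These numbers are invented.
training_features = [(0.9, 0.2), (1.0, 0.3), (0.8, 0.25),
                     (0.3, 0.8), (0.4, 0.9), (0.35, 0.7)]
training_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(training_features, training_labels)
# Probability that a new recording with these features is a whine.
whine_probability = predict_proba(weights, bias, (0.95, 0.2))
```

With more than two bark types, one simple option is to train one such model per type (one-vs-rest) and pick the type whose model gives the highest probability.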

score 1 · Answer 2 · edited Apr 01 '22 at 22:21

I'd check out the open-source musicg library, hosted on Google Code: http://code.google.com/p/musicg/

It's written in Java, so it works on Android, and it gives similarity metrics for two audio files.

However, it only supports .wav files.

answered Sep 11 '12 at 22:24 by Karim Varela · edited Apr 01 '22 at 22:21 by NeroVero

Can you suggest a link or code sample for integrating this kind of feature? I am also facing the same problem. – Chirag Solanki Mar 30 '17 at 08:07
