How does Shazam / SoundHound work?

Question

I am interested in how Shazam or SoundHound works.

How does it record a voice and make a fingerprint that is so similar to the fingerprints in its database that it finds a match?

I am about to start writing some software in C/C++ but am not sure which libraries to use; I have seen there is a Speech SDK from Microsoft. Do you have any suggestions on where to start 'understanding' the process of analyzing voice and playing with it?

I would be thankful for every tip or idea you share :]

I'm also interested in these algorithms, though you should decide whether you want to know what the algorithm is or which libraries to use. These are two different questions, and the second one qualifies the question for closing, I think (the answer is "the one which suits your needs"). – Spook, Mar 29 '13 at 11:49
Well, I agree to untag C/C++. I would like to know if someone has some knowledge of spectrograms, acoustic fingerprints and so on, and could explain in a few sentences which way to go :] – Wiggler Jtag, Mar 29 '13 at 11:53

score 3 · Answer 1 · answered Mar 29 '13 at 12:56

There are some existing pieces of software you could look at:

AudioDB: C++

Mercurial repo: http://code.soundsoftware.ac.uk/projects/audiodb/

Sonic Visualiser: SV Libraries are written in C++ using Qt4

https://code.soundsoftware.ac.uk/projects/sonic-visualiser
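
For the 'understanding' part of the question, here is a minimal sketch of the building block behind a spectrogram: the magnitude spectrum of one windowed frame of audio, and the strongest frequency bin in it. A spectrogram is a sequence of such frames over time, and publicly described acoustic fingerprinting approaches typically build their fingerprints from the strongest peaks. This is not code from any of the projects above, and it is not claimed to be what Shazam or SoundHound actually run. The function names (`magnitude_spectrum`, `strongest_bin`, `sine_frame`) are made up for illustration, and the naive DFT is only there to keep the example dependency-free; a real program would likely use an FFT library.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: magnitude spectrum of one Hann-windowed frame,
// computed with a naive O(N^2) DFT. Returns bins 0 .. N/2.
std::vector<double> magnitude_spectrum(const std::vector<double>& frame) {
    assert(frame.size() >= 2);  // the window formula needs at least 2 samples
    const std::size_t n = frame.size();
    const double pi = std::acos(-1.0);
    std::vector<double> mags(n / 2 + 1);
    for (std::size_t k = 0; k <= n / 2; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            // Hann window reduces leakage from the frame edges.
            const double w = 0.5 - 0.5 * std::cos(2.0 * pi * t / (n - 1));
            const double angle = -2.0 * pi * k * t / n;
            sum += w * frame[t] *
                   std::complex<double>(std::cos(angle), std::sin(angle));
        }
        mags[k] = std::abs(sum);
    }
    return mags;
}

// Hypothetical helper: index of the bin with the largest magnitude.
std::size_t strongest_bin(const std::vector<double>& mags) {
    std::size_t best = 0;
    for (std::size_t k = 1; k < mags.size(); ++k) {
        if (mags[k] > mags[best]) best = k;
    }
    return best;
}

// Hypothetical helper: synthetic test signal, a pure sine tone.
std::vector<double> sine_frame(double freq_hz, double sample_rate_hz,
                               std::size_t n) {
    const double pi = std::acos(-1.0);
    std::vector<double> frame(n);
    for (std::size_t t = 0; t < n; ++t) {
        frame[t] = std::sin(2.0 * pi * freq_hz * t / sample_rate_hz);
    }
    return frame;
}
```

Calling `strongest_bin(magnitude_spectrum(sine_frame(440.0, 8000.0, 256)))` picks the bin nearest the tone; bin `k` corresponds to roughly `k * sample_rate / frame_size` Hz. Repeating this on overlapping frames of a recording gives the time/frequency peaks that a fingerprint could be derived from.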

answered Mar 29 '13 at 12:56

hyponym

There are also a lot of other projects there, e.g. **CAMEL** (Content-based Audio and Music Extraction Library), an easy-to-use C++ framework developed for content-based audio and music analysis. The framework provides a set of tools for easy Segmentation, Feature Extraction, Domain Extraction, etc. https://code.soundsoftware.ac.uk/projects/camel – hyponym Mar 29 '13 at 13:00
