
As a follow-up to my previous question: I want my smartphone application to detect a certain musical note. I only need to know whether the incoming sound is that note or not, with a certain amount of fuzziness to allow the note to be off-key by x cents.

Given that, is any one method superior to the others for speed and accuracy? That is, if you know that the note you are looking for is, say, C#3, how best to tell whether that note is present? I'm assuming that looking for a single note would be easier than separating out all the waveforms and then looking at the results for the fundamental frequency.

In the responses to my original question, one respondent suggested that autocorrelation might work well if you know that the notes are within a certain range. I wonder if autocorrelation would then work even better, if you only have to check for the presence or absence of a certain note (+/- x cents).

The methods I am considering are:

  • Kiss FFT
  • FFTW
  • Discrete Wavelet Transform (DWT)
  • autocorrelation
  • zero crossing analysis
  • octave-spaced filters

Any thoughts would be appreciated.

mahboudz
  • Can you describe the problem in more detail? You're going to be listening with the microphone and activate something when it hears a specific tone? Or are you trying to write a guitar tuner? Or are you trying to write a music transcriber? Is the tone going to be produced by a human voice, an instrument, a transmitter that you also control? Does it need to be a specific wave shape (sine, square), or anything with the right frequency? etc etc – endolith Oct 14 '09 at 21:26
  • I want to control my software with musical notes played by a (any I hope) musical instrument, or even possibly hummed. – mahboudz Oct 14 '09 at 22:31
  • Ahh. Well, identifying a specific pitch (and not one of its harmonics or subharmonics) is not trivial. A trumpet, for instance, has harmonics that are stronger than the fundamental. http://cnx.org/content/m15456/latest/sub_concept-trumpet-spectrum.png But a lot of work has already been done for you. Just search for "pitch estimation", I guess. – endolith Oct 15 '09 at 16:10

1 Answer

As you describe it, you just need to determine if a particular pitch is present. A very simple (fast) detector would just record the equivalent of one period of the waveform, then record another period and correlate them, like an oversimplified (single-lag) autocorrelation. If there's a high match, you know the waveform being recorded is repeating at around the same period, or a harmonic of it.

For instance, to detect 1 kHz, record 1 ms of audio (48 samples at 48 kHz), then record another 1 ms, and compare them (correlate = multiply the samples pairwise and sum the products). If they line up (correlation above some threshold), then you're listening to 1 kHz, 2 kHz, 3 kHz, or some other multiple. Comparing several periods would give you more confidence in the match.
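The 1 kHz example can be sketched in a few lines of Python. This is only a minimal illustration, not a tuned detector: the names `period_correlation` and `sine` are made up for this sketch, the input is a synthetic sine wave rather than microphone audio, and the normalization to the range -1 to 1 (so a fixed threshold does not depend on loudness) is an assumption added here.

```python
import math

def period_correlation(samples, period):
    """Correlate the first `period` samples with the next `period` samples.

    Returns a value in [-1, 1]; the normalization is an assumption of this
    sketch so that a threshold does not depend on signal level.
    """
    a = samples[:period]
    b = samples[period:2 * period]
    dot = sum(x * y for x, y in zip(a, b))  # multiply pairwise and sum
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def sine(freq_hz, sample_rate, n):
    """Synthetic test tone (hypothetical helper, stands in for recorded audio)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

SAMPLE_RATE = 48_000
PERIOD = SAMPLE_RATE // 1000  # 48 samples = 1 ms = one period of 1 kHz

match = period_correlation(sine(1000, SAMPLE_RATE, 2 * PERIOD), PERIOD)
harmonic = period_correlation(sine(2000, SAMPLE_RATE, 2 * PERIOD), PERIOD)
miss = period_correlation(sine(1500, SAMPLE_RATE, 2 * PERIOD), PERIOD)
```

Here `match` and `harmonic` both come out close to 1, which shows the ambiguity between 1 kHz and its multiples, while `miss` is negative because a 1.5 kHz tone is half a cycle out of step after 1 ms.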

A true autocorrelation would tell you which harmonic, specifically, if that's important to you.
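To illustrate how scanning many lags resolves the harmonic ambiguity, here is a rough sketch that returns the shortest lag at which the signal matches itself. The function names, the fixed comparison window, the 0.9 threshold, and the "first local peak above threshold" rule are all assumptions of this sketch, and the tones are synthetic.

```python
import math

def correlation_at_lag(samples, lag, window):
    """Normalized correlation of samples[0:window] with samples[lag:lag+window]."""
    a = samples[:window]
    b = samples[lag:lag + window]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def fundamental_period(samples, min_lag, max_lag, window, threshold=0.9):
    """Return the smallest lag that is a local correlation peak above threshold.

    Needs len(samples) >= window + max_lag + 1. Returns None if nothing matches.
    """
    prev = correlation_at_lag(samples, min_lag - 1, window)
    cur = correlation_at_lag(samples, min_lag, window)
    for lag in range(min_lag, max_lag + 1):
        nxt = correlation_at_lag(samples, lag + 1, window)
        if cur >= threshold and cur >= prev and cur > nxt:
            return lag
        prev, cur = cur, nxt
    return None

def tone(sample_rate, n, partials):
    """Sum of sines; `partials` is a list of (frequency_hz, amplitude) pairs."""
    return [sum(amp * math.sin(2 * math.pi * f * i / sample_rate)
                for f, amp in partials) for i in range(n)]

SAMPLE_RATE = 48_000
N, WINDOW = 480, 240  # 10 ms of audio, 5 ms comparison window

lag_1k = fundamental_period(tone(SAMPLE_RATE, N, [(1000, 1.0)]), 10, 100, WINDOW)
lag_2k = fundamental_period(tone(SAMPLE_RATE, N, [(2000, 1.0)]), 10, 100, WINDOW)
# 1 kHz fundamental with a second harmonic twice as strong
lag_rich = fundamental_period(
    tone(SAMPLE_RATE, N, [(1000, 0.5), (2000, 1.0)]), 10, 100, WINDOW)
```

Dividing the sample rate by the returned lag gives the estimated pitch, so a 2 kHz tone (lag 24) is no longer confused with 1 kHz (lag 48). Real instrument or voice recordings are noisier than these test tones, so the threshold and lag range would need tuning.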

endolith
  • This sounds like a fast way to do it, but I would like to test for any of 50 or so notes over 3 or 4 octaves. Actually, I would like to have some level of "fuzziness" as set by the user, so that the notes could be off by some number of cents. Does that mean it might be better to just do an FFT and look at the resultant frequencies, rather than use autocorrelation? – mahboudz Oct 14 '09 at 22:38
  • Autocorrelation would be better, I think, since it matches the entire wave shape. With FFT you need to identify which of the maxima corresponds with the fundamental frequency of the wave. For large autocorrelations (matching low frequencies), you can actually speed up the autocorrelation by doing it via the FFT. :) But I think for low numbers of samples, a "naive" implementation can be fast. – endolith Oct 15 '09 at 16:05
  • And the "fuzziness" is built-in. If you're looking for 100 Hz and the wave is 98 Hz, it will still match, just not as well. – endolith Oct 15 '09 at 16:44
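For a user-set tolerance in cents, the arithmetic can be sketched as follows. This assumes equal temperament with A4 = 440 Hz and MIDI note numbering in which C#3 is note 49; the function names are hypothetical and not part of any library mentioned in the thread.

```python
import math

A4_HZ = 440.0    # assumed tuning reference
A4_MIDI = 69     # MIDI note number of A4

def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number."""
    return A4_HZ * 2 ** ((note - A4_MIDI) / 12)

def cents_off(measured_hz, target_hz):
    """Signed distance from target in cents (100 cents = 1 semitone)."""
    return 1200 * math.log2(measured_hz / target_hz)

def is_note(measured_hz, target_hz, tolerance_cents):
    """True if the measured pitch is within the tolerance of the target."""
    return abs(cents_off(measured_hz, target_hz)) <= tolerance_cents

def lag_range(target_hz, tolerance_cents, sample_rate):
    """Autocorrelation lags (in samples) covering target +/- tolerance."""
    ratio = 2 ** (tolerance_cents / 1200)
    shortest = math.floor(sample_rate / (target_hz * ratio))
    longest = math.ceil(sample_rate * ratio / target_hz)
    return shortest, longest

c_sharp_3 = midi_to_hz(49)           # about 138.59 Hz
error_98_vs_100 = cents_off(98, 100)  # roughly 35 cents flat
lags = lag_range(c_sharp_3, 50, 48_000)
```

A measured pitch from an autocorrelation lag (sample rate divided by lag) can be passed to `is_note`, or the autocorrelation search can be restricted to `lag_range` so only lags inside the tolerance are tested.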