Note Onset Detection using Spectral Difference

Question

I'm fairly new to onset detection. I have read some papers about it and understand that working only in the time domain can produce a large number of false positives and false negatives, and that it is generally advisable to work either in the frequency domain or in both the time and frequency domains.

Regarding this, I am a bit confused: I have trouble seeing how the spectral energy or the FFT bin values can be used to determine note onsets. Because, aren't note onsets represented by sharp peaks in amplitude?

Can someone enlighten me on this? Thank you!

Adam Hess · Answer 1 · 2022-11-14T16:12:07.607

This is the easiest way to think about note onset:

Think of a music signal as a flat, constant signal. When an onset occurs, you can look at it as a large, rapid CHANGE in the signal (a positive or negative peak).

What this means in the frequency domain:

The spectrum of a steady signal is, well, CONSTANT! It stays flat from one analysis frame to the next.

When the onset event occurs, there is a rapid increase in spectral content.

You may think "Well, you're actually talking about the peak of the onset, right?" Not at all. We are not actually interested in the peak of the onset, but rather in the rising edge of the signal. When there is a sharp increase in the signal, the high-frequency content increases.
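
To illustrate the point about the rising edge, here is a minimal sketch (not from the linked papers) that compares the high-frequency energy of a frame holding a steady tone with a frame in which the same tone starts abruptly halfway through. The 440 Hz tone, the 8 kHz sample rate, the 2 kHz cutoff and the helper name `high_band_energy` are all assumptions chosen for illustration.

```python
import numpy as np

def high_band_energy(frame, sample_rate, cutoff_hz):
    """Sum of squared FFT magnitudes at or above cutoff_hz for one Hann-windowed frame."""
    windowed = frame * np.hanning(len(frame))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return float(np.sum(magnitudes[freqs >= cutoff_hz] ** 2))

sample_rate = 8000          # Hz, assumed for this example
n = 1024                    # frame length in samples
t = np.arange(n) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

steady = tone.copy()        # the note sounds for the whole frame
onset = tone.copy()
onset[: n // 2] = 0.0       # silence, then the note starts abruptly mid-frame

steady_hf = high_band_energy(steady, sample_rate, cutoff_hz=2000.0)
onset_hf = high_band_energy(onset, sample_rate, cutoff_hz=2000.0)
print(steady_hf, onset_hf)
```

Both frames contain the same 440 Hz tone, yet the frame with the abrupt start should show far more energy above 2 kHz, because the sudden edge spreads energy across the spectrum.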

One way to do this is to use the spectral difference function:

Take your time-domain signal and cut it up into overlapping strips (typically 50% overlap).
Apply a Hamming/Hann window to each strip. This reduces spectral smearing (leakage). Remember that cutting the signal into strips is like multiplying it by a rectangular pulse, which in the frequency domain is like convolving its spectrum with a sinc function.
Apply the FFT algorithm to two successive windows.
For each DFT bin, calculate the difference between the magnitudes of the Xn and Xn-1 bins. If the difference is negative, set it to zero. Square the results and sum all the bins together.
Repeat until the end of the signal.
Look for peaks in the resulting detection function using median thresholding, and there are your onset times!
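
The steps above can be sketched roughly as follows with NumPy. This is an illustrative sketch under stated assumptions, not the code from the linked papers: the function names, the frame size of 1024 with a hop of 512 (50% overlap), the median length of 9 frames, the scale of 1.5 and the offset of 10% of the maximum are all choices made for this example, and the test signal is two synthetic sine "notes" rather than real music.

```python
import numpy as np

def spectral_difference(signal, frame_size=1024, hop=512):
    """Half-wave rectified, squared magnitude difference between successive frames."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    detection = np.zeros(n_frames)
    prev_mag = None
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_size] * window
        mag = np.abs(np.fft.rfft(frame))
        if prev_mag is not None:
            diff = mag - prev_mag
            diff[diff < 0] = 0.0              # keep only increases in each bin
            detection[i] = np.sum(diff ** 2)  # square and sum over all bins
        prev_mag = mag
    return detection

def pick_onsets(detection, median_len=9, scale=1.5, delta=None):
    """Indices of local maxima that exceed a moving-median threshold."""
    if delta is None:
        delta = 0.1 * np.max(detection)       # assumed fixed offset, tune per signal
    half = median_len // 2
    onsets = []
    for i in range(1, len(detection) - 1):
        lo, hi = max(0, i - half), min(len(detection), i + half + 1)
        threshold = delta + scale * np.median(detection[lo:hi])
        is_peak = detection[i] > detection[i - 1] and detection[i] >= detection[i + 1]
        if is_peak and detection[i] > threshold:
            onsets.append(i)
    return onsets

# Synthetic test signal: silence, a 440 Hz note at 0.5 s, a 660 Hz note at 1.25 s.
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
signal = np.zeros_like(t)
signal[4000:10000] = np.sin(2 * np.pi * 440.0 * t[4000:10000])
signal[10000:] = np.sin(2 * np.pi * 660.0 * t[10000:])

detection = spectral_difference(signal)
onset_frames = pick_onsets(detection)
# Report each onset at the centre of its frame, in seconds.
onset_times = [(i * 512 + 512) / sample_rate for i in onset_frames]
print(onset_times)
```

The time resolution is limited by the hop size, so the reported times are only accurate to within roughly one frame. On real recordings the threshold parameters usually need tuning per instrument.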

Source:

https://adamhess.github.io/Onset_Detection_Nov302011.pdf

and http://www.elec.qmul.ac.uk/people/juan/Documents/Bello-TSAP-2005.pdf

hotpaw2 · Accepted Answer · 2011-06-24T15:30:28.950

You can look at sharp differences in amplitude at a specific frequency as suspected sound onsets. For instance, if a flute switches from playing a G5 to playing a C, there will be a sharp drop in the amplitude of the spectrum at around 784 Hz.

If you don't know what frequency to examine, the magnitude of an FFT vector will give you the amplitude of every frequency over some window in time (with a resolution dependent on the length of the time window). Pick your frequency, or a bunch of frequencies, and diff two FFTs of two different time windows. That might give you something that can be used as part of a likelihood estimate for a sound onset or change somewhere between the two time windows. Sliding the windows or successive approximation of their location in time might help narrow down the time of a suspected note onset or other significant change in the sound.
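
As a rough sketch of diffing two FFTs at chosen frequencies, the example below uses pure sine tones as stand-ins for the flute: a 784 Hz tone (G5) followed by a 1046.5 Hz tone. The original does not say which C is meant, so C6 is an assumption here, as are the helper name `magnitude_at`, the 8 kHz sample rate and the 1024-sample windows.

```python
import numpy as np

def magnitude_at(frame, sample_rate, freq_hz):
    """FFT magnitude of a Hann-windowed frame at the bin nearest freq_hz."""
    windowed = frame * np.hanning(len(frame))
    magnitudes = np.abs(np.fft.rfft(windowed))
    bin_index = int(round(freq_hz * len(frame) / sample_rate))
    return float(magnitudes[bin_index])

sample_rate = 8000
frame_size = 1024
t = np.arange(2 * frame_size) / sample_rate

# First window: G5 (784 Hz). Second window: assumed C6 (1046.5 Hz).
before = np.sin(2 * np.pi * 784.0 * t[:frame_size])
after = np.sin(2 * np.pi * 1046.5 * t[frame_size:])

# Difference of the two FFT magnitudes at each frequency of interest.
change_784 = magnitude_at(after, sample_rate, 784.0) - magnitude_at(before, sample_rate, 784.0)
change_1046 = magnitude_at(after, sample_rate, 1046.5) - magnitude_at(before, sample_rate, 1046.5)
print(change_784, change_1046)
```

A large negative change at 784 Hz together with a large positive change at 1046.5 Hz suggests that something changed between the two windows. Sliding the windows in smaller steps would then help narrow down when.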

score 1 · Answer 3 · answered Mar 15 '12 at 15:36

"Because, aren't note onsets represented by sharp peaks in amplitude?" A: Not always. On percussive instruments (including piano) this is true, but for violin, flute, etc. notes often "slide" into each other as frequency changes without sharp amplitude increases. If you stick to a single instrument like the piano onset detection is do-able. Generalized onset detection is a much more difficult problem. There are about a dozen primitive features that have been used for onset detection. Once you code them, you still have to decide how best to use them.
