(This is in in Python, and code would be great, but I'm primarily interested in the algorithm.)
I'm monitoring an audio stream (PyAudio) and looking for a series of 5 pops (see the bottom for a visualization). I'm read()ing the stream and getting the RMS value for the block that I've just read (similar to this question). My problem is that I'm not looking for a single event, but instead a series of events (pops) that have some characteristics but aren't nearly as boolean as I'd like. What's the most straightforward (and performant) way to detect these five pops?
The RMS function gives me a stream like this:
0.000580998485254, 0.00045098391298, 0.00751436443973, 0.002733730043, 0.00160775708652, 0.000847808804511
It looks a bit more useful if I round (a similar stream) for you:
0.001, 0.001, 0.018, 0.007, 0.003, 0.001, 0.001
You can see the pop in item 3, and presumably as it quiets down in item 4, and maybe the tail end was during a fraction of item 5.
I want to detect 5 of those in a row.
My naive approach is to: a) define what a pop is: Block's RMS is over .002. For at least 2 blocks but no more than 4 blocks. Started with silence and ends with silence.
Additionally, I'm tempted to define what silence is (to ignore the not quite loud but not quite silent blocks, but I'm not sure this makes more sense then considering 'pop' to be boolean).
b) Then have a state machine that keeps track of a bunch of variables and has a bunch of if statements. Like:
while True:
is_pop = isRMSAmplitudeLoudEnoughToBeAPop(stream.read())
if is_pop:
if state == 'pop':
#continuation of a pop (or maybe this continuation means
#that it's too long to be a pop
if num_pop_blocks <= MAX_POP_RECORDS:
num_pop_blocks += 1
else:
# too long to be a pop
state = 'waiting'
num_sequential_pops = 0
else if state == 'silence':
#possible beginning of a pop
state = 'pop'
num_pop_blocks += 1
num_silence_blocks = 0
else:
#silence
if state = 'pop':
#we just transitioned from pop to silence
num_sequential_pops += 1
if num_sequential_pops == 5:
# we did it
state = 'waiting'
num_sequential_pops = 0
num_silence_blocks = 0
fivePopsCallback()
else if state = 'silence':
if num_silence_blocks >= MAX_SILENCE_BLOCKS:
#now we're just waiting
state = 'waiting'
num_silence_blocks = 0
num_sequential_pops = 0
That code is not at all complete (and might have a bug or two), but illustrates my line of thinking. It's certainly more complex than I'd like it to be, which is why I'm asking for suggestions.