
I have some audio data loaded in a numpy array and I wish to segment the data by finding silent parts, i.e. parts where the audio amplitude is below a certain threshold over a period of time.

An extremely simple way to do this is something like this:

import re

values = ''.join("1" if abs(x) < SILENCE_THRESHOLD else "0" for x in samples)
pattern = re.compile('1{%d,}' % int(MIN_SILENCE))
for match in pattern.finditer(values):
    # code goes here

The code above finds stretches of at least MIN_SILENCE consecutive elements whose absolute value is below SILENCE_THRESHOLD.

Now, obviously, the above code is horribly inefficient and a terrible abuse of regular expressions. Is there some other method that is more efficient, but still results in equally simple and short code?

pafcu

8 Answers


Here's a numpy-based solution.

I think (?) it should be faster than the other options. Hopefully it's fairly clear.

However, it does require twice as much memory as the various generator-based solutions. As long as you can hold a single temporary copy of your data in memory (for the diff), plus a boolean array of the same length as your data (one byte per element in numpy), it should be pretty efficient...

import numpy as np

def main():
    # Generate some random data
    x = np.cumsum(np.random.random(1000) - 0.5)
    condition = np.abs(x) < 1
    
    # Print the start and stop indices of each region where the absolute 
    # values of x are below 1, and the min and max of each of these regions
    for start, stop in contiguous_regions(condition):
        segment = x[start:stop]
        print(start, stop)
        print(segment.min(), segment.max())

def contiguous_regions(condition):
    """Finds contiguous True regions of the boolean array "condition". Returns
    a 2D array where the first column is the start index of the region and the
    second column is the end index."""

    # Find the indices of changes in "condition"
    d = np.diff(condition)
    idx, = d.nonzero() 

    # We need to start things after the change in "condition". Therefore, 
    # we'll shift the index by 1 to the right.
    idx += 1

    if condition[0]:
        # If the start of condition is True prepend a 0
        idx = np.r_[0, idx]

    if condition[-1]:
        # If the end of condition is True, append the length of the array
        idx = np.r_[idx, condition.size]

    # Reshape the result into two columns
    idx.shape = (-1,2)
    return idx

main()
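
For the question's actual use case, the returned regions can then be screened for a minimum length (a sketch reusing the question's SILENCE_THRESHOLD and MIN_SILENCE names, which are assumed to be defined):

condition = np.abs(samples) < SILENCE_THRESHOLD
for start, stop in contiguous_regions(condition):
    if stop - start >= MIN_SILENCE:
        segment = samples[start:stop]  # a silent stretch of at least MIN_SILENCE samples
        # code goes here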
Joe Kington
  • This results in an impressive 20x speedup! It doesn't take into account the minimum length, but that's fairly easy to add. Only problem is the increased memory usage, which makes it infeasible to use in some situations, so I think I will use this by default and add an option to use another algorithm when low on memory. – pafcu Dec 21 '10 at 05:46
  • With numpy 1.9, I get a `DeprecationWarning: numpy boolean subtract (the binary - operator) is deprecated` using np.diff on the boolean condition. I replaced this line with `d = np.subtract(condition[1:], condition[:-1], dtype=np.float)` to avoid the issue. – daryl Sep 29 '14 at 15:30
  • @daryl - Thanks for noticing the change! It might be clearer to do `d = np.diff(condition.astype(int))` instead, though that's mostly a matter of personal preference. – Joe Kington Sep 29 '14 at 19:10
  • Update on the deprecation: `numpy.diff` got specific defined behaviour for bools, which keeps this working, and it no longer shows a warning. So it appears safe to use the code in the answer again without the modifications from comments. Discussion here: https://github.com/numpy/numpy/issues/9251 – Thomas K Jun 24 '18 at 17:29
  • Just a heads up - I used part of your code with attribution for an answer to another question - see https://stackoverflow.com/a/58039852/7505395 – Patrick Artner Sep 21 '19 at 11:34
  • By adding the prepend and append arguments to `diff` you can skip the conditional logic, i.e. `np.diff(condition, prepend=0, append=0)` – Le Frite Jun 14 '22 at 21:02
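
Putting that last comment into code, the edge handling in contiguous_regions collapses to a few lines (a sketch, not from the original answer; it needs numpy >= 1.16 for the prepend/append arguments):

import numpy as np

def contiguous_regions(condition):
    # Padding both ends with 0 closes regions that touch the array edges
    d = np.diff(condition, prepend=0, append=0)
    idx, = d.nonzero()
    return idx.reshape(-1, 2)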

There is a very convenient solution to this using scipy.ndimage. For an array:

import numpy as np
import scipy.ndimage

a = np.array([1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0])

which can be the result of a condition applied to another array, finding the contiguous regions is as simple as:

regions = scipy.ndimage.find_objects(scipy.ndimage.label(a)[0])

Then any function can be applied to those regions, e.g.:

[np.sum(a[r]) for r in regions]
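
For the array above this finds the two runs of ones:

>>> regions
[(slice(0, 4, None),), (slice(7, 10, None),)]

and the per-region sums come out as 4 and 3 (worked out by hand).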
Andrzej Pronobis

Slightly sloppy, but simple and fast-ish, if you don't mind using scipy:

from scipy.ndimage import gaussian_filter
sigma = 3
threshold = 1
above_threshold = gaussian_filter(data, sigma=sigma) > threshold

The idea is that quiet portions of the data will smooth down to low amplitude, and loud regions won't. Tune 'sigma' to affect how long a 'quiet' region must be; tune 'threshold' to affect how quiet it must be. This slows down for large sigma, at which point using FFT-based smoothing might be faster.

This has the added benefit that single 'hot pixels' won't disrupt your silence-finding, so you're a little less sensitive to certain types of noise.
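
As a self-contained sketch (the signal and the sigma/threshold values here are made up, and the signal is rectified with abs() first, since smoothing signed audio directly would average it toward zero):

import numpy as np
from scipy.ndimage import gaussian_filter

rate = 8000
data = np.random.uniform(-1, 1, rate)  # one second of synthetic "audio"
data[3000:5000] *= 0.01                # a quiet stretch in the middle

above_threshold = gaussian_filter(np.abs(data), sigma=50) > 0.1
# above_threshold is False inside the quiet stretch and True elsewhere,
# apart from blurring of roughly sigma samples at the boundaries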

Andrew

I haven't tested this, but it should be close to what you are looking for. Slightly more lines of code, but it should be more efficient and more readable, and it doesn't abuse regular expressions :-)

def find_silent(samples):
    num_silent = 0
    start = 0
    for index in range(len(samples)):
        if abs(samples[index]) < SILENCE_THRESHOLD:
            if num_silent == 0:
                start = index
            num_silent += 1
        else:
            if num_silent >= MIN_SILENCE:
                yield samples[start:index]
            num_silent = 0
    if num_silent >= MIN_SILENCE:
        yield samples[start:]

for match in find_silent(samples):
    # code goes here
Andrew Clark
  • Your code looks good, except that if the piece of silence is at the end of samples, then it won't be found. You need to check after the for loop for it. – Justin Peel Dec 20 '10 at 22:48

This should return a list of (start,length) pairs:

def silent_segs(samples, threshold, min_dur):
    start = -1
    silent_segments = []
    for idx, x in enumerate(samples):
        if start < 0 and abs(x) < threshold:
            start = idx
        elif start >= 0 and abs(x) >= threshold:
            dur = idx - start
            if dur >= min_dur:
                silent_segments.append((start, dur))
            start = -1
    # Catch a silent segment that runs to the end of the input
    if start >= 0 and len(samples) - start >= min_dur:
        silent_segments.append((start, len(samples) - start))
    return silent_segments

And a simple test:

>>> s = [-1,0,0,0,-1,10,-10,1,2,1,0,0,0,-1,-10]
>>> silent_segs(s,2,2)
[(0, 5), (9, 5)]
job
  • This seems to be about 25% faster than the regexp-based solution. Nice. Now it only takes 9 minutes :-) – pafcu Dec 20 '10 at 23:23

Another way to do this quickly and concisely:

import pylab as pl

v=[0,0,1,1,0,0,1,1,1,1,1,0,1,0,1,1,0,0,0,0,0,1,0,0]
vd = pl.diff(v)
#vd[i]==1 for 0->1 crossing; vd[i]==-1 for 1->0 crossing
#need to add +1 to indexes as pl.diff shifts to left by 1

i1 = pl.array([i for i in range(len(vd)) if vd[i] == 1]) + 1
i2 = pl.array([i for i in range(len(vd)) if vd[i] == -1]) + 1

#corner cases for the first and the last element
if v[0]==1:
  i1=pl.hstack((0,i1))
if v[-1]==1:
  i2=pl.hstack((i2,len(v)))

Now i1 contains the beginning indices and i2 the end indices of the 1,...,1 regions.
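
To use this for the question, the (start, end) pairs can then be screened for a minimum length (a sketch; MIN_SILENCE is the question's name and is assumed to be defined):

segments = [(a, b) for a, b in zip(i1, i2) if b - a >= MIN_SILENCE]
# each (a, b) is a half-open start/end pair of a long-enough run of 1s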

Brano

@joe-kington I've got about a 20%-25% speed improvement over the np.diff / np.nonzero solution by using argmax instead (see code below; condition is a boolean array):

def contiguous_regions(condition):
    idx = []
    i = 0
    while i < len(condition):
        # Jump to the next True value (argmax returns the first maximum)
        x1 = i + condition[i:].argmax()
        # Jump to the next False value after it (argmin returns the first minimum)
        x2 = x1 + condition[x1:].argmin()
        if x1 == x2:
            if condition[x1]:
                # The rest of the array is all True
                x2 = len(condition)
            else:
                # No True values remain
                break
        idx.append([x1, x2])
        i = x2
    return idx

Of course, your mileage may vary depending on your data.

Besides, I'm not entirely sure, but I guess numpy may optimize argmin/argmax over boolean arrays to stop searching on the first True/False occurrence. That might explain it.

user2154321

I know I'm late to the party, but another way to do this is with 1d convolutions:

np.convolve(sig > threshold, np.ones(cons_samples), 'same') == cons_samples

Where cons_samples is the number of consecutive samples you require above the threshold.
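
For the silence case in the question, the same trick works with the comparison flipped. A small made-up example (note that the == cons_samples test marks samples whose entire surrounding window is quiet, i.e. the interior of each qualifying run rather than its edges):

import numpy as np

sig = np.array([5, 0, 0, 0, 5, 0, 0, 5])
threshold = 1
cons_samples = 3

# True where the full window of cons_samples centred on the sample is quiet
quiet = np.convolve(sig < threshold, np.ones(cons_samples), 'same') == cons_samples
# quiet -> [False, False, True, False, False, False, False, False]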

DankMasterDan