I am writing a Python function to return the loudness of a .wav file. RMS seems to be the best metric for this (see "Detect and record a sound with python"). audioop.rms() does the trick, but I'd like to avoid audioop as a dependency, and I already import numpy. However, I'm not getting the same RMS values, and would appreciate help in understanding what is going on.

The audioop page says that the RMS calculation is just what you'd expect, namely sqrt(sum(S_i^2)/n), where S_i is the i-th sample of the sound. Seems like it's not rocket science.

To use numpy, I first convert the sound to a numpy array, and always see identical min / max, and the same length of the data (so the conversion seems fine).

>>> d = np.frombuffer(data, np.int16)
>>> print (min(d), max(d)), audioop.minmax(data,2)
(-2593, 2749) (-2593, 2749)

but I get very different RMS values, not even ball-park close:

>>> numpy_rms = np.sqrt(sum(d*d)/len(d))
>>> print numpy_rms, audioop.rms(data, 2)
41.708703254716383, 120

The difference between them varies, with no obvious pattern that I can see; e.g., I also get:

63.786714248938772, 402
62.779300661773405, 148

My numpy RMS code gives identical output to the one here: Numpy Root-Mean-Squared (RMS) smoothing of a signal

I don't see where I am going wrong, but something is off. Any help much appreciated.


EDITED / UPDATE:

In case it's useful, here's the code I ended up with. It's not quite as fast as audioop but is still plenty fast, good enough for my purpose. Of note, using np.mean() makes it MUCH faster (~100x) than my version using Python's sum().

def np_audioop_rms(data, width):
    """audioop.rms() using numpy; avoids another dependency for app"""
    #_checkParameters(data, width)
    if len(data) == 0: return None
    fromType = (np.int8, np.int16, np.int32)[width//2]
    d = np.frombuffer(data, fromType).astype(np.float64)  # np.float was an alias of the builtin float and is gone from newer NumPy
    rms = np.sqrt( np.mean(d**2) )
    return int( rms )
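A comment below asks where the sample width comes from. As a minimal sketch, assuming the bytes come from a PCM WAV file read with the standard-library wave module, the width is available from getsampwidth(). The helper name rms_from_frames and the in-memory test file are made up for illustration; they are not part of the original code. Note that, like the function above, this treats 1-byte samples as signed, whereas 8-bit WAV files normally store unsigned samples, so that case would need an offset first.

```python
import io
import wave

import numpy as np


def rms_from_frames(frames, width):
    """Hypothetical helper: RMS of signed little-endian PCM bytes."""
    dtypes = {1: np.int8, 2: "<i2", 4: "<i4"}
    if width not in dtypes:
        raise ValueError("unsupported sample width: %r" % (width,))
    if len(frames) == 0:
        return None
    # Cast to float64 before squaring so the squares cannot wrap around.
    d = np.frombuffer(frames, dtype=dtypes[width]).astype(np.float64)
    return int(np.sqrt(np.mean(d ** 2)))


# Build a tiny 16-bit mono WAV in memory (illustrative data only).
samples = np.array([3000, -3000, 3000, -3000], dtype="<i2")
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(samples.tobytes())

# Read it back; the sample width in bytes comes from the WAV header.
buf.seek(0)
with wave.open(buf, "rb") as r:
    width = r.getsampwidth()
    frames = r.readframes(r.getnframes())

level = rms_from_frames(frames, width)
print(width, level)
```

When reading in chunks, the same width applies to every chunk, as long as each chunk holds a whole number of samples.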
jrgray
  • thanks but no go. using: "d = np.frombuffer(data, np.short)" gives exactly the same behavior. – jrgray Mar 19 '12 at 00:39
  • sorry, I've deleted the comment before I saw the reply. btw, you could use [`window_rms()`](http://stackoverflow.com/questions/8245687/numpy-root-mean-squared-rms-smoothing-of-a-signal) if you use `double` to calculate `np.power(a, 2)` e.g., by changing it to `np.power(a, 2.)`. – jfs Mar 19 '12 at 01:00
  • why can't you use `audioop`? How large is the data? Can you use a C extension? – jfs Mar 19 '12 at 15:15
  • I could use audioop, but doing so would mean packaging another dependency for the application (= larger download, and more hassle when building the release). we already include numpy, and don't need every last microsecond. probably only short sound clips. – jrgray Mar 19 '12 at 17:01
  • audioop is in stdlib or am i missing something? you could use `np.power(np.frombuffer(data, dtype=fromType), 2.0).mean()**.5` to avoid making a copy with `.astype()` (here `power` creates float array). – jfs Mar 20 '12 at 00:41
  • I thought it was in stdlib too, but get an import error (when using the version of python we package, 2.6.6, as packaged). – jrgray Mar 20 '12 at 08:14
  • dumb question, how do you get the data width where data = u.read(8192) – Ossama Dec 20 '17 at 19:55

2 Answers

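Perform the calculation in double precision, as the audioop.rms() code does. With int16 data, d*d is itself computed in int16 and silently wraps around for any sample whose square exceeds 32767, so convert to float before squaring: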

d = np.frombuffer(data, np.int16).astype(np.float64)

Example

>>> import audioop, numpy as np
>>> data = 'abcdefgh'
>>> audioop.rms(data, 2)
25962
>>> d = np.frombuffer(data, np.int16)
>>> np.sqrt((d*d).sum()/(1.*len(d)))
80.131142510262507
>>> d = np.frombuffer(data, np.int16).astype(np.float64)
>>> np.sqrt((d*d).sum()/len(d))
25962.360851817772
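To make the cause visible, here is a small sketch that squares two int16 samples both ways. The two values are just the min and max reported in the question, used as illustrative data; the variable names are made up.

```python
import numpy as np

# Two samples taken from the min/max shown in the question.
d = np.array([-2593, 2749], dtype=np.int16)

# The element-wise product stays int16, so each square is reduced
# modulo 2**16 and reinterpreted as a signed 16-bit value.
wrapped = d * d

# Casting to float64 first keeps the true squares.
exact = d.astype(np.float64) ** 2

print(wrapped.dtype, wrapped)
print(exact)

# RMS computed from the exact squares.
rms = np.sqrt(exact.mean())
print(rms)
```

The wrapped squares bear no simple relation to the true ones, which would explain why the mismatch with audioop.rms() shows no obvious pattern.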
jfs

From matplotlib.mlab:

def rms_flat(a):
    """
    Return the root mean square of all the elements of *a*, flattened out.
    """
    return np.sqrt(np.mean(np.absolute(a)**2))
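One caveat, shown as a hedged sketch: if the array passed in is still int16, np.absolute(a)**2 is evaluated in int16 as well, so the squares can wrap around just as in the question. The function is redefined locally here rather than imported, since matplotlib.mlab may not provide it in current releases; the sample values are invented for illustration.

```python
import numpy as np


def rms_flat(a):
    """Local copy of the function quoted above."""
    return np.sqrt(np.mean(np.absolute(a) ** 2))


# Illustrative 16-bit samples whose squares do not fit in int16.
d = np.array([3000, -3000], dtype=np.int16)

# Squares are computed in int16 and wrap around before the mean is taken.
rms_int16 = rms_flat(d)

# Casting to float64 first gives the expected value.
rms_float = rms_flat(d.astype(np.float64))

print(rms_int16, rms_float)
```

So for raw 16-bit audio the float conversion is still needed before calling it.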
endolith