
I am trying to measure the lead-in and lead-out of a WAV file, preferably the first and last 5 seconds or so. I am basically trying to assign a numerical value that means 'This song has a slow lead-in' or 'This song has an abrupt end'.

My thinking has been to get the slope of the dB values, but I can't seem to find a Linux command-line tool that will give me dB values. I know they can be measured because Audacity has a Waveform (dB) view.

[Image: Audacity Waveform (dB) view]

Basically I'm looking for a way to gather the data points to duplicate this graph so I can get the slope.

EDIT - working in Java

Adam Schmidt
jgreen
  • I agree with Isaac's answer that you probably won't find any command-line tools to do this, but the problem is pretty ill-defined. Let me emphasize that you don't necessarily need dB values to determine if a song has an abrupt or slow start or end. In fact, the graph in Audacity does not use dB. There are plenty of answers on SO about drawing the waveforms in Java, as well as, I believe, C and JavaScript. – Bjorn Roche Nov 06 '12 at 04:29
  • @BjornRoche if the graph isn't using dB then why do they call it Waveform (dB)? (This is in addition to their other views: Waveform, Spectrogram, Spectrogram log(f), and Pitch (EAC).) If getting the values of a regular waveform is easier I'll look into that. – jgreen Nov 06 '12 at 18:39
  • Notice that the units are in dB, but they aren't evenly spaced. – Bjorn Roche Nov 06 '12 at 19:53
  • @BjornRoche but isn't that just because dB is logarithmic? – jgreen Nov 06 '12 at 23:33
  • Going with @BjornRoche's suggestion that dB values aren't the only relevant values, I started looking at getting amplitude. I'm still early in the process, but I've found that 'sox' is able to report the max amplitude in its 'stat' call, so I think I can use that to get what I want by looking at small 0.25-second cuts of the beginning and end of the track. – jgreen Nov 06 '12 at 23:53
  • The graph is linear, only the axis on the left is logarithmic. – Bjorn Roche Nov 07 '12 at 14:46

3 Answers


I don't know of any command-line tools to do this, but writing a Python script with this functionality is fairly simple using the SciPy and NumPy libraries.

We can use scipy.io.wavfile to do the file IO, and then calculate the dB values ourselves (note that these are levels relative to the raw sample values, not standard dB SPL values, since the actual sound pressure depends on your speakers and volume settings).

First we get the file:

from scipy.io.wavfile import read
samprate, wavdata = read('file.wav')

We then split the file into chunks, where the number of chunks depends on how finely you want to measure the volume:

import numpy as np
numchunks = 20  # example value only; pick it based on how finely you want to measure
chunks = np.array_split(wavdata, numchunks)

Finally, we compute the volume of each chunk:

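# RMS level of each chunk in dB; cast to float so squaring 16-bit samples cannot overflow
dbs = [20*np.log10( np.sqrt(np.mean(chunk.astype(np.float64)**2)) ) for chunk in chunks]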

where dbs is now a list of dB values (again, not necessarily the true SPL sound levels) for each chunk of your file.
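To make the "not true SPL" caveat a bit more concrete, here is a minimal sketch that reports each chunk relative to digital full scale (dBFS) instead. The function name rms_dbfs, the full_scale default of 32768 (which assumes 16-bit PCM samples), and the -120 dB floor for silent chunks are all assumptions made for illustration, not part of the answer above.

```python
import numpy as np

def rms_dbfs(chunk, full_scale=32768.0, floor_db=-120.0):
    """RMS level of a chunk in dB relative to full scale (dBFS).

    full_scale=32768.0 assumes 16-bit PCM input; floor_db is an
    arbitrary value returned for silent chunks.
    """
    # Work in float so that squaring integer samples cannot overflow
    samples = np.asarray(chunk, dtype=np.float64) / full_scale
    rms = np.sqrt(np.mean(samples ** 2))
    if rms <= 0.0:
        return floor_db  # digital silence: log10(0) is undefined
    return max(floor_db, float(20.0 * np.log10(rms)))

# A constant signal at half of full scale sits roughly 6 dB below full scale
half_scale = np.full(1000, 16384, dtype=np.int16)
print(rms_dbfs(half_scale))
```

With this scale, 0 dBFS is the loudest level the file can represent and everything else is negative, which makes values comparable across files regardless of playback volume.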

You can also easily split up the data in a different way, using overlapping chunks, etc.
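As a rough sketch of the overlapping-chunks idea, the function below slides a fixed-size window over the samples and advances it by a smaller hop. The name windowed_db and its window and hop parameters are hypothetical, and the code assumes a mono (1-D) sample array.

```python
import numpy as np

def windowed_db(samples, window, hop):
    """RMS level in dB for windows of `window` samples, advancing by `hop` samples."""
    x = np.asarray(samples, dtype=np.float64)
    levels = []
    for start in range(0, len(x) - window + 1, hop):
        rms = np.sqrt(np.mean(x[start:start + window] ** 2))
        # Clamp silent windows to a tiny value instead of taking log10(0)
        levels.append(20.0 * np.log10(max(rms, 1e-12)))
    return np.array(levels)
```

For example, windowed_db(wavdata, window=samprate // 2, hop=samprate // 4) would give half-second windows with 50% overlap, so each reading shares half of its samples with the previous one and the resulting curve is smoother.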

References: scipy.io.wavfile, dB (SPL)

Isaac
  • I know the OP asked about dB, but don't convert to dB to create a waveform or waveform overview. See, for example: http://stackoverflow.com/questions/11091924/drawing-waveform-converting-to-db-squashes-it – Bjorn Roche Nov 06 '12 at 04:31
  • @Isaac unfortunately I don't know Python. I've edited the question to say that I'm working in Java. If your code sample is complete then I guess I just need to know how to get the scipy.io.wavfile module. – jgreen Nov 06 '12 at 18:35

For those of you trying to use the Python script in @Isaac's answer, I cleaned it up (I don't know Python either, but I got it working). Some notes:

  1. You need Python 3.4+ to use the statistics package (see this excellent article for Mac - https://opensource.com/article/19/5/python-3-default-mac)
  2. You need scipy - on a Mac it's: pip install scipy

from scipy.io.wavfile import read
import numpy as np
import math
import statistics

samprate, wavdata = read('intro.wav')
# Convert to float so that squaring 16-bit samples cannot overflow
wavdata = wavdata.astype(float)
# Basically taking a reading every half a second - the size of the data
# divided by the sample rate gives us 1 second chunks, so I chop the
# sample rate in half for half second chunks (this assumes a mono file)
chunks = np.array_split(wavdata, int(wavdata.size / (samprate / 2)))
# Note: math.log10 raises ValueError if a chunk is completely silent
dbs = [20*math.log10( math.sqrt(statistics.mean(chunk**2)) ) for chunk in chunks]
print(dbs)
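To get from a list of readings like this to the single number the question asks about, one option is a least-squares line fit over the first or last few seconds. This is only a sketch: the function name level_slope and the sample readings are made up for illustration, and silent chunks would need to be dropped or clamped before fitting.

```python
import numpy as np

def level_slope(dbs, seconds_per_reading):
    """Least-squares slope of a sequence of dB readings, in dB per second."""
    t = np.arange(len(dbs)) * seconds_per_reading
    slope, _intercept = np.polyfit(t, dbs, 1)
    return float(slope)

# Hypothetical half-second readings for a fade-in that rises 3 dB per reading
fade_in = [-60.0, -57.0, -54.0, -51.0, -48.0, -45.0, -42.0, -39.0, -36.0, -33.0]
print(level_slope(fade_in, 0.5))  # about 6 dB per second
```

With half-second readings, the first 5 seconds are dbs[:10] and the last 5 seconds are dbs[-10:]. A clearly positive slope at the start suggests a gradual lead-in, and a clearly negative slope at the end suggests a fade-out. A slope near zero at either end suggests the track starts or stops at roughly full level, though what counts as "near zero" would have to be tuned on real tracks.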
William Neely

Here are just a few of the questions on SO about reading audio files and drawing waveforms and waveform overviews in Java.

How can I create a sound file in Java

How can I draw sound data from my wav file?

Java Program to create a PNG waveform for an audio file

Community
Bjorn Roche