I lack a background in acoustics, but need to work on a data-science project in acoustics.
Please help me understand how to correctly interpret what the amplitude of a waveform represents, which units it is measured in, and possibly how to set the correct sampling rate for analysis.
Consider this example.
I have a waveform file of an animal, recorded at a sampling rate of 250000 Hz.
You can listen to it here:
https://www.whyp.it/tracks/75747/bat-120614013233718915?token=Lmt6M (original audio)
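import librosa
# sr=None keeps the file's native sampling rate (the default would resample to 22050 Hz)
data, rate = librosa.load('my_file.wav', sr=None)
# data is a numpy array of floats
# rate is 250000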
I have read that amplitude can be expressed in decibels or in volts; in a 16-bit WAV file it is stored as 16-bit integers ranging from -32768 to 32767, where 0 represents no sound (silence).
I load the file with librosa, so the amplitude should be normalised to the range [-1, 1].
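A minimal sketch of the normalisation assumed here, using made-up sample values rather than the real recording: 16-bit integers divided by 32768 give floats in [-1, 1). The helper name `int16_to_float` is hypothetical and is not a librosa function; the exact scaling librosa applies internally is an assumption.

```python
import numpy as np

def int16_to_float(samples):
    """Scale 16-bit PCM integers to floats in [-1.0, 1.0)."""
    pcm = np.asarray(samples, dtype=np.int16)
    return pcm.astype(np.float32) / 32768.0

# Made-up samples: full negative scale, about -0.4, silence, about +0.4, full positive scale
pcm = np.array([-32768, -13107, 0, 13107, 32767], dtype=np.int16)
print(int16_to_float(pcm))
```

The float values are still a linear amplitude scale: they are the stored integers divided by a constant, not decibels.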
When I plot the data, the y-axis spans roughly -0.4 to 0.4 at the extremes.
If I extract a segment whose amplitude is close to 0 (see any interval between the red lines, above) and plot it, the y-axis now ranges between -0.008 and +0.006.
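import matplotlib.pyplot as plt
from IPython.display import Audio
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(1, 4))
ax.plot(segment, color='black')  # segment is a slice of data
plt.show()
Audio(data=segment, rate=192000)  # played back at 192000 Hz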
Yet both the whole file and the segments are perfectly audible. I expected not to hear anything in segments whose amplitude is close to zero.
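As a rough sketch of why a peak of 0.008 is not the same as silence, the two peak values quoted above can be put on a logarithmic scale relative to full scale (amplitude 1.0), commonly written dBFS. The function names are hypothetical. The second helper shows the gain a peak-normalising player would apply; as far as the IPython documentation states, `Audio` rescales array input to the maximum range by default (`normalize=True`), which would make a quiet segment play about as loudly as the whole file.

```python
import math

def peak_dbfs(peak):
    """Level of a peak amplitude relative to full scale (1.0), in dB."""
    return 20.0 * math.log10(abs(peak))

def gain_to_full_scale(peak):
    """Factor a peak-normalising player would apply to reach amplitude 1.0."""
    return 1.0 / abs(peak)

# Peak values taken from the two plots described in the post
for name, peak in [("whole file", 0.4), ("segment", 0.008)]:
    print(f"{name}: {peak_dbfs(peak):.1f} dBFS, gain x{gain_to_full_scale(peak):.1f}")
```

With these numbers the segment sits roughly 34 dB below the loudest part of the file, which is quiet but well above zero.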
In both cases, to hear anything I set the playback rate to 192000 Hz, which appears to be the maximum my browser supports (I am running Jupyter in a local browser).
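Note that passing `rate=192000` to `Audio` does not resample the data; it only tells the player how fast to step through the same samples. A small sketch of the consequence, assuming a one-second clip and a purely illustrative 40 kHz tone (neither figure comes from the recording; the function name is hypothetical):

```python
def playback_effect(native_rate, playback_rate, n_samples, tone_hz):
    """Duration and perceived frequency when samples are played at a different rate."""
    duration_native = n_samples / native_rate      # seconds at the recorded rate
    duration_played = n_samples / playback_rate    # seconds at the playback rate
    perceived_hz = tone_hz * playback_rate / native_rate
    return duration_native, duration_played, perceived_hz

# 250000 samples recorded at 250000 Hz, played at 192000 Hz
native, played, hz = playback_effect(250_000, 192_000, n_samples=250_000, tone_hz=40_000)
print(native, played, hz)
```

Playing at a lower rate than the recording rate stretches the clip in time and lowers every frequency by the same ratio, which changes pitch but not amplitude.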
Now, a few questions, because I feel I am missing some basic concepts:
- What is the unit of the y-axis of a waveform in WAV format: decibels? Volts?
- What is the relation between amplitude and volume: why can I hear sound when its amplitude is around 0?