
I would like to extract a single one-dimensional vector from a soundtrack that simply represents its "volume" or "intensity" (I am not sure about the terminology) at each point in time.

Taking for example an available sample:

wget https://freewavesamples.com/files/Ensoniq-ESQ-1-Sympy-C4.wav

And converting it to mono:

ffmpeg -i Ensoniq-ESQ-1-Sympy-C4.wav -acodec pcm_s16le -ac 1 -ar 44100 audio_test.wav

I gathered from a related Q&A thread this way to visualize the sound wave:

from scipy.io.wavfile import read
import matplotlib.pyplot as plt

input_data = read("audio_test.wav")
audio = input_data[1]

plt.plot(audio)
plt.ylabel("Amplitude")
plt.xlabel("Time (samples)")
plt.title("Sample Wav")
plt.show()

simple wave plot

The "positive" and "negative" sides are quite symmetrical, but not completely. Is there a way to merge them into a single "positive" line? If so, how can I extract such data points from the audio variable?

Thanks very much for your help!

sereizam
  • It seems that what you would like to detect is the *envelope* of the signal, specifically the positive envelope. An example using scipy is given here: https://stackoverflow.com/questions/34235530/python-how-to-get-high-and-low-envelope-of-a-signal You could easily write some code to create the positive envelope and then display it. (Not marking as duplicate since there is a question of terminology.) – anerisgreat Nov 06 '19 at 10:09
  • Very helpful, thanks, I added my solution below. – sereizam Nov 06 '19 at 11:55

1 Answer


Following the advice of @anerisgreat and a colleague, I reached this solution, which averages the absolute amplitude over each one-second block (it makes more sense on a bigger audio sample):

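wget https://file-examples.com/wp-content/uploads/2017/11/file_example_WAV_10MG.wav
ffmpeg -i file_example_WAV_10MG.wav -acodec pcm_s16le -ac 1 -ar 44100 audio_test.wav

Then, in Python:

import numpy as np
from scipy.io.wavfile import read
import matplotlib.pyplot as plt

def positive_enveloppe(wav_dat):
    # wav_dat is the (sample_rate, samples) tuple returned by scipy.io.wavfile.read
    freq = wav_dat[0]
    # Cast to float first: the absolute value of -32768 overflows in int16
    pts = np.absolute(wav_dat[1].astype(np.float64))
    # One value per second of audio, plus one for a trailing partial second
    pos_env = np.zeros(len(pts) // freq + int(bool(len(pts) % freq)))

    env_idx, pts_idx = 0, 0
    while pts_idx < len(pts):
        # Mean absolute amplitude over the current one-second block
        sub_ar = pts[pts_idx:pts_idx+freq]
        mov_avg = np.mean(sub_ar)
        pos_env[env_idx] = mov_avg
        pts_idx += freq
        env_idx += 1

    return pos_env

input_data = read("audio_test.wav")
enveloppe_data = positive_enveloppe(input_data)
plt.plot(enveloppe_data)  # x axis is in seconds, one point per block
plt.show()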

Yielding:

positive envelope
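
One point per second can be too coarse for short clips such as the first sample in the question. A possible variation, given here only as a minimal sketch, does the same block averaging without an explicit loop and lets the block length be chosen. The function name `windowed_envelope` and its parameters `window_s` and `mode` are hypothetical names introduced for illustration; the RMS option is an alternative loudness measure and is not part of the solution above.

```python
import numpy as np

def windowed_envelope(samples, rate, window_s=0.05, mode="rms"):
    """Return one loudness value per block of `window_s` seconds (hypothetical helper)."""
    # Work in float: abs() or squaring raw int16 samples can overflow.
    x = np.asarray(samples, dtype=np.float64)
    # Number of samples per block, at least one.
    win = max(1, int(round(rate * window_s)))
    # Ceiling division, so a trailing partial block gets its own value.
    n_windows = -(-len(x) // win)
    # Pad with NaN so the partial block is averaged over real samples only.
    padded = np.full(n_windows * win, np.nan)
    padded[:len(x)] = x
    frames = padded.reshape(n_windows, win)
    if mode == "rms":
        # Root mean square of each block.
        return np.sqrt(np.nanmean(frames ** 2, axis=1))
    # Otherwise: mean absolute amplitude, as in the loop-based version.
    return np.nanmean(np.abs(frames), axis=1)
```

With the tuple returned by `read("audio_test.wav")`, this would be called as `windowed_envelope(input_data[1], input_data[0])`. Calling it with `window_s=1.0, mode="mean"` should reproduce the one-value-per-second result, and the matching time axis in seconds is `np.arange(len(env)) * window_s`.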

sereizam