
I have generated audio with Python, yet it ends like this:

[image]

How can I remove such loud random noise using Python (librosa/pydub)?

In other words, how do I detect where the noise starts and ends? We know the noise is at the end of the recording, so the question is how to find where it starts and cut it out in Python.

DuckQueen
  • You need to deamplify the sound, there are modules like [pydub](https://stackoverflow.com/questions/43679631/python-how-to-change-audio-volume) that can do this. –  Jul 18 '21 at 22:17
  • `from pydub import AudioSegment`, `song = AudioSegment.from_mp3("your_song.mp3")`, then do `quieter_song = song - 3`, finally to export `quieter_song.export("louder_song.mp3", format='mp3')` –  Jul 18 '21 at 22:24
  • I want to remove the noise part - not reduce the overall volume... – DuckQueen Jul 18 '21 at 22:33
  • "I have generated audio" what datatype do you have? array of pcm data? numpy array? wav file? etc? – Aaron Jul 19 '21 at 04:01
  • wav file loaded as np array – DuckQueen Jul 19 '21 at 04:25
  • You are asking the wrong question. If you are generating the sound file, the question is: "How do I keep from generating random noise at the end of my function?" – RufusVS Jul 19 '21 at 04:25
  • @RufusVS nope - DNN generates that audio file so my hands are tied – DuckQueen Jul 19 '21 at 04:26
  • Not knowing what DNN is, I don't think I can help. People could provide better help if there were some kind of code and data set that they could reproduce this problem. – RufusVS Jul 19 '21 at 04:32
  • If your neural net model is generating noise the best you can do is simply truncate the audio. How you truncate is really up to you. – fdcpp Jul 19 '21 at 12:28
  • @fdcpp: that is my main question: how to detect where noise starts and ends (we know that there is noise at the end of recording - so the question is how to find where it starts and cut it out)? – DuckQueen Jul 19 '21 at 12:37
  • You may want to tweak the question a little. A slightly more accurate title would perhaps be, _How to detect distortion / noise in audio?_. That question is then probably better posed at dsp.stackexchange.com. The second part is, _how to truncate audio?_. It would be a good idea to make sure you know how to do this regardless of noise. Mock up a script / function that will take in an audio file, truncate it after a specified duration, then save a new audio file. Your task is then to get a sample index from your noise detection and use that value to truncate with your new function. – fdcpp Jul 19 '21 at 12:50
  • For compound problems like these, you'll find the responses on SO lacking, as that is not really its remit. It is a good idea to try and break up your problem first to avoid the community doing any heavy lifting. – fdcpp Jul 19 '21 at 12:51
  • What I don't understand is that you are obviously quite seasoned using SO and programming in general. The above advice really shouldn't need to be said at this point – fdcpp Jul 19 '21 at 12:55
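
The truncation step suggested in the comments can be sketched with the standard-library `wave` module. This is only an illustration: `truncate_wav` is a hypothetical helper name, and it assumes the file is an uncompressed PCM WAV (the `wave` module does not read floating-point or compressed WAV files).

```python
import wave

def truncate_wav(src_path, dst_path, n_frames):
    """Copy the first n_frames sample frames of a PCM WAV file to dst_path.

    Hypothetical helper for illustration. Returns the number of frames kept.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Clamp the request to the range [0, frames available in the file].
        n_keep = max(0, min(n_frames, params.nframes))
        frames = src.readframes(n_keep)
    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width and sample rate from the source.
        dst.setparams(params)
        # writeframes() corrects the frame count in the header on close.
        dst.writeframes(frames)
    return n_keep
```

To cut after a duration in seconds, multiply it by the file's sample rate to get `n_frames`; a sample index produced by a noise detector can be passed directly.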

1 Answer


It seems that simple thresholding would work in your case. To avoid cutting prematurely, we require at least k consecutive values to exceed the threshold before truncating the audio.

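```python
import librosa
import numpy as np

def first_occ_index(w, n):
    """Return the start index of the first run of at least n consecutive False values in w."""
    # Borrowed from https://stackoverflow.com/questions/49693770/get-index-of-the-first-block-of-at-least-n-consecutive-false-values-in-boolean-a
    idx = np.flatnonzero(np.r_[True, w, True])  # positions of True in the padded array
    lens = np.diff(idx) - 1  # lengths of the False runs between consecutive True values
    long_enough = lens >= n
    if not long_enough.any():
        return len(w)  # no qualifying run found: keep the whole signal
    return idx[long_enough.argmax()]

X, fs = librosa.load('your.audio')  # resamples to 22050 Hz mono unless sr=None is passed
threshold = 3 * X.std()  # or e.g. 0.6 * X.max() - play with it
X_th = np.abs(X) < threshold  # True where the signal is below the threshold
k = 20  # we require 20 consecutive values at or above the threshold
idx_to_cut = first_occ_index(X_th, k)
my_audio = X[:idx_to_cut]
garbage = X[idx_to_cut:]
```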
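
If the noise does not produce long runs of consecutive loud samples (white noise crosses zero often), a frame-energy variant may be more robust. The following is a minimal sketch on a synthetic signal, not tested on your data. `find_noise_start` and its parameters `frame_len`, `ratio` and `min_frames` are hypothetical names and defaults chosen for illustration. It assumes the noise sits at the very end of the signal and is clearly louder than a typical frame.

```python
import numpy as np

def find_noise_start(x, frame_len=1024, ratio=3.0, min_frames=3):
    """Return the sample index where a trailing loud segment starts.

    Returns len(x) when no trailing loud segment of at least
    min_frames frames is found. Hypothetical helper for illustration.
    """
    n_frames = len(x) // frame_len
    if n_frames == 0:
        return len(x)
    # Split into non-overlapping frames (a partial last frame is ignored).
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # A frame counts as loud when its RMS exceeds ratio times the median RMS.
    loud = rms > ratio * np.median(rms)
    # Walk backwards from the end while the frames stay loud.
    start = n_frames
    while start > 0 and loud[start - 1]:
        start -= 1
    if n_frames - start < min_frames:
        return len(x)
    return start * frame_len

# Synthetic check: 1 s of a quiet tone followed by 0.25 s of loud noise.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = 0.1 * np.sin(2 * np.pi * 220 * t)
noise = rng.uniform(-1.0, 1.0, size=fs // 4)
x = np.concatenate([clean, noise])

cut = find_noise_start(x, frame_len=200)
print(cut)  # 8000 for this synthetic signal: the noise starts at sample 8000
my_audio = x[:cut]
```

The median is used as the reference level so that a short noisy tail does not inflate the threshold. If the noise makes up more than half of the recording, this assumption breaks and a fixed threshold would be a better choice.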
Lukasz Tracewski