I have been researching how to separate the music from an ad so that only the spoken words remain. I have come across several approaches using librosa and pyaudio that suggest applying a high-pass or low-pass filter. I tried this, but the music remained in the ad.

Another approach I would like to dig into is speaker diarization, but I do not yet know how to tackle the problem with it. There are some deep learning architectures available, but they probably cannot differentiate between music and non-music.

Does anyone have a better idea for this?

Cheers, Andi

Andi Maier
  • Have you seen this? https://stackoverflow.com/questions/3673042/algorithm-to-remove-vocal-from-sound-track – Peyman Apr 18 '19 at 11:23
  • Yes, I have, but I need to remove the music and keep the vocals. The methods mentioned in that post only work for removing the vocals. – Andi Maier Apr 18 '19 at 11:46
  • Adding a Butterworth high-pass filter did not change anything :( – Andi Maier Apr 18 '19 at 14:16
  • Your task is not an easy one. You should try several methods to see whether one of them works for you. Look at this one too if you haven't: https://www.researchgate.net/post/Is_there_an_algorithm_to_separate_human_voice_from_background_music_in_a_song – Peyman Apr 18 '19 at 15:24
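
On the filter attempts mentioned above: one likely reason a high-pass or low-pass filter leaves the music in place is that speech and music occupy largely overlapping frequency ranges, so a single cutoff cannot tell them apart. The following is only a rough sketch of that effect using synthetic tones, not real ad audio. The signal frequencies (200 Hz for "voice", 440 Hz for "music", 50 Hz for "rumble"), the 100 Hz cutoff, and the helper names `brickwall_highpass` and `tone_amplitude` are all assumptions made for illustration, and an idealized FFT brick-wall filter stands in for the Butterworth filter.

```python
import numpy as np


def brickwall_highpass(signal, sample_rate, cutoff_hz):
    """Zero every FFT bin below cutoff_hz (an idealized high-pass filter)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


def tone_amplitude(signal, sample_rate, freq_hz):
    """Estimate the amplitude of a sinusoid at freq_hz from the FFT."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    bin_index = int(np.argmin(np.abs(freqs - freq_hz)))
    return 2.0 * np.abs(spectrum[bin_index]) / len(signal)


# Hypothetical stand-ins: one second of three pure tones, not real recordings.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
voice = 1.0 * np.sin(2 * np.pi * 200 * t)   # assumed "speech" component
music = 0.8 * np.sin(2 * np.pi * 440 * t)   # assumed "music" component
rumble = 0.5 * np.sin(2 * np.pi * 50 * t)   # low-frequency noise
mix = voice + music + rumble

filtered = brickwall_highpass(mix, sample_rate, cutoff_hz=100.0)

# Compare each component's amplitude before and after filtering.
for name, freq in [("rumble", 50), ("voice", 200), ("music", 440)]:
    before = tone_amplitude(mix, sample_rate, freq)
    after = tone_amplitude(filtered, sample_rate, freq)
    print(f"{name:>6} at {freq} Hz: before={before:.2f}, after={after:.2f}")
```

In this toy setup the filter removes only the rumble; the "music" tone passes through at full amplitude because it sits above the cutoff together with the "voice" tone. Real music and speech overlap far more than two pure tones do, which fits the observation that the filter did not change anything.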

0 Answers