Algorithm to remove vocal from audio file

Question

I know this question was already asked more than 10 years ago, but I want to believe that some progress has been made since then (we have deepfakes nowadays, so AI has clearly come a long way).

I tried some tutorials with Audacity but was disappointed with the result (to be fair, the output is not that bad, but it is not good enough for production).

What reputable algorithm could I use to process an MP3 file myself and remove the vocals, including vocal echo, while preserving the drums and centered instruments?

Are you looking for a ready-made software or an algorithm to implement in code? — Leo Aimone, Mar 28 '21 at 16:03
I think you will likely get the best response at https://dsp.stackexchange.com/ or from DAFX papers, e.g. http://www.dafx.de/paper-archive/2013/papers/40.dafx2013_submission_7.pdf — fdcpp, Mar 28 '21 at 16:44
@fdcpp that's interesting, but it is a paper from 2013. Nothing since then? — Antonin GAVREL, Mar 28 '21 at 19:38
That was just a cherry-picked example; a little more searching may yield some recent papers — fdcpp, Mar 28 '21 at 19:40

score 0 · Answer 1 · answered Apr 04 '21 at 11:37

This task is known in the community as "Vocal Source Separation", "Vocal Signal Separation", or "Singing Voice Source Separation". These are specialized cases of "Music Source Separation", which is itself an instance of the more general "Source Separation" task.

For papers, search for "Music Source Separation". One of the most actively developed open source solutions is Spleeter, which has been used commercially in various audio products. There is an online tool based on it that you can try at Splitter.ai. The "2 stem" version will give you one track with vocals and one track with everything else.
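If you would rather run Spleeter locally than use the online tool, here is a minimal sketch of the 2-stem split from Python. It assumes Spleeter is installed (`pip install spleeter`) with ffmpeg available, and that the pretrained model can be downloaded on first use. The helper names `expected_stems` and `remove_vocals` are mine, and the output layout (`<output_dir>/<track name>/accompaniment.wav`) is what I understand Spleeter's default naming to be, so verify it against your installed version.

```python
from pathlib import Path


def expected_stems(input_path, output_dir):
    """Paths the 2-stem model should write, assuming Spleeter's default naming."""
    # Assumption: default layout is <output_dir>/<track name>/<stem>.wav
    track_dir = Path(output_dir) / Path(input_path).stem
    return {
        "vocals": track_dir / "vocals.wav",
        "accompaniment": track_dir / "accompaniment.wav",
    }


def remove_vocals(input_path, output_dir="output"):
    """Split a track with the pretrained 2-stem model; return the instrumental path."""
    # Imported lazily so this module still loads where Spleeter is not installed.
    from spleeter.separator import Separator

    separator = Separator("spleeter:2stems")  # vocals + accompaniment
    separator.separate_to_file(str(input_path), str(output_dir))
    return expected_stems(input_path, output_dir)["accompaniment"]


# Usage ("song.mp3" is a placeholder file name):
#     instrumental = remove_vocals("song.mp3")
```

The `accompaniment` stem is the track with the vocals removed. Whether it is clean enough for production depends on the material, so listen for leftover vocal reverb before relying on it.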
