
I am trying to enhance an audio file (3:16 minutes long, available here) using SpeechBrain. When I run the code below (from this tutorial), I get the following error:

OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 39.59 GiB total capacity; 33.60 GiB already allocated; 3.19 MiB free; 38.06 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.

What is the recommended way to fix this? Should I just cut the audio file into pieces?

from speechbrain.pretrained import SepformerSeparation as separator
import torchaudio

model = separator.from_hparams(source="speechbrain/sepformer-wham-enhancement",         
    savedir='pretrained_models/sepformer-wham-enhancement', run_opts={"device":"cuda"})

est_sources = model.separate_file(path=audio_file) 

torchaudio.save("enhanced_wham.wav", est_sources[:, :, 0].detach().cpu(), 8000)
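For reference, the "cut the audio into pieces" idea could look roughly like the sketch below. It is only an illustration under assumptions, not a confirmed fix: `chunk_bounds` and `enhance_in_chunks` are hypothetical helper names, the 10-second chunk length is arbitrary, and `mix` is assumed to be a mono waveform of shape [1, time] already resampled to 8 kHz. It relies on `separate_batch` returning a tensor of shape [batch, time, sources], and it wraps inference in `torch.no_grad()` since gradients are not needed here.

```python
def chunk_bounds(num_samples, chunk_len):
    """Return (start, end) sample indices that cover [0, num_samples)."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    return [
        (start, min(start + chunk_len, num_samples))
        for start in range(0, num_samples, chunk_len)
    ]


def enhance_in_chunks(model, mix, sample_rate=8000, chunk_seconds=10):
    """Enhance `mix` (assumed shape [1, time]) one chunk at a time.

    Hypothetical helper: each chunk is processed independently, so there
    may be audible discontinuities at the chunk boundaries.
    """
    import torch  # imported here so chunk_bounds works without torch

    pieces = []
    with torch.no_grad():  # inference only, no autograd graph is kept
        for start, end in chunk_bounds(mix.shape[1], chunk_seconds * sample_rate):
            est = model.separate_batch(mix[:, start:end])
            # Keep the first (enhanced) source and move it off the GPU
            pieces.append(est[:, :, 0].cpu())
    return torch.cat(pieces, dim=1)
```

With the 3:16 file at 8 kHz this would give 20 chunks of at most 80,000 samples each, and the result could be written out with `torchaudio.save("enhanced_wham.wav", enhanced, 8000)`. Whether the seams between chunks are acceptable, or whether overlapping chunks with cross-fading are needed, is an open question.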
albus_c
