
I have large wav files (~4 GB each). My deployment server limits each process to 500 MB of RAM, so I want to load and process only chunks of an audio file at a time and then append the processed chunks to an output file, much as I would with a text file.

I looked into pydub, but it seems to load the entire file before I can cut out a smaller chunk to process (correct me if I'm wrong). The same is true of scipy.io.wavfile.read. I want to be able to read chunks of the large files, process them, and write them back (ideally appending to the previously processed chunks on the hard drive).

Most of the available SO answers that I could find already assume that I can load the large file into main memory.

How to split a .wav file into multiple .wav files?

Reading *.wav files in Python

DaveIdito

1 Answer


There are a few packages you may want to look into for handling audio: soundfile is commonly used for I/O, as is librosa. The 'sampling rate' (also called 'frame rate') is the number of audio samples per second; it is commonly written in kHz, but software usually expects it in Hz.
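Because everything below works in frames rather than seconds, it may help to see the arithmetic spelled out. This is a minimal sketch: the helper names `seconds_to_frames` and `frames_per_chunk` and the example figures (44.1 kHz, stereo, 16-bit PCM, a 64 MiB chunk) are assumptions for illustration, not part of any library.

```python
def seconds_to_frames(seconds, samplerate):
    """Convert a time offset in seconds to a whole frame index."""
    return int(round(seconds * samplerate))


def frames_per_chunk(budget_bytes, channels, bytes_per_sample):
    """Largest number of frames whose raw PCM data fits in budget_bytes."""
    # One frame holds one sample for every channel.
    return budget_bytes // (channels * bytes_per_sample)


# Hypothetical figures: 44.1 kHz, stereo, 16-bit PCM, 64 MiB per chunk.
start = seconds_to_frames(1.5, 44100)         # 66150
chunk = frames_per_chunk(64 * 1024**2, 2, 2)  # 16777216
```

Note that a library may decode samples to a wider type (for example 64-bit floats), so the in-memory size of a chunk can be several times its size on disk; leave headroom when picking a chunk size for a 500 MB limit.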

There's also a dedicated Sound Design StackExchange which you may find more fruitful to search.

Moving the read position to a particular point in a file is known as 'seeking', and the soundfile.SoundFile class supports it, so a section can be read without loading the rest of the file.

The idea is to move the 'cursor' to a particular frame with SoundFile.seek(pos), then read some frames with SoundFile.read(n_frames). Reading advances the cursor by that many frames, and SoundFile.tell() returns its current position.

Here's an example of accessing a part of a wav file:

import soundfile as sf

def read_audio_section(filename, start_time, stop_time):
    # The context manager closes the file even if reading fails.
    with sf.SoundFile(filename) as track:
        if not track.seekable():
            raise ValueError("Not compatible with seeking")

        sr = track.samplerate
        # seek() and read() expect whole frame counts, so convert to int.
        start_frame = int(sr * start_time)
        frames_to_read = int(sr * (stop_time - start_time))
        track.seek(start_frame)
        audio_section = track.read(frames_to_read)
    return audio_section, sr

...and to write that to a file, use soundfile.write (note: this is a function in the package, not a method of the soundfile.SoundFile class):

def extract_as_clip(input_filename, output_filename, start_time, stop_time):
    # Read only the requested section, then write it out as its own file.
    audio_extract, sr = read_audio_section(input_filename, start_time, stop_time)
    sf.write(output_filename, audio_extract, sr)
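The question also asks about processing a whole file chunk by chunk and appending the results on disk. Here is a minimal sketch of that read, process, append loop. To keep it runnable without extra dependencies it swaps in the standard library's wave module instead of soundfile, and it assumes 16-bit PCM input; the function name `process_in_chunks`, the gain "processing" step, and the default chunk size are all hypothetical placeholders.

```python
import array
import sys
import wave


def process_in_chunks(src_path, dst_path, gain=0.5, chunk_frames=65536):
    """Stream a 16-bit PCM WAV file through a gain stage, chunk by chunk."""
    with wave.open(src_path, "rb") as src, wave.open(dst_path, "wb") as dst:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        # Copy channel count, sample width and frame rate to the output.
        dst.setnchannels(src.getnchannels())
        dst.setsampwidth(src.getsampwidth())
        dst.setframerate(src.getframerate())
        while True:
            raw = src.readframes(chunk_frames)  # at most chunk_frames frames
            if not raw:
                break  # end of file
            samples = array.array("h", raw)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV sample data is little-endian
            # Placeholder processing: scale each sample and clip to int16.
            for i, s in enumerate(samples):
                samples[i] = max(-32768, min(32767, int(s * gain)))
            if sys.byteorder == "big":
                samples.byteswap()
            dst.writeframes(samples.tobytes())  # appended after earlier chunks
```

Only one chunk is held in memory at a time, so peak usage is governed by `chunk_frames` rather than by the file size. The per-sample Python loop is slow for multi-gigabyte files; in practice you would likely replace it with a vectorised NumPy operation. The same loop can be written with soundfile by opening the output as a SoundFile in write mode and calling its write method once per chunk.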
Louis Maddox