Mixing two audio files together with python

Question

I have two WAV files that I want to mix together into one WAV file. Both use the same sample format, etc.

I've been searching Google endlessly.

I would prefer to do it using the wave module in Python.

How can this be done?

score 41 · Answer 1 · answered Dec 08 '12 at 22:13

You can use the pydub library (a light wrapper I wrote around the python wave module in the std lib) to do it pretty simply:

from pydub import AudioSegment

sound1 = AudioSegment.from_file("/path/to/my_sound.wav")
sound2 = AudioSegment.from_file("/path/to/another_sound.wav")

combined = sound1.overlay(sound2)

combined.export("/path/to/combined.wav", format='wav')

Jiaaro

When I tried to do this only the sound that I was overlaying a different sound on was being heard. Is there a way to get around that? – ArjunSahlot Jan 05 '21 at 01:13

score 10 · Answer 2 · answered Oct 28 '10 at 06:41

A python solution which requires both numpy and audiolab, but is fast and simple:

import numpy as np
from scikits.audiolab import wavread

data1, fs1, enc1 = wavread("file1.wav")
data2, fs2, enc2 = wavread("file2.wav")

assert fs1 == fs2
assert enc1 == enc2
result = 0.5 * data1 + 0.5 * data2

If the sampling rates (fs*) or encodings (enc*) differ, you may need some audio processing first (strictly speaking, the asserts are too strong, as wavread can handle some cases transparently).
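
For readers without audiolab, the same sanity check can be sketched with only the standard library's wave module. This is a minimal sketch, not part of the answer above: the helper names wav_format and same_format are made up for illustration, and it only compares channel count, sample width, and frame rate.

```python
import wave

def wav_format(path):
    """Return (channels, sample width in bytes, frame rate) for a WAV file."""
    with wave.open(path, "rb") as w:
        return w.getnchannels(), w.getsampwidth(), w.getframerate()

def same_format(path1, path2):
    """True if both files share channel count, sample width and frame rate."""
    return wav_format(path1) == wav_format(path2)
```

If same_format("file1.wav", "file2.wav") returns False, one of the files probably needs resampling or conversion before the samples are added.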

David Cournapeau

Shouldn't you `assert len(data1) == len(data2)` too? And you could also use `scipy` to read a wave file: `from scipy.io import wavfile` and then `fs, data = wavfile.read("file.wav")`. – Lewistrick Jan 23 '19 at 10:14
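
On the length question raised in the comment above: instead of asserting equal lengths, one option is to pad the shorter signal with silence before averaging. The sketch below is only an illustration under assumptions: it works on flat (mono) sequences of numbers, and the names pad_to_same_length and average_mix are hypothetical.

```python
def pad_to_same_length(samples1, samples2, silence=0):
    """Return copies of both sequences, the shorter one padded with silence."""
    n = max(len(samples1), len(samples2))
    padded1 = list(samples1) + [silence] * (n - len(samples1))
    padded2 = list(samples2) + [silence] * (n - len(samples2))
    return padded1, padded2

def average_mix(samples1, samples2):
    """Average two sample sequences, keeping the length of the longer one."""
    a, b = pad_to_same_length(samples1, samples2)
    return [0.5 * x + 0.5 * y for x, y in zip(a, b)]
```

With this approach the tail of the longer file is kept, at half amplitude, rather than being cut off.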

score 10 · Answer 3 · answered Apr 16 '19 at 07:08

LIBROSA SOLUTION

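import librosa
import librosa.display
from IPython.display import Audio

# audio1 and audio2 are the paths of the two input files.
# librosa.load resamples to 22050 Hz by default, so both rates are equal.
y1, sample_rate1 = librosa.load(audio1, mono=True)
y2, sample_rate2 = librosa.load(audio2, mono=True)

# MERGE: trim to the shorter signal, then average
n = min(len(y1), len(y2))
mix = (y1[:n] + y2[:n]) / 2
librosa.display.waveshow(mix, sr=sample_rate1)  # waveplot in older librosa releases

# REPRODUCE
Audio(mix, rate=sample_rate1)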

score 8 · Answer 4 · answered Dec 30 '18 at 14:40

You guys like numpy, no? Below is a solution that depends on wave and numpy. The 16-bit samples from the two files './file1.wav' and './file2.wav' are added together. It's probably a good idea to apply np.clip to mix before converting back to int16 (not included).

import wave
import numpy as np
# load two files you'd like to mix
fnames =["./file1.wav", "./file2.wav"]
wavs = [wave.open(fn) for fn in fnames]
frames = [w.readframes(w.getnframes()) for w in wavs]
# here's efficient numpy conversion of the raw byte buffers
# '<i2' is a little-endian two-byte integer.
samples = [np.frombuffer(f, dtype='<i2') for f in frames]
samples = [samp.astype(np.float64) for samp in samples]
# mix as much as possible
n = min(map(len, samples))
mix = samples[0][:n] + samples[1][:n]
# Save the result
mix_wav = wave.open("./mix.wav", 'w')
mix_wav.setparams(wavs[0].getparams())
# before saving, we want to convert back to '<i2' bytes:
mix_wav.writeframes(mix.astype('<i2').tobytes())
mix_wav.close()

A great code! It is better to divide the result in mix line: ```mix = (samples[0][:n] + samples[1][:n])/2``` — mohammad, Apr 30 '23 at 15:11
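
The clipping step the answer leaves out, together with the averaging suggested in the comment, might look roughly like the sketch below. It assumes two int16 arrays of equal length (as produced by np.frombuffer with dtype '<i2' and the [:n] slicing above); the function name mix_int16 is made up for illustration.

```python
import numpy as np

def mix_int16(a, b, average=False):
    """Mix two int16 sample arrays of equal length without wrap-around."""
    # Sum in a wider type so the addition itself cannot overflow.
    mix = a.astype(np.float64) + b.astype(np.float64)
    if average:
        mix /= 2  # halves the level, so the result always fits in int16
    # Clamp to the int16 range before converting back.
    mix = np.clip(mix, -32768, 32767)
    return mix.astype('<i2')
```

Plain summing keeps the original loudness but can clip on peaks; averaging never clips but makes each source quieter. Which trade-off is better depends on the material.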

Ponkadoodle · Answer 5 · 2010-10-28T02:45:29.280

This depends heavily on the format the files are in. Here's an example of how to do it assuming 2-byte-wide, little-endian samples:

import wave

w1 = wave.open("/path/to/wav/1")
w2 = wave.open("/path/to/wav/2")

#get samples as a bytes object.
samples1 = w1.readframes(w1.getnframes())
samples2 = w2.readframes(w2.getnframes())

#takes every 2 bytes and groups them together as 1 sample. (b"123456" -> [b"12", b"34", b"56"])
samples1 = [samples1[i:i+2] for i in range(0, len(samples1), 2)]
samples2 = [samples2[i:i+2] for i in range(0, len(samples2), 2)]

#convert samples from bytes to ints
def bin_to_int(sample_bytes):
    #16-bit WAV samples are signed and stored little-endian
    return int.from_bytes(sample_bytes, byteorder="little", signed=True)

samples1 = [bin_to_int(s) for s in samples1] #[b'\x04\x08'] -> [0x0804]
samples2 = [bin_to_int(s) for s in samples2]

#average the samples (integer division keeps them ints):
samples_avg = [(s1+s2)//2 for (s1, s2) in zip(samples1, samples2)]

And now all that's left to do is convert samples_avg back to a bytes object and write that to a file using writeframes from the wave module. That's just the inverse of what we just did, so it shouldn't be too hard to figure out. For your int_to_bin function, you'll probably want to make use of int.to_bytes, which is the opposite of int.from_bytes.
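
For completeness, the remaining step could be sketched as follows. This is one possible way to finish it in Python 3, not the answerer's own code: write_samples is a hypothetical helper, and it assumes every value is a signed 16-bit integer.

```python
import wave

def int_to_bin(value):
    """Pack one signed 16-bit sample as two little-endian bytes."""
    return int(value).to_bytes(2, byteorder="little", signed=True)

def write_samples(path, samples, params):
    """Write a sequence of signed 16-bit ints to a WAV file.

    `params` is what getparams() returns for one of the input files.
    """
    data = b"".join(int_to_bin(s) for s in samples)
    with wave.open(path, "wb") as out:
        out.setparams(params)
        out.writeframes(data)  # also fixes up the frame count in the header
```

Something like write_samples("/path/to/mix.wav", samples_avg, w1.getparams()) would then produce the mixed file, provided samples_avg holds values in the signed 16-bit range.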

Why Average of samples not Max? As far as I understand you can normally hear the loudest amplitude at given time — Evalds Urtans, Jul 20 '18 at 08:32

score 0 · Answer 6 · edited Mar 20 '17 at 10:18

Try the Echo Nest Remix API:

from echonest import audio
from util import *

def mixSound(fname1, fname2, f_out_name):
    # load both inputs
    f1 = audio.AudioData(fname1)
    f2 = audio.AudioData(fname2)

    # mix them and write the result to f_out_name
    f_out = audio.mix(f1, f2)
    f_out.encode(f_out_name, True)

If it complains about codecs, check https://superuser.com/questions/196857/how-to-install-libmp3lame-for-ffmpeg.

answered Mar 24 '12 at 18:03

TTT
