
How can you play multiple (audio) byte arrays simultaneously? Each byte array is recorded by a TargetDataLine and transferred via a server.

What I've tried so far

Using SourceDataLine:

There is no way to play multiple streams using a single SourceDataLine, because the write method blocks until the buffer is written. This problem cannot be fixed using threads, because only one SourceDataLine can write concurrently.

Using the AudioPlayer Class:

ByteInputStream stream2 = new ByteInputStream(data, 0, data.length);
AudioInputStream stream = new AudioInputStream(stream2, VoiceChat.format, data.length);
AudioPlayer.player.start(stream);

This just plays noise on the clients.

EDIT: I don't receive the voice packets at the same time; they are not exactly simultaneous, more "overlapping".


3 Answers


Apparently Java's Mixer interface was not designed for this.

http://docs.oracle.com/javase/7/docs/api/javax/sound/sampled/Mixer.html:

A mixer is an audio device with one or more lines. It need not be designed for mixing audio signals.

And indeed, when I try to open multiple lines on the same mixer, it fails with a LineUnavailableException. However, if all your audio recordings share the same audio format, it's quite easy to mix them together manually. For example, with 2 inputs:

  1. Convert both to the appropriate data type (for example, byte[] for 8-bit audio, short[] for 16-bit, float[] for 32-bit floating point, etc.)
  2. Sum them into another array, making sure the summed values do not exceed the range of the data type.
  3. Convert the output back to bytes and write it to the SourceDataLine.

See also How is audio represented with numbers?
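The three steps above can be sketched like this for 16-bit signed little-endian audio (a minimal example; the class and method names are just for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MixDemo {

    // Mix two equally-sized 16-bit little-endian byte buffers into one.
    // The sum is computed in int and clamped to the short range so loud
    // inputs clip instead of wrapping around.
    public static byte[] mix(byte[] a, byte[] b) {
        // step 1: bytes -> shorts
        short[] sa = new short[a.length / 2];
        short[] sb = new short[b.length / 2];
        ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(sa);
        ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(sb);

        // step 2: sum, keeping the result inside the short range
        short[] mixed = new short[sa.length];
        for (int i = 0; i < mixed.length; i++) {
            int sum = sa[i] + sb[i];
            mixed[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }

        // step 3: shorts -> bytes, ready for SourceDataLine.write
        byte[] out = new byte[a.length];
        ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(mixed);
        return out;
    }
}
```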

Here's a sample mixing down 2 recordings and outputting them as 1 signal, all in 16-bit 48 kHz stereo.

    // imports needed: javax.sound.sampled.*, java.nio.ByteBuffer, java.nio.ByteOrder

    // print all devices (both input and output)
    int i = 0;
    Mixer.Info[] infos = AudioSystem.getMixerInfo();
    for (Mixer.Info info : infos)
        System.out.println(i++ + ": " + info.getName());

    // select 2 inputs and 1 output
    System.out.println("Select input 1: ");
    int in1Index = Integer.parseInt(System.console().readLine());
    System.out.println("Select input 2: ");
    int in2Index = Integer.parseInt(System.console().readLine());
    System.out.println("Select output: ");
    int outIndex = Integer.parseInt(System.console().readLine());

    // ugly java sound api stuff
    try (Mixer in1Mixer = AudioSystem.getMixer(infos[in1Index]);
            Mixer in2Mixer = AudioSystem.getMixer(infos[in2Index]);
            Mixer outMixer = AudioSystem.getMixer(infos[outIndex])) {
        in1Mixer.open();
        in2Mixer.open();
        outMixer.open();
        try (TargetDataLine in1Line = (TargetDataLine) in1Mixer.getLine(in1Mixer.getTargetLineInfo()[0]);
                TargetDataLine in2Line = (TargetDataLine) in2Mixer.getLine(in2Mixer.getTargetLineInfo()[0]);
                SourceDataLine outLine = (SourceDataLine) outMixer.getLine(outMixer.getSourceLineInfo()[0])) {

            // audio format 48 kHz 16 bit stereo (signed little endian)
            AudioFormat format = new AudioFormat(48000.0f, 16, 2, true, false);

            // 4 bytes per frame (16 bit samples stereo)
            int frameSize = 4;
            int bufferSize = 4800;
            int bufferBytes = frameSize * bufferSize;

            // buffers for java audio
            byte[] in1Bytes = new byte[bufferBytes];
            byte[] in2Bytes = new byte[bufferBytes];
            byte[] outBytes = new byte[bufferBytes];

            // buffers for mixing
            short[] in1Samples = new short[bufferBytes / 2];
            short[] in2Samples = new short[bufferBytes / 2];
            short[] outSamples = new short[bufferBytes / 2];

            // how long to record & play
            int framesProcessed = 0;
            int durationSeconds = 10;
            int durationFrames = (int) (durationSeconds * format.getSampleRate());

            // open devices
            in1Line.open(format, bufferBytes);
            in2Line.open(format, bufferBytes);
            outLine.open(format, bufferBytes);
            in1Line.start();
            in2Line.start();
            outLine.start();

            // start audio loop
            while (framesProcessed < durationFrames) {

                // record audio
                in1Line.read(in1Bytes, 0, bufferBytes);
                in2Line.read(in2Bytes, 0, bufferBytes);

                // convert input bytes to samples
                ByteBuffer.wrap(in1Bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(in1Samples);
                ByteBuffer.wrap(in2Bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(in2Samples);

                // mix samples - lower volume by 50% since we're mixing 2 streams
                for (int s = 0; s < bufferBytes / 2; s++)
                    outSamples[s] = (short) ((in1Samples[s] + in2Samples[s]) * 0.5);

                // convert output samples to bytes
                ByteBuffer.wrap(outBytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(outSamples);

                // play audio
                outLine.write(outBytes, 0, bufferBytes);

                framesProcessed += bufferBytes / frameSize;
            }

            in1Line.stop();
            in2Line.stop();
            outLine.stop();
        }
    }
  • "be sure to divide sample values by the number of sources" - usually just a constant is used (e.g. 1/3). That is because audio waves tend to cancel each other and dividing the signal makes it silent. – Pavel Horal Oct 09 '14 at 21:27
  • 1
    Good catch. Replaced by "summed values do not exceed range of the datatype". See also here http://dsp.stackexchange.com/questions/3581/algorithms-to-mix-audio-signals-without-clipping for some further discussion. – Sjoerd van Kreel Oct 10 '14 at 06:21
  • Thank you, but I don't know if I can use something like that, because the client doesn't receive the "voice" packets of every player at the same time :(. I think this makes the problem more complicated, doesn't it? – CreativeMD Oct 11 '14 at 09:21
  • More complicated yes, but you can still use the same principles. Think of it as an ever-present (infinite) audio stream which contains mostly silence but to which you occasionally mix in additional voices as they come in. I'll try and code up an example for you later. – Sjoerd van Kreel Oct 13 '14 at 15:54

Alright, I put something together which should get you started. I'll post the full code below, but first I'll try to explain the steps involved.

The interesting part here is to create your own audio "mixer" class which allows consumers of that class to schedule audio blocks at specific points in the (near) future. The specific-point-in-time part is important here: I'm assuming you receive network voices in packets where each packet needs to start exactly at the end of the previous one in order to play back a continuous sound for a single voice. Also, since you say voices can overlap, I'm assuming (yes, lots of assumptions) a new one can come in over the network while one or more old ones are still playing. So it seems reasonable to allow audio blocks to be scheduled from any thread. Note that only one thread actually writes to the dataline; it's just that any thread can submit audio packets to the mixer.

So for the submit-audio-packet part we now have this:

private final ConcurrentLinkedQueue<QueuedBlock> scheduledBlocks;
public void mix(long when, short[] block) {
    scheduledBlocks.add(new QueuedBlock(when, Arrays.copyOf(block, block.length)));
}

The QueuedBlock class is just used to tag an audio buffer (a short[]) with the "when": the point in time at which the block should be played.

Points in time are expressed relative to the current position of the audio stream. It is set to zero when the stream is created and updated with the buffer size each time an audio buffer is written to the dataline:

private final AtomicLong position = new AtomicLong();
public long position() {
    return position.get();
}

Apart from all the hassle of setting up the data line, the interesting part of the mixer class is obviously where the mixdown happens. Each scheduled audio block falls into one of 3 cases:

  • The block has already been played in its entirety. Remove it from the scheduledBlocks list.
  • The block is scheduled to start at some point in time after the current buffer. Do nothing.
  • (Part of) the block should be mixed down into the current buffer. Note that the beginning of the block may (or may not) have already been played in previous buffer(s). Similarly, the end of the scheduled block may exceed the end of the current buffer, in which case we mix down the first part of it and leave the rest for the next round, until all of it has been played and the entire block is removed.

Also note that there's no reliable way to start playing audio data immediately: when you submit packets to the mixer, be sure to have them start at least the duration of 1 audio buffer from now, otherwise you'll risk losing the beginning of your sound. Here's the mixdown code:

    private static final double MIXDOWN_VOLUME = 1.0 / NUM_PRODUCERS;

    private final List<QueuedBlock> finished = new ArrayList<>();
    private final short[] mixBuffer = new short[BUFFER_SIZE_FRAMES * CHANNELS];
    private final byte[] audioBuffer = new byte[BUFFER_SIZE_FRAMES * CHANNELS * 2];
    private final AtomicLong position = new AtomicLong();

    Arrays.fill(mixBuffer, (short) 0);
    long bufferStartAt = position.get();
    for (QueuedBlock block : scheduledBlocks) {
        int blockFrames = block.data.length / CHANNELS;

        // block fully played - mark for deletion
        if (block.when + blockFrames <= bufferStartAt) {
            finished.add(block);
            continue;
        }

        // block starts after end of current buffer
        if (bufferStartAt + BUFFER_SIZE_FRAMES <= block.when)
            continue;

        // mix in part of the block which overlaps current buffer
        int blockOffset = Math.max(0, (int) (bufferStartAt - block.when));
        int blockMaxFrames = blockFrames - blockOffset;
        int bufferOffset = Math.max(0, (int) (block.when - bufferStartAt));
        int bufferMaxFrames = BUFFER_SIZE_FRAMES - bufferOffset;
        for (int f = 0; f < blockMaxFrames && f < bufferMaxFrames; f++)
            for (int c = 0; c < CHANNELS; c++) {
                int bufferIndex = (bufferOffset + f) * CHANNELS + c;
                int blockIndex = (blockOffset + f) * CHANNELS + c;
                mixBuffer[bufferIndex] += (short)
                    (block.data[blockIndex]*MIXDOWN_VOLUME);
            }
    }

    scheduledBlocks.removeAll(finished);
    finished.clear();
    ByteBuffer
        .wrap(audioBuffer)
        .order(ByteOrder.LITTLE_ENDIAN)
        .asShortBuffer()
        .put(mixBuffer);
    line.write(audioBuffer, 0, audioBuffer.length);
    position.addAndGet(BUFFER_SIZE_FRAMES);

And finally a complete, self-contained sample which spawns a number of threads submitting audio blocks representing sinewaves of random duration and frequency to the mixer (called AudioConsumer in this sample). Replace sinewaves by incoming network packets and you should be halfway to a solution.

package test;

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Line;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.SourceDataLine;

public class Test {

public static final int CHANNELS = 2;
public static final int SAMPLE_RATE = 48000;
public static final int NUM_PRODUCERS = 10;
public static final int BUFFER_SIZE_FRAMES = 4800;

// generates some random sine wave
public static class ToneGenerator {

    private static final double[] NOTES = {261.63, 311.13, 392.00};
    private static final double[] OCTAVES = {1.0, 2.0, 4.0, 8.0};
    private static final double[] LENGTHS = {0.05, 0.25, 1.0, 2.5, 5.0};

    private double phase;
    private int framesProcessed;
    private final double length;
    private final double frequency;

    public ToneGenerator() {
        ThreadLocalRandom rand = ThreadLocalRandom.current();
        length = LENGTHS[rand.nextInt(LENGTHS.length)];
        frequency = NOTES[rand.nextInt(NOTES.length)] * OCTAVES[rand.nextInt(OCTAVES.length)];
    }

    // make sound
    public void fill(short[] block) {
        for (int f = 0; f < block.length / CHANNELS; f++) {
            double sample = Math.sin(phase * 2.0 * Math.PI);
            for (int c = 0; c < CHANNELS; c++)
                block[f * CHANNELS + c] = (short) (sample * Short.MAX_VALUE);
            phase += frequency / SAMPLE_RATE;
        }
        framesProcessed += block.length / CHANNELS;
    }

    // true if length of tone has been generated
    public boolean done() {
        return framesProcessed >= length * SAMPLE_RATE;
    }
}

// dummy audio producer, based on sinewave generator
// above but could also be incoming network packets
public static class AudioProducer {

    final Thread thread;
    final AudioConsumer consumer;
    final short[] buffer = new short[BUFFER_SIZE_FRAMES * CHANNELS];

    public AudioProducer(AudioConsumer consumer) {
        this.consumer = consumer;
        thread = new Thread(() -> run());
        thread.setDaemon(true);
    }

    public void start() {
        thread.start();
    }

    // repeatedly play random sine and sleep for some time
    void run() {
        try {
            ThreadLocalRandom rand = ThreadLocalRandom.current();
            while (true) {
                long pos = consumer.position();
                ToneGenerator g = new ToneGenerator();

                // if we schedule at current buffer position, first part of the tone will be
                // missed so have tone start somewhere in the middle of the next buffer
                pos += BUFFER_SIZE_FRAMES + rand.nextInt(BUFFER_SIZE_FRAMES);
                while (!g.done()) {
                    g.fill(buffer);
                    consumer.mix(pos, buffer);
                    pos += BUFFER_SIZE_FRAMES;

                    // we can generate audio faster than it's played
                    // sleep a while to compensate - this more closely
                    // corresponds to playing audio coming in over the network
                    double bufferLengthMillis = BUFFER_SIZE_FRAMES * 1000.0 / SAMPLE_RATE;
                    Thread.sleep((int) (bufferLengthMillis * 0.9));
                }

                // sleep a while in between tones
                Thread.sleep(1000 + rand.nextInt(2000));
            }
        } catch (Throwable t) {
            System.out.println(t.getMessage());
            t.printStackTrace();
        }
    }
}

// audio consumer - plays continuously on a background
// thread, allows audio to be mixed in from arbitrary threads
public static class AudioConsumer {

    // audio block with "when to play" tag
    private static class QueuedBlock {

        final long when;
        final short[] data;

        public QueuedBlock(long when, short[] data) {
            this.when = when;
            this.data = data;
        }
    }

    // need not normally be so low but in this example
    // we're mixing down a bunch of full scale sinewaves
    private static final double MIXDOWN_VOLUME = 1.0 / NUM_PRODUCERS;

    private final List<QueuedBlock> finished = new ArrayList<>();
    private final short[] mixBuffer = new short[BUFFER_SIZE_FRAMES * CHANNELS];
    private final byte[] audioBuffer = new byte[BUFFER_SIZE_FRAMES * CHANNELS * 2];

    private final Thread thread;
    private final AtomicLong position = new AtomicLong();
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final ConcurrentLinkedQueue<QueuedBlock> scheduledBlocks = new ConcurrentLinkedQueue<>();


    public AudioConsumer() {
        thread = new Thread(() -> run());
    }

    public void start() {
        thread.start();
    }

    public void stop() {
        running.set(false);
    }

    // gets the play cursor. note - this is not accurate and 
    // must only be used to schedule blocks relative to other blocks
    // (e.g., for splitting up continuous sounds into multiple blocks)
    public long position() {
        return position.get();
    }

    // put copy of audio block into queue so we don't
    // have to worry about caller messing with it afterwards
    public void mix(long when, short[] block) {
        scheduledBlocks.add(new QueuedBlock(when, Arrays.copyOf(block, block.length)));
    }

    // better hope mixer 0, line 0 is output
    private void run() {
        Mixer.Info[] mixerInfo = AudioSystem.getMixerInfo();
        try (Mixer mixer = AudioSystem.getMixer(mixerInfo[0])) {
            Line.Info[] lineInfo = mixer.getSourceLineInfo();
            try (SourceDataLine line = (SourceDataLine) mixer.getLine(lineInfo[0])) {
                // note: the buffer size passed to open() is in bytes, not frames
                line.open(new AudioFormat(SAMPLE_RATE, 16, CHANNELS, true, false), BUFFER_SIZE_FRAMES * CHANNELS * 2);
                line.start();
                while (running.get())
                    processSingleBuffer(line);
                line.stop();
            }
        } catch (Throwable t) {
            System.out.println(t.getMessage());
            t.printStackTrace();
        }
    }

    // mix down single buffer and offer to the audio device
    private void processSingleBuffer(SourceDataLine line) {

        Arrays.fill(mixBuffer, (short) 0);
        long bufferStartAt = position.get();

        // mixdown audio blocks
        for (QueuedBlock block : scheduledBlocks) {

            int blockFrames = block.data.length / CHANNELS;

            // block fully played - mark for deletion
            if (block.when + blockFrames <= bufferStartAt) {
                finished.add(block);
                continue;
            }

            // block starts after end of current buffer
            if (bufferStartAt + BUFFER_SIZE_FRAMES <= block.when)
                continue;

            // mix in part of the block which overlaps current buffer
            // note that block may have already started in the past
            // but extends into the current buffer, or that it starts
            // in the future but before the end of the current buffer
            int blockOffset = Math.max(0, (int) (bufferStartAt - block.when));
            int blockMaxFrames = blockFrames - blockOffset;
            int bufferOffset = Math.max(0, (int) (block.when - bufferStartAt));
            int bufferMaxFrames = BUFFER_SIZE_FRAMES - bufferOffset;
            for (int f = 0; f < blockMaxFrames && f < bufferMaxFrames; f++)
                for (int c = 0; c < CHANNELS; c++) {
                    int bufferIndex = (bufferOffset + f) * CHANNELS + c;
                    int blockIndex = (blockOffset + f) * CHANNELS + c;
                    mixBuffer[bufferIndex] += (short) (block.data[blockIndex] * MIXDOWN_VOLUME);
                }
        }

        scheduledBlocks.removeAll(finished);
        finished.clear();
        ByteBuffer.wrap(audioBuffer).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(mixBuffer);
        line.write(audioBuffer, 0, audioBuffer.length);
        position.addAndGet(BUFFER_SIZE_FRAMES);
    }
}

public static void main(String[] args) {

    System.out.print("Press return to exit...");
    AudioConsumer consumer = new AudioConsumer();
    consumer.start();
    for (int i = 0; i < NUM_PRODUCERS; i++)
        new AudioProducer(consumer).start();
    System.console().readLine();
    consumer.stop();
}
}
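As for getting your network packets into this: each incoming byte[] first needs to be converted to short[] (this also comes up in the comments). A sketch, assuming 16-bit signed little-endian packets; the helper class and method names are made up:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PacketConversion {

    // Convert a received 16-bit little-endian packet to samples.
    // Note: ByteBuffer.wrap(packet).asShortBuffer().array() throws
    // UnsupportedOperationException because the short view is not
    // backed by an accessible short[]; copy with get() instead.
    public static short[] toSamples(byte[] packet) {
        short[] samples = new short[packet.length / 2];
        ByteBuffer.wrap(packet).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samples);
        return samples;
    }
}
```

A receiving thread could then schedule the first packet of a voice at `consumer.position() + BUFFER_SIZE_FRAMES` and each following packet of the same voice at the previous packet's "when" plus the previous packet's length in frames.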
  • Awesome, Thank you very much :D, i will test this and tell you if it worked out. Nice commented and explained, thank you again :D. – CreativeMD Oct 13 '14 at 20:53
  • So finally, i had the time to include in my project (sorry that this took me so long, was busy). I had some troubles to convert my byte[] into a short[], could you help me out? – CreativeMD Oct 22 '14 at 13:02
  • Sure, what exactly isn't working? Using ByteBuffer you should be able to convert between byte[] and any primitive array. – Sjoerd van Kreel Oct 24 '14 at 10:21
  • Oh :D and i thought i have to add some complex calculation to convert it. I will test this hopefully it works fine and thanks a lot :D. "ByteBuffer.wrap(data).asShortBuffer().array()" – CreativeMD Oct 24 '14 at 17:17
  • *****, i get this error: java.lang.UnsupportedOperationException at java.nio.ShortBuffer.array(Unknown Source) using this code: "ByteBuffer.wrap(data).asShortBuffer().array()". Could you help me out? – CreativeMD Oct 27 '14 at 19:00
  • http://docs.oracle.com/javase/7/docs/api/java/nio/ShortBuffer.html#array%28%29 "UnsupportedOperationException - If this buffer is not backed by an accessible array". Are you trying to get a short[] out of a byte[] this way? If so, you should be using 2 arrays: one short[], one byte[], then use ByteBuffer.wrap(byte_array).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(short_array) to write shorts to bytes, and use ByteBuffer.wrap(byte_array).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(short_array) to write bytes to shorts. – Sjoerd van Kreel Oct 28 '14 at 18:35
  • Sorry that I didn't answer, I didn't have the time to do a better search. This code is not running. Sjoerd, do you have Skype or something like that? Maybe you could find the problem? Sorry again that this took me so long :( – CreativeMD Jan 11 '15 at 15:53
  • How is it not running? Are you running the sample i gave you or did you modify it in any way? Please post your code and the error you're experiencing. – Sjoerd van Kreel Feb 09 '15 at 21:46
  • Here is my complete source: https://www.dropbox.com/s/x9vbcj5zew2l01j/VoiceChat%20Source.zip?dl=0 The important file is com.creativemd.voicechat.client.AudioPacket and executeClient(EntityPlayer player) should add the audio data to the consumer. Do you have skype or something like this? – CreativeMD Feb 12 '15 at 21:47
  • Can you post something that compiles and runs out of the box? Provide a minimal example that demonstrates your problem or at the very least include all libraries necessary to run your program. – Sjoerd van Kreel Feb 14 '15 at 21:58
  • ehm, almost impossible, it's a mod for minecraft, could give you a tutorial how to install it, but ... yes. A Chat-Program would help a lot. – CreativeMD Feb 14 '15 at 22:29

You can use the Tritonus library to do software audio mixing (it's old but still works quite well).

Add the dependency to your project:

<dependency>
    <groupId>com.googlecode.soundlibs</groupId>
    <artifactId>tritonus-all</artifactId>
    <version>0.3.7.2</version>
</dependency>

Use org.tritonus.share.sampled.FloatSampleBuffer. Both buffers must be of the same AudioFormat before calling #mix.

// TODO instantiate these variables with real data
byte[] audio1, audio2;
AudioFormat af1, af2;
SourceDataLine sdl = AudioSystem.getSourceDataLine(af1);
sdl.open(af1); // the line must be opened and started before writing
sdl.start();

FloatSampleBuffer fsb1 = new FloatSampleBuffer(audio1, 0, audio1.length, af1);
FloatSampleBuffer fsb2 = new FloatSampleBuffer(audio2, 0, audio2.length, af2);

fsb1.mix(fsb2);
byte[] result = fsb1.convertToByteArray(af1);

sdl.write(result, 0, result.length); // play it