
I'm trying to take audio from an external microphone (mounted on a robot in this case) and stream it into Unity to be played in a scene. I'm fairly certain this audio is encoded in the mp3 format with a sample rate of 16000 Hz and a bitrate of 192 kbps.

I'm able to get this audio as a byte array (that seems to always be little-endian) in Unity, and I'd like to convert it to a float array with each value ranging from -1.0f to +1.0f so that I can use AudioClip.SetData to play it in the Unity scene. My problem is that so far I've been unable to do this.

My first attempt was based on this StackOverflow answer: create AudioClip from byte[] which uses the following function for conversion:

private float[] ConvertByteToFloat(byte[] array) {
    float[] floatArr = new float[array.Length / 4];
    for (int i = 0; i < floatArr.Length; i++) {
        if (BitConverter.IsLittleEndian) {
            Array.Reverse(array, i * 4, 4);
        }
        floatArr[i] = BitConverter.ToSingle(array, i * 4) / 0x80000000;
    }
    return floatArr;
}

I then invoked this like so:

scaledAudio = ConvertByteToFloat(audioData);
AudioClip audioClip = AudioClip.Create("RobotAudio", scaledAudio.Length, 1, 16000, false);
audioClip.SetData(scaledAudio, 0);
AudioSource.PlayClipAtPoint(audioClip, robot.transform.position);

But the result was a lot of static, and on logging some outputs, I realized that I was getting a bunch of NaN's...

I read somewhere that mp3 audio could be extracted using the BitConverter.ToInt16() function, so I changed the ConvertByteToFloat function accordingly:

private float[] ConvertByteToFloat16(byte[] array) {
    float[] floatArr = new float[array.Length / 2];
    for (int i = 0; i < floatArr.Length; i++) {
        if (BitConverter.IsLittleEndian) {
            Array.Reverse(array, i * 2, 2);
        }
        floatArr[i] = (float) (BitConverter.ToInt16(array, i * 2) / 32767f);
    }
    return floatArr;
}

[Note: the result is divided by 32767f because I read this is the maximum value that can occur and I want to scale it down to between -1.0f and 1.0f]

The numbers from this look much more promising. They are indeed all between -1.0f and 1.0f. But when I attempt to play the audio with Unity, all I hear is static.

The issue almost definitely seems to be in the conversion of the byte[] to the float[], but I could've made a mistake in setting the data or the player for the AudioClip or the AudioSource.

Any help/suggestions are MUCH appreciated!

[Additional resources: The byte[] that I got into Unity comes from here: https://github.com/ros-drivers/audio_common/blob/master/audio_capture/src/audio_capture.cpp. There is a related script that takes the data encoded by this capture program and plays it (https://github.com/ros-drivers/audio_common/blob/master/audio_play/src/audio_play.cpp). This works just fine, so if I could replicate the decoding functionality of the audio_play script in that second link, it seems like I'll be good to go!]

njk23

3 Answers


In the file you linked, the setup code defaults to encoding the data in mp3 format (line numbers shown on the left):

21 >> // Need to encoding or publish raw wave data
22 >> ros::param::param<std::string>("~format", _format, "mp3");

This means you have two options.

Export Wave format (Raw PCM) from your library

Change the output format from the C++ library to export a raw wave file format.

21 >> // Need to encoding or publish raw wave data
22 >> ros::param::param<std::string>("~format", _format, "wave");

Reading through the code, if you change the third argument on line 22 (the default value) to "wave", it will export the data in wave format, which will therefore not require decoding in Unity. This requires re-compiling your C++ code, if that is an option. Please note that the audio data in wave format will take somewhat more memory than mp3.

See lines 98 to 109 of the audio_capture.cpp file for where it checks for wave or mp3 formatting.

Decode MP3 Audio in Unity

Otherwise, you could try decoding the mp3 data in Unity. This will most likely involve using an mp3 library (the first one I found was MP3Sharp). Alternatively, there's a Unity asset called uAudio that claims to do real-time mp3 compression/decompression; this might be simpler than using a generic mp3 decoder, as it has already been designed for Unity.

I would not recommend writing your own mp3 decoder unless just for the sake of a challenge, or for learning purposes.


All ideas aside, my first attempt would be to re-compile your C++ library with the argument as "wave" as stated above!

I hope this helps :)

WoodyDev
  • Hi, this is a really cool suggestion! I think it's possible to use the WAV format. Do I have to alter my unity code in any way to deal with headers or other information encoded with WAV? – njk23 Jun 27 '18 at 15:12
  • Hi @NishanthJKumar, [wave headers](http://soundfile.sapp.org/doc/WaveFormat/) tend to be much simpler to handle than mp3 ones. As long as you extract the data chunk correctly, this should be fine. Since I cannot see the code for how you parse the headers I cannot say for sure, but if you are struggling to parse the wave files, there are plenty of people who have built wave [parsers already](https://assetstore.unity.com/packages/tools/audio/open-wav-parser-90832). And I'm sure you will find plenty more parsers if you search :) – WoodyDev Jun 27 '18 at 15:19
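For the wave route discussed in the comments above, here is a minimal sketch of locating the data chunk in a byte array. It assumes the bytes arrive as a complete RIFF/WAVE file with a standard chunk layout; whether the capture node actually sends such a header has not been verified here. `WavChunks` and `FindDataChunk` are hypothetical names, not part of Unity or any library.

```csharp
using System;
using System.Text;

public static class WavChunks
{
    // Returns the offset of the "data" chunk payload in a RIFF/WAVE buffer
    // and writes the payload length to 'length'. Throws if the layout is
    // not recognized. (Hypothetical helper, for illustration only.)
    public static int FindDataChunk(byte[] wav, out int length)
    {
        if (wav == null || wav.Length < 12 ||
            Encoding.ASCII.GetString(wav, 0, 4) != "RIFF" ||
            Encoding.ASCII.GetString(wav, 8, 4) != "WAVE")
        {
            throw new FormatException("Not a RIFF/WAVE buffer.");
        }

        int pos = 12; // sub-chunks start right after the 12-byte RIFF header
        while (pos + 8 <= wav.Length)
        {
            string id = Encoding.ASCII.GetString(wav, pos, 4);
            // Chunk sizes are stored as little-endian 32-bit integers.
            int size = wav[pos + 4] | (wav[pos + 5] << 8) |
                       (wav[pos + 6] << 16) | (wav[pos + 7] << 24);
            int available = wav.Length - (pos + 8);

            if (id == "data")
            {
                // Streaming writers may leave a placeholder size, so never
                // report more bytes than the buffer really holds.
                length = (size < 0 || size > available) ? available : size;
                return pos + 8;
            }
            if (size < 0 || size > available)
            {
                throw new FormatException("Corrupt chunk size.");
            }
            // Chunks are word-aligned: an odd size is followed by a pad byte.
            pos += 8 + size + (size & 1);
        }
        throw new FormatException("No data chunk found.");
    }
}
```

The bytes from the returned offset onward are the raw samples; the "fmt " chunk that precedes them carries the sample rate, channel count and bit depth needed to interpret them.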

First of all, just converting byte[] to float[] like that is only going to work if your data is 16-bit PCM.
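As a rough sketch of what that conversion could look like when the data really is raw PCM: the helper below assumes signed 16-bit little-endian samples, as described in the question. `PcmConverter` and `Pcm16LeToFloat` are hypothetical names. Unlike the question's code, it does not reverse the byte pairs, because little-endian data is already in low-byte-first order.

```csharp
using System;

public static class PcmConverter
{
    // Converts signed 16-bit little-endian PCM bytes to floats in [-1, 1).
    // Assumes the input is already raw PCM; compressed data such as mp3
    // has to be decoded before it reaches this function.
    public static float[] Pcm16LeToFloat(byte[] pcm)
    {
        float[] samples = new float[pcm.Length / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            // Assemble each sample explicitly (low byte first) so the result
            // does not depend on the host's endianness and the input array
            // is left unmodified.
            short value = unchecked((short)(pcm[2 * i] | (pcm[2 * i + 1] << 8)));
            // Dividing by 32768 maps the full short range onto [-1, 1).
            samples[i] = value / 32768f;
        }
        return samples;
    }
}
```

The resulting array can be passed to AudioClip.SetData in the same way as in the question, with the channel count and sample rate given to AudioClip.Create matching the source.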

If your audio really is compressed in the MPEG-1/MPEG-2 Audio Layer 3 format, getting to the stream is not simply a matter of converting the data format; it needs to be decoded (decompressed) first. I would try to get the sender to produce a standard, non-encoded PCM format, and your code should start to work.
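To check which case applies, one option is to inspect the first bytes of a received buffer. The sketch below is only a heuristic: it looks for an ID3v2 tag or an MPEG Layer III frame header at the very start of the buffer, so it can miss a chunk that begins mid-frame, and raw PCM can occasionally produce the same byte pattern by chance. `AudioSniffer` and `LooksLikeMp3` are hypothetical names.

```csharp
public static class AudioSniffer
{
    // Heuristic check: returns true if the buffer starts with an ID3v2 tag
    // or with an MPEG audio frame header whose layer bits indicate Layer III.
    public static bool LooksLikeMp3(byte[] data)
    {
        if (data == null || data.Length < 3)
        {
            return false;
        }

        // ID3v2 tags begin with the ASCII characters "ID3".
        bool hasId3Tag = data[0] == 'I' && data[1] == 'D' && data[2] == '3';

        // A frame header starts with 11 set bits (0xFF plus the top three
        // bits of the next byte); layer bits 01 (mask 0x06) mean Layer III.
        bool hasLayer3Header = data[0] == 0xFF && (data[1] & 0xE6) == 0xE2;

        return hasId3Tag || hasLayer3Header;
    }
}
```

If this returns true for most incoming buffers, the data very likely needs an mp3 decoder rather than a byte-to-float conversion.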

zambari
  1. Download this .dll: https://www.dllme.com/dll/files/naudio_dll.html and import it into your Plugins folder

  2. Download this C# script: https://www.dropbox.com/s/wks0ujanr0pm6nj/NAudioPlayer.cs?dl=0

  3. Download and import this Unity Asset Store package: https://assetstore.unity.com/packages/tools/gui/runtime-file-browser-113006

  4. Create a C# script, add it to a GameObject, and write these lines:

    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using System.IO;
    using UnityEngine.UI;
    using System.Runtime;
    using System.Runtime.InteropServices;
    using System.Runtime.Serialization.Formatters.Binary;
    using System.Runtime.Serialization;
    using NAudio;
    using NAudio.Wave;
    using UnityEngine.Networking;
    using SimpleFileBrowser;

    public class ReadMp3 : MonoBehaviour
    {
        private AudioSource audioSource;
        public Text pathText;

        private void Start()
        {
            audioSource = GetComponent<AudioSource>();
        }

        // Call this (for example from a UI button) to open the file browser.
        public void ReadMp3Sounds()
        {
            FileBrowser.SetFilters(false, new FileBrowser.Filter("Sounds", ".mp3"));
            FileBrowser.SetDefaultFilter(".mp3");
            StartCoroutine(ShowLoadDialogCoroutine());
        }

        IEnumerator ShowLoadDialogCoroutine()
        {
            // Wait until the user picks a file or cancels the dialog.
            yield return FileBrowser.WaitForLoadDialog(false, null, "Select Sound", "Select");

            pathText.text = FileBrowser.Result;

            if (FileBrowser.Success)
            {
                byte[] soundFile = FileBrowserHelpers.ReadBytesFromFile(FileBrowser.Result);
                yield return null; // wait one frame before decoding

                // NAudioPlayer (the script from step 2) turns the mp3 bytes into an AudioClip.
                audioSource.clip = NAudioPlayer.FromMp3Data(soundFile);
                audioSource.Play();
            }
        }
    }
Kalana