I am using the JavaScript sample below to capture real-time microphone audio in the browser and send it as an audio stream to a WebSocket server. On the server side, I use the simple Python code below to receive the stream and play it with PyAudio, but I get only noise on playback. I tested PyAudio directly with the microphone (the Python client below) and it works fine; the problem appears only with the JavaScript client. Is there a problem with how PyAudio reads this streaming format?
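
For reference, PyAudio's blocking `write` simply consumes raw interleaved PCM bytes matching the `format`/`channels`/`rate` passed to `open`, so a quick loopback check, independent of the WebSocket path, confirms the playback side in isolation (a minimal sketch; the 440 Hz test tone is just an illustrative signal):

# playback sanity check in the same format the server uses:
# 16-bit mono PCM at 16 kHz
import math
import struct
import pyaudio

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, output=True)

# one second of a 440 Hz sine wave as raw little-endian int16 bytes
samples = (int(10000 * math.sin(2 * math.pi * 440 * n / 16000)) for n in range(16000))
stream.write(b''.join(struct.pack('<h', s) for s in samples))

stream.stop_stream()
stream.close()
p.terminate()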

Python WebSocket server

import pyaudio
import numpy as np
import asyncio
import websockets

CHUNK_SIZE = 1024

audio = pyaudio.PyAudio()
# playback format must match the incoming stream: the JavaScript client sends
# 16-bit mono PCM downsampled to 16 kHz
stream = audio.open(format=pyaudio.paInt16,
                    channels=1,
                    rate=16000,
                    output=True,
                    frames_per_buffer=CHUNK_SIZE)


async def websocket_handler(websocket, path):
    # note: recent versions of the websockets library call the handler with
    # only (websocket); the (websocket, path) signature is for older releases
    try:
        async for message in websocket:
            # each message should be raw little-endian 16-bit mono PCM;
            # the numpy round-trip is a no-op, stream.write(message) is equivalent
            audio_data = np.frombuffer(message, dtype=np.int16)
            stream.write(audio_data.tobytes())
    finally:
        # this tears down the shared stream when the first client disconnects,
        # so later connections will fail; acceptable for a single-client test
        stream.stop_stream()
        stream.close()
        audio.terminate()

async def start_websocket_server():
    async with websockets.serve(websocket_handler, '', 8888):
        await asyncio.Future()  # Keep the server running


if __name__ == '__main__':
    asyncio.run(start_websocket_server())
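
One way to tell whether the incoming bytes are valid PCM at all (as opposed to a PyAudio problem) is to dump a few seconds of the stream to a WAV file and open it in a media player. A minimal diagnostic variant of the handler (filename and parameters are illustrative):

import wave

async def dump_handler(websocket, path):
    # record incoming frames to a file instead of playing them
    with wave.open('received.wav', 'wb') as wf:
        wf.setnchannels(1)       # mono
        wf.setsampwidth(2)       # 16-bit samples
        wf.setframerate(16000)   # must match what the client sends
        async for message in websocket:
            wf.writeframes(message)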

Python WebSocket client

import pyaudio
import asyncio
import websockets

def record_microphone(stream):
    # blocking reads; fine for this simple test, though they stall the event loop
    CHUNK = 1024
    while True:
        data = stream.read(CHUNK)
        yield data

async def send_audio():
    async with websockets.connect('ws://xxx.xxx.x.xxx:8888') as ws:
        p = pyaudio.PyAudio()
        # obtain the index of available mic
        mic_device_index = None
        for i in range(p.get_device_count()):
            device_info = p.get_device_info_by_index(i)
            if device_info['maxInputChannels'] > 0:
                mic_device_index = i
                break

        if mic_device_index is None:
            print("No microphone found")
            return

        # record at 16 kHz so the format matches what the server plays back
        stream = p.open(format=pyaudio.paInt16,
                        channels=1,
                        rate=16000,
                        input=True,
                        frames_per_buffer=1024,
                        input_device_index=mic_device_index)

        for data in record_microphone(stream):
            await ws.send(data)

asyncio.run(send_audio())

JavaScript sample

//================= CONFIG =================
// Global Variables
let websocket_uri = 'ws://127.0.0.1:8888'; // must match the server's host/port
let bufferSize = 4096,
    streamStreaming = false, // declared here so it is not an implicit global
    AudioContext,
    context,
    processor,
    input,
    globalStream,
    websocket;

// Initialize WebSocket
initWebSocket();

//================= RECORDING =================
function startRecording() {
    streamStreaming = true;
    AudioContext = window.AudioContext || window.webkitAudioContext;
    context = new AudioContext({
      // if Non-interactive, use 'playback' or 'balanced' // https://developer.mozilla.org/en-US/docs/Web/API/AudioContextLatencyCategory
      latencyHint: 'interactive',
    });
    // note: createScriptProcessor is deprecated in favor of AudioWorklet,
    // but it still works in current browsers and keeps this sample simple
    processor = context.createScriptProcessor(bufferSize, 1, 1);
    processor.connect(context.destination);
    context.resume();
  
    var handleSuccess = function (stream) {
      globalStream = stream;
      input = context.createMediaStreamSource(stream);
      input.connect(processor);
  
      processor.onaudioprocess = function (e) {
        var left = e.inputBuffer.getChannelData(0);
        // use the context's actual rate (often 48000, not 44100); a wrong
        // input rate here is a classic cause of noise on the receiving end
        var left16 = downsampleBuffer(left, context.sampleRate, 16000);
        websocket.send(left16);
      };
    };
  
    navigator.mediaDevices.getUserMedia({audio: true, video: false}).then(handleSuccess);
} // closes function startRecording()

function stopRecording() {
    streamStreaming = false;
  
    let track = globalStream.getTracks()[0];
    track.stop();
  
    input.disconnect(processor);
    processor.disconnect(context.destination);
    context.close().then(function () {
      input = null;
      processor = null;
      context = null;
      AudioContext = null;
    });
} // closes function stopRecording()

function initWebSocket() {
    // Create WebSocket
    websocket = new WebSocket(websocket_uri);
    //console.log("Websocket created...");
  
    // WebSocket Definitions: executed when triggered webSocketStatus
    websocket.onopen = function() {
      console.log("connected to server");
      //websocket.send("CONNECTED TO YOU");
      document.getElementById("webSocketStatus").innerHTML = 'Connected';
    }
    
    websocket.onclose = function(e) {
      console.log("connection closed (" + e.code + ")");
      document.getElementById("webSocketStatus").innerHTML = 'Not Connected';
    }
    
    websocket.onmessage = function(e) {
      //console.log("message received: " + e.data);
      console.log(e.data);
  
      let result;
      try {
        result = JSON.parse(e.data);
      } catch (err) {
        $('.message').html('Error retrieving data: ' + err);
      }
  
      if (typeof(result) !== 'undefined' && typeof(result.error) !== 'undefined') {
        $('.message').html('Error: ' + result.error);
      }
      else {
        $('.message').html('Welcome!');
      }
    }
} // closes function initWebSocket()

function downsampleBuffer (buffer, sampleRate, outSampleRate) {
    if (outSampleRate == sampleRate) {
      return buffer;
    }
    if (outSampleRate > sampleRate) {
      throw 'downsampling rate should be smaller than the original sample rate';
    }
    var sampleRateRatio = sampleRate / outSampleRate;
    var newLength = Math.round(buffer.length / sampleRateRatio);
    var result = new Int16Array(newLength);
    var offsetResult = 0;
    var offsetBuffer = 0;
    while (offsetResult < result.length) {
      var nextOffsetBuffer = Math.round((offsetResult + 1) * sampleRateRatio);
      var accum = 0,
        count = 0;
      for (var i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {
        accum += buffer[i];
        count++;
      }
  
      // average this window, clamp to [-1, 1], then scale to signed 16-bit
      result[offsetResult] = Math.max(-1, Math.min(1, accum / count)) * 0x7fff;
      offsetResult++;
      offsetBuffer = nextOffsetBuffer;
    }
    return result.buffer;
} // closes function downsampleBuffer()
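
Since downsampleBuffer returns the underlying buffer of an Int16Array (little-endian on effectively all current hardware), each WebSocket message should decode as plausible signed 16-bit samples on the server. A quick amplitude check (a hypothetical helper, not part of the code above) can distinguish real speech from a format mismatch, which tends to show up as near-full-scale or uniformly random values:

import numpy as np

def inspect_frame(message: bytes) -> None:
    # decode as little-endian int16, the layout an Int16Array buffer produces
    samples = np.frombuffer(message, dtype='<i2')
    print(f"{len(samples)} samples, min={samples.min()}, max={samples.max()}, "
          f"mean |x|={np.abs(samples).mean():.1f}")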
  • Sorry, originally I used a hyperlink to share the sample code in order to reduce the amount of text on the screen. I have now revised it and present the JavaScript code – Stan Ho Jul 11 '23 at 14:39
  • Thanks. On a JS sidenote, this code has some oddly ancient JS in it: why are there `var` all over the place instead of `let` and `const`, and why are you not even declaring certain vars, implicitly turning them into global variables (like `streamStreaming` or `context`)? Also, every browser supports `AudioContext`, no need to pull it from somewhere. And of course, JS conventions: Python uses snake_case, JS does not, except for UPPER_SNAKE_CASE for true constants. This feels like it was put together using pre-modern-JS tutorials. – Mike 'Pomax' Kamermans Jul 11 '23 at 14:46
  • I apologize, I have little knowledge of JavaScript, but I need to implement a project that involves recording real-time microphone audio from a web application and playing it from a Python server. I found a template for testing, but it only produces noise when playing the audio. However, interestingly, I can save the stream as a WAV file using the Python Wave package and play it successfully with a media player. But I'm unable to use PyAudio or Aplay to play the WAV file or play it in real time. – Stan Ho Jul 11 '23 at 14:58
  • This is the template I referred to https://stackoverflow.com/questions/67118642/audiocontext-getusermedia-and-websockets-audio-streaming – Stan Ho Jul 11 '23 at 15:00
  • That is surprisingly bad JS for a 2021 answer, but then they did copy it from someone else's code from 2018, which may itself have been several years old at that point. – Mike 'Pomax' Kamermans Jul 11 '23 at 15:08
  • I understand. Do you have any other recommended solutions? – Stan Ho Jul 11 '23 at 17:17
  • I'd probably go with "use WebRTC" since that was developed specifically for real time a/v streams. (A quick google shows a bunch of tutorials for writing python+client solutions using webrtc) – Mike 'Pomax' Kamermans Jul 11 '23 at 17:32
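
Following the WebRTC suggestion above, the receiving side in Python could look roughly like the sketch below using aiortc (one possible library choice; the comment doesn't name one). The signaling step that exchanges the offer/answer with the browser is omitted, and recording to a file stands in for live playback:

from aiortc import RTCPeerConnection
from aiortc.contrib.media import MediaRecorder

async def handle_offer(offer):
    # 'offer' arrives from the browser over your own signaling channel
    pc = RTCPeerConnection()
    recorder = MediaRecorder('received.wav')  # a file stands in for playback

    @pc.on("track")
    def on_track(track):
        if track.kind == "audio":
            recorder.addTrack(track)

    await pc.setRemoteDescription(offer)
    await pc.setLocalDescription(await pc.createAnswer())
    await recorder.start()
    return pc.localDescription  # send this answer back to the browser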

0 Answers