I am recording audio and collecting the raw bytes in a variable called frames. I would like to convert those bytes into .wav data that is held in a variable, so I can use it without writing a file to disk. The code below only stores the recorded data in frames.

from playsound import playsound
from random import randrange
import pyttsx3
from datetime import datetime
import pyaudio
import speech_recognition as sr
import requests
import wave
import numpy as np
import sounddevice as sd
import math
import time
import os
import struct
def voiceDetection():
   SoundThreshHold = 50
   TimeoutLength = 5 
   chunk = 1024 
   FORMAT = pyaudio.paInt16 
   CHANNELS = 2 
   RATE = 16000 
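   def rms(data):
      # Each 16-bit sample is 2 bytes, so the sample count is len(data) // 2.
      count = len(data) // 2
      fmt = "%dh" % count
      shorts = struct.unpack(fmt, data)
      sum_squares = 0.0
      for sample in shorts:
          # Normalize each sample to the range [-1.0, 1.0).
          n = sample * (1.0/32768)
          sum_squares += n*n
      return math.sqrt(sum_squares / count)*1000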
   p = pyaudio.PyAudio()
   stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=chunk)
   currentTime = time.time()
   end = time.time() + TimeoutLength
   frames = []
   while currentTime < end:
      currentTime = time.time()
      data = stream.read(chunk)
      if rms(data) >= SoundThreshHold:
         end = time.time() + TimeoutLength
         frames.append(data)      
   stream.stop_stream()
   stream.close()
   p.terminate()
   return frames
print(voiceDetection())    

Would appreciate any help. Have a happy new year!

1 Answer

Python has a general mechanism for this: io.BytesIO.

BytesIO allows you to create an in-memory file stream that you can read and write to as if it were a file on the file system.
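As a minimal sketch of that idea, the standard-library wave module can write into a BytesIO object instead of a file on disk. The helper name frames_to_wav_bytes is hypothetical, and its defaults (2 channels, 2 bytes per sample for paInt16, 16000 Hz) are assumptions copied from the constants in the question:

```python
import io
import wave


def frames_to_wav_bytes(frames, channels=2, sample_width=2, rate=16000):
    """Pack raw PCM chunks into WAV-formatted bytes held in memory.

    Hypothetical helper; the defaults mirror CHANNELS, FORMAT (paInt16,
    2 bytes per sample) and RATE from the question.
    """
    buffer = io.BytesIO()
    # wave.open accepts a file-like object, so nothing touches the disk.
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(rate)
        wav_file.writeframes(b"".join(frames))
    # The wave module does not close a buffer it did not open itself,
    # so the bytes can still be read back here.
    return buffer.getvalue()
```

Calling frames_to_wav_bytes(voiceDetection()) should then give a bytes object with a WAV header in front of the samples. Wrapping it again with io.BytesIO(...) gives a file-like object for any code that expects an open .wav file.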

If you just want to get your data as an array, this question has a solution.

In general, when you are working with sound or other numerical data in Python, you'll want to find out how to get your data into a NumPy array in order to process it. Most libraries and toolkits work with NumPy arrays.
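For the recording in the question, one possible way to do that is np.frombuffer. This is only a sketch: the helper name frames_to_array is hypothetical, and it assumes 16-bit samples in native byte order with the channels interleaved, which is how a paInt16 stream with CHANNELS = 2 is expected to deliver them:

```python
import numpy as np


def frames_to_array(frames, channels=2):
    """Convert raw 16-bit PCM chunks into an (n_frames, channels) array.

    Hypothetical helper; assumes interleaved int16 samples as in the question.
    """
    # Join the chunks and reinterpret the bytes as 16-bit integers.
    samples = np.frombuffer(b"".join(frames), dtype=np.int16)
    # Interleaved samples (L, R, L, R, ...) become one column per channel.
    return samples.reshape(-1, channels)
```

Each row of the result is one sample frame and each column is one channel, so frames_to_array(frames)[:, 0] would be the left channel.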

pknodle
  • Alright, that sounds great, but could you show me how you would do it and explain your solution, please? I'm fairly new to the audio side of Python and I don't normally use any of these things. Thanks –  Jan 01 '21 at 22:13
  • There are a couple ways to think about it. The data you really care about is an array of samples from an audio source. These are the blue dots in the first figure in this link: (https://en.wikipedia.org/wiki/Digital_audio) – pknodle Jan 01 '21 at 22:19
  • I am assuming that is the value I have stored in frames right? –  Jan 01 '21 at 22:21
  • A wav file is just a way to store data on the disk. In the code you provided, "data = stream.read(chunk)" is already storing the audio data. From there you can store it in another variable. – pknodle Jan 01 '21 at 22:21
  • Alright, that makes sense, but how would I listen to the audio that I stored? –  Jan 01 '21 at 22:23
  • Basically, I just want to pass this recording to a speech recognizer. Could you please tell me how I could do this? Thank you –  Jan 01 '21 at 22:24