I'm using PyAudio to record a sound and extract data from it. Right now I record a sound and plot it with matplotlib.

import numpy
import pyaudio
import matplotlib.pyplot as plt

FORMAT = pyaudio.paFloat32
SAMPLEFREQ = 44100
FRAMESIZE = 1024
NOFFRAMES = 220
p = pyaudio.PyAudio()
print('running')

stream = p.open(format=FORMAT, channels=1, rate=SAMPLEFREQ, input=True, frames_per_buffer=FRAMESIZE)
data = stream.read(NOFFRAMES * FRAMESIZE)
# frombuffer replaces the deprecated fromstring for raw binary data
decoded = numpy.frombuffer(data, dtype=numpy.float32)
for x in decoded:
    if x != 0.0:
        print(x)  # decoded is huge, so print only the first non-zero sample
        break

stream.stop_stream()
stream.close()
p.terminate()
print('done')
plt.plot(decoded)
plt.show()

An example output of this code is:

[image: matplotlib plot of the decoded samples]

My main goal is to make sense of the float numbers in decoded and turn them into a string. For example, if I record myself saying "aaa", I want to process the recorded data and end up with the string aaa. decoded is a huge array of float numbers, and I couldn't find a way to work with it. I'm open to suggestions about libraries and about the correct algorithm for this goal.

I suspect I'm using the wrong library, but I couldn't find the right library or approach for my goal.

GLHF

1 Answer

It sounds like you're asking for advice on using Python to do speech-to-text (audio to string) conversion. There are some great APIs and Python libraries for this:

Getting started with speech recognition and python

Pygsr

SpeechRecognition 3.4.6
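As a rough sketch of how the recorded floats could be handed to one of these libraries: the block below converts float samples (like those in decoded) to 16-bit PCM, writes them to a WAV file with the standard library, and then passes that file to the SpeechRecognition package. The helper names (float_to_pcm16, write_wav, transcribe) and the file name are made up for illustration, the samples are assumed to lie in [-1.0, 1.0], and transcribe assumes SpeechRecognition is installed and that its Google web recognizer is reachable over the network.

```python
import struct
import wave

SAMPLE_RATE = 44100  # assumed to match SAMPLEFREQ in the question


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes.

    Out-of-range values are clipped rather than wrapped.
    """
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack('<%dh' % len(ints), *ints)


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit PCM audio to a WAV file."""
    with wave.open(path, 'wb') as wf:
        wf.setnchannels(1)   # mono, as in the question's stream
        wf.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wf.setframerate(sample_rate)
        wf.writeframes(float_to_pcm16(samples))


def transcribe(path):
    """Return the recognized text for a WAV file.

    Requires the third-party SpeechRecognition package and network access.
    """
    import speech_recognition as sr  # imported lazily: optional dependency

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file
    return recognizer.recognize_google(audio)
```

With the question's code, this would be used roughly as write_wav('recording.wav', decoded) followed by print(transcribe('recording.wav')). Note that a recognizer like this is trained on speech, so it may not return anything useful for a sustained sound such as "aaa".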

Jack