I would like some advice on real-time audio data processing. So far, I have built a simple server and client with Python sockets that send and receive audio data from a microphone until I stop it (4096 bytes per packet, but it could be much more).

I see two different kinds of analysis:

  • realtime: perform the analysis on each packet of X bytes and send the result back in the response
  • after receiving a lot of bytes (for example, every hour), append these bytes and store them in a DB. When the microphone is stopped, concatenate all the previous chunks and perform some actions on the result (such as creating a waveplot image for the recorded session).
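
To make the first (realtime) mode concrete, here is a minimal sketch of a per-packet analysis. It assumes the packets contain 16-bit signed little-endian mono PCM, which is not stated above, and `chunk_rms` is a hypothetical helper name; the RMS level just stands in for whatever analysis is actually needed.

```python
import array
import math
import sys


def chunk_rms(chunk: bytes) -> float:
    """Return the RMS level of one packet of 16-bit signed PCM samples.

    Assumption: samples are little-endian, 2 bytes each, mono.
    """
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    usable = len(chunk) - (len(chunk) % 2)
    samples = array.array("h")
    samples.frombytes(chunk[:usable])
    if sys.byteorder == "big":
        # The data is assumed little-endian; swap on big-endian hosts.
        samples.byteswap()
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

On the server side, this would be called on each packet returned by `recv`, and the value sent back to the client as the response.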

For this kind of usage, which self-hosted DB can I use?

How can I concatenate these large volumes of data at regular intervals and add them to the DB?

For only 6 minutes, I received about 32 MB of data. Maybe I should put each chunk into Redis as soon as I receive it, rather than keeping it in a Python object. Another option would be to serialize the audio data as base64. I'm just afraid of losing speed, since I'm currently using TCP to send the data.
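
As a quick check on the base64 idea, the sketch below measures how much a single 4096-byte packet grows when encoded. The zero-filled packet is only a stand-in for real microphone data; the size increase does not depend on the content.

```python
import base64

# Stand-in for one 4096-byte packet of raw audio.
packet = bytes(4096)

encoded = base64.b64encode(packet)

# base64 emits 4 output bytes for every 3 input bytes (plus padding),
# so the payload grows by roughly one third.
overhead = len(encoded) / len(packet)
print(len(packet), len(encoded), round(overhead, 2))
```

Since TCP sockets and Redis both handle raw bytes, base64 is mainly useful when the transport or storage only accepts text.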

Thanks for your help!

Jean Ooo

1 Answer

On your question about the size: is there any reason not to compress the audio data? It's very easy. 32 MB for 6 minutes of uncompressed mono audio is normal. You could store smaller chunks and/or append incoming chunks to a bigger file. Have a look at these, they might help you:

https://realpython.com/playing-and-recording-sound-python/

How to join two wav files using python?
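
As a rough sketch of appending incoming chunks to one bigger recording, the standard `wave` module can write to an in-memory buffer instead of a file on disk. The function name `chunks_to_wav` and the default format (44.1 kHz, mono, 16-bit) are assumptions for illustration, not something taken from the question.

```python
import io
import wave


def chunks_to_wav(chunks, sample_rate=44100, channels=1, sample_width=2):
    """Concatenate raw PCM chunks into a single WAV container, in memory.

    Assumption: every chunk is raw PCM in the given format.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav.setframerate(sample_rate)
        for chunk in chunks:
            # Each call appends frames; the header is fixed up on close.
            wav.writeframes(chunk)
    return buf.getvalue()
```

The returned bytes are a complete WAV file, so they could be stored as a single binary value in a database or passed on to a compressor.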

  • Thanks for the links. By compressing, do you mean an audio file format such as WAV, or gzip, tar and so on? About the second link: I don't want to deal with files on the filesystem. Since I will retrieve them through an API in the future, I would like to keep them in a database – Jean Ooo Aug 22 '20 at 19:18
  • I mean something like MP3 ;). WAV is uncompressed, and FLAC is only losslessly compressed, so both take up a lot of space. When I record a live gig, it's not unusual to have 10-30 gigabytes of uncompressed audio, and the size is often underestimated. If high fidelity is not relevant (for speech recognition, for example), you could try going from 48 or 44.1 kHz down to 16 kHz. Stay at 16-bit, though. – Arjaan Auinger Aug 22 '20 at 19:24
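
To put numbers on the sample-rate suggestion, uncompressed PCM size is simply duration × sample rate × bytes per sample × channels. The helper name `pcm_size_bytes` below is hypothetical; the rates are the ones mentioned in the comment above.

```python
def pcm_size_bytes(seconds, sample_rate, bits_per_sample=16, channels=1):
    """Size in bytes of uncompressed PCM audio of the given duration."""
    return seconds * sample_rate * (bits_per_sample // 8) * channels


six_minutes = 6 * 60

for rate in (48000, 44100, 16000):
    size_mb = pcm_size_bytes(six_minutes, rate) / 1e6
    print(f"{rate} Hz, 16-bit mono: {size_mb:.1f} MB")
```

At 44.1 kHz this comes to roughly 32 MB for 6 minutes, which matches the figure in the question, and dropping to 16 kHz cuts it to about a third of that.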