I need to capture the raw data (every few milliseconds) that the microphone provides. Python is preferred, but C/C++ would work too. I'm using Linux/macOS.

How do I capture the audio wave (the microphone input), and what kind of data will it be? Plain bytes? An array of some kind?

I want to do real-time magnitude analysis and, if the magnitude reaches a given value, a real-time FFT of the microphone signal, but I don't understand what data the microphone provides or how much of it.

I see a lot of code that captures audio at 44.1 kHz, but does it capture all of that data? Does the portion of data taken depend on how the program is written?

denisb411

1 Answer

"I'm needing to capture the raw data (every few milliseconds) that the microphone provides"

No, you don't. That wouldn't work. Even if you captured that data every millisecond, at exactly a multiple of 1000 microseconds (no jitter), the audio quality would be utterly horrible. A sample frequency of 1000 Hz (once per millisecond) limits the Nyquist frequency, which is half the sample frequency, to 500 Hz. That's horribly low.
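The arithmetic behind that claim can be sketched in a few lines. This is only an illustration: the helper names `nyquist_hz` and `sample_period_us` are made up for this example, and the rates listed are common values, not something the question specifies.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented without aliasing."""
    return sample_rate_hz / 2.0


def sample_period_us(sample_rate_hz: float) -> float:
    """Time between two consecutive samples, in microseconds."""
    return 1_000_000.0 / sample_rate_hz


# 1000 Hz is "one sample per millisecond"; the other two are common sound-card rates.
for rate in (1_000, 44_100, 48_000):
    print(
        f"{rate:>6} Hz: Nyquist {nyquist_hz(rate):>8.1f} Hz, "
        f"one sample every {sample_period_us(rate):.1f} us"
    )
```

Sampling once per millisecond leaves only 0 to 500 Hz usable, while a typical sound-card rate covers the whole audible band.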

"I want to make real time magnitude analysis". Well, you're ignoring the magnitude of components above 500 Hz, which is about 98% of the audible frequencies (roughly 20 Hz to 20 kHz).

"real time fft" - same problem, that too would miss 98%.

You can't handle raw audio like that. You must rely on the sound card to do the heavy lifting and get the timing right. It can sample sound about every 21 microseconds (roughly 48 kHz), with microsecond accuracy. You can talk to the sound card using ALSA or PulseAudio, or a few other options (that's sound on Linux for you). But recommendations there would be off-topic.
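To make the "what kind of data" part concrete, here is a rough sketch of what a program typically does with one block of samples once the sound card has delivered it. It uses no audio hardware: `fake_block` is a hypothetical stand-in for a sound-card read, and the 44.1 kHz rate, 1024-sample block, 16-bit little-endian format, and `THRESHOLD` value are all assumptions chosen for illustration. The pure-Python FFT is there only to keep the example dependency-free; a real program would more likely use a numerical library.

```python
import cmath
import math
import struct

SAMPLE_RATE = 44_100  # assumed sample rate in Hz
BLOCK_SIZE = 1024     # assumed samples per block (power of two for the FFT)
THRESHOLD = 0.1       # hypothetical RMS trigger level (full scale = 1.0)


def fake_block(freq_hz, amplitude=0.5):
    """Stand-in for a sound-card read: one block of int16 samples as raw bytes."""
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(BLOCK_SIZE)
    ]
    return struct.pack(f"<{BLOCK_SIZE}h", *samples)


def decode(raw):
    """Convert little-endian int16 bytes to floats in [-1.0, 1.0)."""
    count = len(raw) // 2
    return [s / 32768.0 for s in struct.unpack(f"<{count}h", raw)]


def rms(samples):
    """Root-mean-square level of a block, a common magnitude measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def peak_frequency(samples):
    """Frequency (Hz) of the strongest bin below the Nyquist frequency."""
    spectrum = fft(samples)
    half = len(samples) // 2
    k = max(range(1, half), key=lambda i: abs(spectrum[i]))
    return k * SAMPLE_RATE / len(samples)


raw = fake_block(1000.0)  # pretend the microphone heard a 1 kHz tone
samples = decode(raw)
level = rms(samples)
print(f"{len(raw)} bytes = {len(samples)} samples = "
      f"{1000 * BLOCK_SIZE / SAMPLE_RATE:.1f} ms of audio")
print(f"RMS level: {level:.3f}")
if level > THRESHOLD:  # only run the FFT when the block is loud enough
    print(f"strongest component near {peak_frequency(samples):.0f} Hz")
```

Under these assumptions the program still receives data "every few milliseconds", but each delivery is a whole block of precisely timed samples rather than a single reading. The FFT can only resolve frequencies in steps of `SAMPLE_RATE / BLOCK_SIZE`, about 43 Hz here, so the reported peak is the nearest bin rather than exactly 1000 Hz.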

MSalters
  • Thanks a lot for your answer. I'm just having a hard time understanding how the microphone works and how I can get 'high quality' data. Do you have an article about this so I can learn how it works? – denisb411 Apr 26 '17 at 13:15
  • I'm reading about buffer size, and in most cases it is 1024. What does that mean? – denisb411 Apr 26 '17 at 13:27
  • It means there's a buffer, 1024 elements long; each element is probably a sample. – Colin Apr 26 '17 at 13:45
  • @Colin__s Each element is a wave? 1024 elements per second? I don't understand. – denisb411 Apr 26 '17 at 16:48
  • Regardless of the wording he used, it seems pretty obvious to me that he wants to read the samples delivered by the sound card 'every few milliseconds' rather than sampling the microphone signal level himself at that rate! – Atrag Apr 26 '17 at 20:45
  • That being the case, he can safely ignore this response. On a side note: it is not true that he would 'ignore' frequencies above the Nyquist frequency, as they would still be present in the form of aliasing products. – Atrag Apr 26 '17 at 20:46
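
The aliasing point in the last comment can be checked numerically. The sketch below assumes an ideal sampler with no anti-aliasing filter in front of it; `sample_tone` and `alias_frequency` are hypothetical helper names, and 700 Hz is just an example tone above the 500 Hz Nyquist frequency of a 1000 Hz sampler.

```python
import math


def sample_tone(freq_hz, sample_rate_hz, count):
    """Sample a unit-amplitude cosine at the given rate (no anti-aliasing filter)."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency after sampling, folded into 0..sample_rate/2."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


RATE = 1000  # one sample per millisecond, as discussed in the answer
high = sample_tone(700, RATE, 16)  # tone above the 500 Hz Nyquist frequency
low = sample_tone(300, RATE, 16)   # its alias below the Nyquist frequency

# The two sample sequences are indistinguishable once sampled.
identical = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low))
print(alias_frequency(700, RATE), identical)
```

So a component above the Nyquist frequency is not dropped; it reappears as a lower frequency and contaminates both the magnitude and the FFT result.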