How to recognise a sound ‘peak’ in live system sound?

Question

I want to write a program that runs a specific command whenever the system plays any kind of sound, for example the short alert sound you hear when you receive a message on Facebook. I want to recognise this ‘peak’. How can I do this in Python?

This is also OS-dependent as I imagine you would have to dig into some lower-level soundcard stuff — MoxieBall, Jul 10 '18 at 20:31
@MoxieBall It can be easier if we examine a specified program? — Frank Conrad, Jul 10 '18 at 20:33
@FrankConrad More like, the way you have to do this will depend on how to get sound levels from your system, which will depend on your system. I don't know how to do that for any system -- someone who did would have to know what system you're using — MoxieBall, Jul 10 '18 at 20:38

score 0 · Answer 1 · answered Jul 11 '18 at 12:18


Getting your audio data

I think what you are looking for is some way to loop the system output back so that your OS presents it as an input you can read. There are different ways of doing this, depending on your OS.

However, since you mentioned in the comments that your OS is Windows 8.1, you can use a fork of PyAudio, PyAudio_portaudio. It is the normal PyAudio, extended to use WASAPI to loop your Windows system output back into something you can retrieve in Python.

Please see this other SO post on recording your system output with Python in case I missed anything. Thanks to @mate for posting the link to the PyAudio fork.

This is a quick explanation:

The official PyAudio build isn't able to record the output. BUT with Windows Vista and above, a new API, WASAPI was introduced, which includes the ability to open a stream to an output device in loopback mode. In this mode the stream will behave like an input stream, with the ability to record the outgoing audio stream.

To set the mode, one has to set a special flag (AUDCLNT_STREAMFLAGS_LOOPBACK, https://msdn.microsoft.com/de-de/library/windows/desktop/dd316551(v=vs.85).aspx ). Since this flag is not supported in the official build one needs to edit PortAudio as well as PyAudio, to add loopback support.

New option: "as_loopback":(true|false)
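As a rough sketch of how the new option might be used: this assumes the fork keeps PyAudio's usual `PyAudio.open()` signature and only adds the `as_loopback` keyword. The helper names `loopback_stream_kwargs` and `open_loopback_stream` are hypothetical, and the default output device is assumed to be a WASAPI device, which may not hold on every machine.

```python
def loopback_stream_kwargs(device_info, frames_per_buffer=1024):
    """Build keyword arguments for PyAudio.open() from a device-info dict.

    Hypothetical helper; device_info is the dict PyAudio returns from
    get_device_info_by_index() or get_default_output_device_info().
    """
    return {
        "channels": int(device_info["maxOutputChannels"]),
        "rate": int(device_info["defaultSampleRate"]),
        "input": True,  # the loopback stream behaves like an input stream
        "frames_per_buffer": frames_per_buffer,
        "input_device_index": int(device_info["index"]),
        "as_loopback": True,  # option added by the fork (assumption: keyword on open())
    }


def open_loopback_stream(device_index=None, frames_per_buffer=1024):
    """Open the chosen output device in loopback mode (requires the fork)."""
    import pyaudio  # imported here so the helper above works without the fork

    p = pyaudio.PyAudio()
    if device_index is None:
        # Assumption: the default output device is exposed through WASAPI.
        info = p.get_default_output_device_info()
    else:
        info = p.get_device_info_by_index(device_index)
    stream = p.open(format=pyaudio.paInt16,
                    **loopback_stream_kwargs(info, frames_per_buffer))
    return p, stream
```

With a stream opened this way, `stream.read(1024)` should return one block of raw bytes per call. When you are done, call `stream.stop_stream()`, `stream.close()` and `p.terminate()`.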

Analyzing your data

The loopback stream gives you the data block by block, in the block size you specified. From there, you can run whatever DSP or peak analysis you need to determine which sound was played or what properties it has.
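A minimal sketch of block-wise peak detection, using only the standard library. It assumes each block holds 16-bit signed samples in native byte order (what `paInt16` would give you). The function names and the threshold of 0.1 are illustrative choices, not part of any library.

```python
import array


def block_peak(block):
    """Return the peak amplitude of one block, scaled to the range 0.0-1.0.

    Assumes the block is raw bytes of 16-bit signed samples, so its
    length must be a multiple of 2.
    """
    samples = array.array("h")
    samples.frombytes(block)
    if not samples:
        return 0.0
    return max(abs(s) for s in samples) / 32768.0


def rising_edges(levels, threshold=0.1):
    """Return the indices where the level crosses the threshold from below.

    Reporting only the crossing means a sound that spans several
    consecutive loud blocks triggers once, not once per block.
    """
    edges = []
    was_loud = False
    for i, level in enumerate(levels):
        is_loud = level >= threshold
        if is_loud and not was_loud:
            edges.append(i)
        was_loud = is_loud
    return edges
```

In a live loop you would call `block_peak` on every block you read, keep the previous loud/quiet state, and run your command when the state changes from quiet to loud. The threshold needs tuning against your own system sounds.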

Here is a quick example to get you started on peak detection in Python. For more accurate results, you could store the .wav files you want to recognize and cross-correlate them with the recorded audio to see whether the same .wav file was played.

Cross correlation 1D Arrays (Mono Audio)
Cross correlation 2D Arrays (Stereo Audio)
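To illustrate the cross-correlation idea for mono audio, here is a small pure-Python sketch of normalized cross-correlation. The function name `best_match` is hypothetical, and both inputs are assumed to be lists of sample values at the same sample rate. This version is slow on long recordings; in practice a NumPy or SciPy correlation routine would be the better choice.

```python
import math


def best_match(signal, template):
    """Slide the template over the signal and return (offset, score).

    score is the normalized cross-correlation, between -1.0 and 1.0;
    values near 1.0 mean the template appears at that offset, even if
    it was played at a different volume. Returns (None, 0.0) when no
    window correlates positively.
    """
    n, m = len(signal), len(template)
    t_norm = math.sqrt(sum(t * t for t in template))
    if m == 0 or n < m or t_norm == 0.0:
        return None, 0.0
    best_offset, best_score = None, 0.0
    for offset in range(n - m + 1):
        window = signal[offset:offset + m]
        w_norm = math.sqrt(sum(w * w for w in window))
        if w_norm == 0.0:
            continue  # silent window, nothing to compare
        dot = sum(w * t for w, t in zip(window, template))
        score = dot / (w_norm * t_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score
```

You would then treat the sound as recognized when the score exceeds a cutoff you pick by experiment, for example something close to 1.0 for a clean loopback recording.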
