11

Whenever audio is playing in Windows 10, whether it is from Spotify, Firefox, or a game, changing the volume makes Windows show an overlay in the corner with the song artist, the title, and the app that is playing, as in the photo below (sometimes it only shows the app name, for example when a game is playing the sound).

[Screenshot: the Windows 10 volume overlay showing the song artist, title, and source app]

I want to somehow get that data with Python. My end goal is to mute an application if it is playing something I don't like, such as an advertisement.

Alexander
  • Did you manage to find out, @Alexander? – RaduS Jan 05 '21 at 12:22
  • @Radus I got the titles of the windows instead. The titles usually displayed something like "Spotify.exe" when not playing media, and showed the song name when playing media. I will post the example as an answer to this question. – Alexander Jan 06 '21 at 17:24

3 Answers

26

Turns out this is possible without a workaround and by accessing this info directly using the Windows Runtime API (winrt).

All code shown uses Python 3 and the winrt library, installed via pip.

Collecting Media/'Now Playing' information

The following code collects a dictionary of the media information available to Windows, using the winrt wrapper for the Windows Runtime API. It does not rely on a window's title or application name changing, as the other answers here do.

import asyncio

from winrt.windows.media.control import \
    GlobalSystemMediaTransportControlsSessionManager as MediaManager


# Hypothetical placeholder: set this to the source_app_user_model_id of the
# player you want to track (see the comments in get_media_info()).
TARGET_ID = 'chrome.exe'


async def get_media_info():
    sessions = await MediaManager.request_async()

    # This source_app_user_model_id check and if statement is optional
    # Use it if you want to only get a certain player/program's media
    # (e.g. only chrome.exe's media not any other program's).

    # To get the ID, use a breakpoint() to run sessions.get_current_session()
    # while the media you want to get is playing.
    # Then set TARGET_ID to the string this call returns.

    current_session = sessions.get_current_session()
    if current_session:  # there needs to be a media session running
        if current_session.source_app_user_model_id == TARGET_ID:
            info = await current_session.try_get_media_properties_async()

            # song_attr[0] != '_' ignores system attributes
            info_dict = {
                song_attr: getattr(info, song_attr)
                for song_attr in dir(info)
                if song_attr[0] != '_'
            }

            # converts winrt vector to list
            info_dict['genres'] = list(info_dict['genres'])

            return info_dict

    # It could be possible to select a program from a list of current
    # available ones. I just haven't implemented this here for my use case.
    # See references for more information.
    raise Exception('TARGET_ID is not the current media session')


if __name__ == '__main__':
    current_media_info = asyncio.run(get_media_info())

current_media_info will be a dictionary in the following format and information can then be accessed as required within the program:

{
    'album_artist': str,
    'album_title': str,
    'album_track_count': int, 
    'artist': str,
    'genres': list,
    'playback_type': int,
    'subtitle': str, 
    'thumbnail': 
        <_winrt_Windows_Storage_Streams.IRandomAccessStreamReference object at ?>, 
    'title': str,
    'track_number': int,
}
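
With this dictionary in hand, the OP's goal of spotting unwanted media comes down to a check on its fields. A minimal sketch, assuming adverts can be recognised by keywords in 'title' or 'artist'; the helper name looks_like_advert and the keyword list are hypothetical and not part of winrt:

```python
# Hypothetical keyword list; adjust it to what your player actually reports.
AD_KEYWORDS = ('advertisement', 'sponsored')


def looks_like_advert(info_dict, keywords=AD_KEYWORDS):
    """Return True if the title or artist contains any keyword (case-insensitive)."""
    # Missing or None fields are treated as empty strings.
    text = ' '.join(
        str(info_dict.get(field) or '') for field in ('title', 'artist')
    ).lower()
    return any(keyword.lower() in text for keyword in keywords)
```

It could be called as looks_like_advert(current_media_info) once the earlier code has run.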

Controlling Media

Since the OP's end goal is to control media, this should be possible with the same libraries. See here for more information (I didn't need this in my case):
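
Separately from that reference, here is a minimal, untested sketch of what such control could look like. It assumes the session object returned by get_current_session() exposes try_pause_async(), the Python projection of the WinRT TryPauseAsync method; pause_if_unwanted and is_unwanted are hypothetical names. Note that this pauses playback rather than muting the application's audio, which would need a different API.

```python
import asyncio


async def pause_if_unwanted(session, is_unwanted):
    """Pause `session` when is_unwanted(title, artist) returns True.

    Returns True if a pause was requested and accepted, otherwise False.
    """
    if session is None:  # nothing is playing
        return False

    info = await session.try_get_media_properties_async()
    if not is_unwanted(info.title, info.artist):
        return False

    # Assumption: try_pause_async() resolves to a bool indicating success.
    return bool(await session.try_pause_async())
```

Inside get_media_info() this could be awaited as pause_if_unwanted(sessions.get_current_session(), lambda title, artist: 'advertisement' in title.lower()).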

(Getting Media thumbnail)

It is in fact also possible to 'scrape' the album art/media thumbnail (displayed on the right in the OP's screenshot) of the media currently playing. The OP didn't ask for this, but someone might want to do it:

from winrt.windows.storage.streams import \
    DataReader, Buffer, InputStreamOptions


async def read_stream_into_buffer(stream_ref, buffer):
    readable_stream = await stream_ref.open_read_async()
    await readable_stream.read_async(buffer, buffer.capacity, InputStreamOptions.READ_AHEAD)


# create the current_media_info dict with the earlier code first
thumb_stream_ref = current_media_info['thumbnail']

# 5MB (5 million byte) buffer - thumbnail unlikely to be larger
thumb_read_buffer = Buffer(5000000)

# copies data from data stream reference into buffer created above
asyncio.run(read_stream_into_buffer(thumb_stream_ref, thumb_read_buffer))

# reads data (as bytes) from buffer
buffer_reader = DataReader.from_buffer(thumb_read_buffer)
byte_buffer = buffer_reader.read_bytes(thumb_read_buffer.length)

with open('media_thumb.jpg', 'wb+') as fobj:
    fobj.write(bytearray(byte_buffer))

This saves media_thumb.jpg to the current working directory (cwd), where other code can then use it as needed.
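
The code above always writes a .jpg extension, but nothing here states which image format the thumbnail stream actually uses. A small sketch, assuming the bytes may be either JPEG or PNG, that picks the extension from the leading magic bytes; guess_image_extension is a hypothetical helper:

```python
def guess_image_extension(data):
    """Guess a file extension from the leading magic bytes of image data."""
    if data[:3] == b'\xff\xd8\xff':  # JPEG start-of-image marker
        return '.jpg'
    if data[:8] == b'\x89PNG\r\n\x1a\n':  # PNG signature
        return '.png'
    return '.bin'  # unknown format; the raw bytes can still be saved
```

The file name could then be built as 'media_thumb' + guess_image_extension(bytes(byte_buffer)).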

Docs & References:

Potentially choose from multiple available media streams?

Please note that I haven't tested or tried this; it is merely a pointer for anyone who may want to experiment:

As opposed to current use of
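
A minimal, untested sketch of picking one session out of several. It assumes the session manager's get_sessions() (the Python projection of WinRT GetSessions) returns an iterable of sessions that each expose source_app_user_model_id; find_session is a hypothetical helper name:

```python
def find_session(available_sessions, target_id):
    """Return the first session whose app ID equals target_id, else None."""
    for session in available_sessions:
        if session.source_app_user_model_id == target_id:
            return session
    return None
```

Inside get_media_info() it could be used as find_session(sessions.get_sessions(), TARGET_ID).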

tameTNT
  • You'll have to replace the `winrt` module with `winsdk` since Python 3.10.0, according to the answer here: https://stackoverflow.com/questions/69610231/why-cant-pip-find-winrt. Furthermore, this should be the chosen answer. The respondent answers the original question using the Windows Runtime API, rather than using the bare Win32 API to capture the window title of the running player. – digfish May 14 '23 at 18:09
3

I am getting the titles of the windows to get the song information. Usually, the application name is displayed in the title, but when it is playing a song, the song name is shown. Here is a function that returns a list of all the window titles.

import ctypes


def get_titles():
    EnumWindows = ctypes.windll.user32.EnumWindows
    EnumWindowsProc = ctypes.WINFUNCTYPE(ctypes.c_bool, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int))
    GetWindowText = ctypes.windll.user32.GetWindowTextW
    GetWindowTextLength = ctypes.windll.user32.GetWindowTextLengthW
    IsWindowVisible = ctypes.windll.user32.IsWindowVisible
    
    titles = []
    def foreach_window(hwnd, lParam):
        if IsWindowVisible(hwnd):
            length = GetWindowTextLength(hwnd)
            buff = ctypes.create_unicode_buffer(length + 1)
            GetWindowText(hwnd, buff, length + 1)
            titles.append(buff.value)
        return True
    EnumWindows(EnumWindowsProc(foreach_window), 0)
    return titles
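
To act on those titles, one option is to filter them for text that marks unwanted media. A minimal sketch, assuming an advert shows up as a recognisable keyword in the window title; titles_containing and the keyword used below are hypothetical:

```python
def titles_containing(titles, keywords):
    """Return the window titles that contain any keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        title for title in titles
        if any(keyword in title.lower() for keyword in lowered)
    ]
```

For example, titles_containing(get_titles(), ['advertisement']) would return only the matching window titles.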
Alexander
0

I don't have an example of how to do it in Python, but I can point you in the right direction. What you want is to use the winrt Python module https://github.com/microsoft/xlang/blob/master/src/package/pywinrt/projection/readme.md to access the WinRT API https://learn.microsoft.com/en-us/uwp/api/windows.media?view=winrt-19041 . There is not much documentation for the winrt Python module, so you might have to dig through the WinRT API documentation to figure out how. Good luck!

user7788539