47

I would like to get the active window on the screen using Python.

For example, the management interface of a router, where you enter the username and password as admin.

That admin interface is what I want to capture using Python, to automate the entry of the username and password.

What imports would I need in order to do this?

geekosaur
Vinod K
  • What operating system? Are you asking about active desktop windows, or browser windows? Do you need any active window, or are you only trying to automate the management interface of your router? – Bryan Oakley Apr 22 '12 at 15:47
  • See also [ubuntu - How can I use xdotool from within a python module/script? - Stack Overflow](https://stackoverflow.com/questions/9681959/how-can-i-use-xdotool-from-within-a-python-module-script#comment123175660_9681959): there are solutions that use subprocess to call the xdotool executable, and solutions that use libxdo – user202729 Oct 23 '21 at 15:39
  • Same question on Ask Ubuntu: [python3 - Get Window Title or Application Name with Python - Ask Ubuntu](https://askubuntu.com/questions/555201/get-window-title-or-application-name-with-python) – user202729 Jun 03 '23 at 02:56

12 Answers

39

On Windows, you can use the Python for Windows extensions (http://sourceforge.net/projects/pywin32/). In Python 2:

from win32gui import GetWindowText, GetForegroundWindow
print GetWindowText(GetForegroundWindow())

The code below is for Python 3:

from win32gui import GetWindowText, GetForegroundWindow
print(GetWindowText(GetForegroundWindow()))

(Found this on http://scott.sherrillmix.com/blog/programmer/active-window-logger/)

Gruber
  • I run this script through a "shortcut" to my script (in order to call it with a keyboard shortcut), so it prints "shortcut". The active window is a PDF opened in Acrobat, but when I run the script, the active window is the shortcut (I want to get the file name of the opened PDF). Any idea how to solve this? – JinSnow Nov 23 '16 at 15:00
  • Just to let everyone know, pywin32 can now be installed with pip: `pip install pywin32` – Yusof Bandar Jan 28 '19 at 12:35
  • This is not printing anything (Python 3 version). It just makes a newline... help please? – HackerDaGreat57 Sep 02 '22 at 20:46
26

Thanks goes to the answer by Nuno André, who showed how to use ctypes to interact with Windows APIs. I have written an example implementation using his hints.

The ctypes library is included with Python since v2.5, which means that almost every user has it. And it's a way cleaner interface than old and dead libraries like win32gui (last updated in 2017 as of this writing). ((Update in late 2020: The dead win32gui library has come back to life with a rename to pywin32, so if you want a maintained library, it's now a valid option again. But that library is 6% slower than my code.))

Documentation is here: https://docs.python.org/3/library/ctypes.html (You must read its usage help if you wanna write your own code, otherwise you can cause segmentation fault crashes, hehe.)

Basically, ctypes includes bindings for the most common Windows DLLs. Here is how you can retrieve the title of the foreground window in pure Python, with no external libraries needed! Just the built-in ctypes! :-)

The coolest thing about ctypes is that you can Google any Windows API for anything you need, and if you want to use it, you can do it via ctypes!

Python 3 Code:

from typing import Optional
from ctypes import windll, create_unicode_buffer

def getForegroundWindowTitle() -> Optional[str]:
    hWnd = windll.user32.GetForegroundWindow()
    length = windll.user32.GetWindowTextLengthW(hWnd)
    buf = create_unicode_buffer(length + 1)
    windll.user32.GetWindowTextW(hWnd, buf, length + 1)
    
    # 1-liner alternative: return buf.value if buf.value else None
    if buf.value:
        return buf.value
    else:
        return None

Performance is extremely good: 0.01 MILLISECONDS on my computer (0.00001 seconds).

Will also work on Python 2 with very minor changes. If you're on Python 2, I think you only have to remove the type annotations (from typing import Optional and -> Optional[str]). :-)

Enjoy!

Win32 Technical Explanations:

The length variable is the length of the actual text in UTF-16 (Windows Wide "Unicode") CHARACTERS. (It is NOT the number of BYTES.) We have to add + 1 to add room for the null terminator at the end of C-style strings. If we don't do that, we would not have enough space in the buffer to fit the final real character of the actual text, and Windows would truncate the returned string (it does that to ensure that it fits the super important final string Null-terminator).

The create_unicode_buffer function allocates room for that many UTF-16 CHARACTERS.

Most (or all? always read Microsoft's MSDN docs!) Windows APIs related to Unicode text take the buffer length as CHARACTERS, NOT as bytes.
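The character-versus-byte distinction can be checked on any platform, because create_unicode_buffer is not Windows-specific. The following is only a minimal sketch: the sample title and the helper name buffer_for are made up for illustration, and the size of one wide character differs between platforms (2 bytes on Windows, typically 4 on Linux and macOS).

```python
import ctypes


def buffer_for(text_length: int):
    """Allocate a wide-character buffer with room for the null terminator."""
    return ctypes.create_unicode_buffer(text_length + 1)


title = "Router login"  # hypothetical window title
buf = buffer_for(len(title))
buf.value = title  # fits: len(title) characters plus the terminator

chars = len(buf)                    # capacity in CHARACTERS (what the W APIs expect)
size_in_bytes = ctypes.sizeof(buf)  # capacity in BYTES (platform dependent)

print(chars, size_in_bytes, ctypes.sizeof(ctypes.c_wchar))
```

The value to pass as the length argument of a W function is the character count (chars here), not size_in_bytes.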

Also look closely at the function calls. Some end in W (such as GetWindowTextLengthW). This stands for "Wide string", which is the Windows name for Unicode strings. It's very important that you do those W calls to get proper Unicode strings (with international character support).

PS: Windows has been using Unicode for a long time. I know for a fact that Windows 10 is fully Unicode and only wants the W function calls. I don't know the exact cutoff date when older versions of Windows used other multi-byte string formats, but I think it was before Windows Vista, and who cares? Old Windows versions (even 7 and 8.1) are dead and unsupported by Microsoft.

Again... enjoy! :-)

UPDATE in Late 2020, Benchmark vs the pywin32 library:

import time

import win32ui

from typing import Optional
from ctypes import windll, create_unicode_buffer

def getForegroundWindowTitle() -> Optional[str]:
    hWnd = windll.user32.GetForegroundWindow()
    length = windll.user32.GetWindowTextLengthW(hWnd)
    buf = create_unicode_buffer(length + 1)
    windll.user32.GetWindowTextW(hWnd, buf, length + 1)

    return buf.value if buf.value else None

def getForegroundWindowTitle_Win32UI() -> Optional[str]:
    # WARNING: This code sometimes throws an exception saying
    # "win32ui.error: No window is is in the foreground."
    # which is total nonsense. My function doesn't fail that way.
    return win32ui.GetForegroundWindow().GetWindowText()

iterations = 1_000_000

start_time = time.time()
for x in range(iterations):
    foo = getForegroundWindowTitle()
elapsed1 = time.time() - start_time
print("Elapsed 1:", elapsed1, "seconds")

start_time = time.time()
for x in range(iterations):
    foo = getForegroundWindowTitle_Win32UI()
elapsed2 = time.time() - start_time
print("Elapsed 2:", elapsed2, "seconds")

win32ui_pct_slower = ((elapsed2 / elapsed1) - 1) * 100
print("Win32UI library is", win32ui_pct_slower, "percent slower.")

Typical result after doing multiple runs on an AMD Ryzen 3900x:

My function: 4.5769994258880615 seconds

Win32UI library: 4.8619983196258545 seconds

Win32UI library is 6.226762715455125 percent slower.

However, the difference is small, so you may want to use the library now that it has come back to life (it had previously been dead since 2017). But you're going to have to deal with that library's weird "no window is in the foreground" exception, which my code doesn't suffer from (see the code comments in the benchmark code).
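If you want to repeat the comparison yourself, the timing loop and the percentage arithmetic can be pulled into two small helpers. This is just a sketch: the helper names time_calls and percent_slower are made up here, and time.perf_counter is swapped in for time.time because it is the clock intended for measuring short intervals.

```python
import time
from typing import Callable


def time_calls(func: Callable[[], object], iterations: int) -> float:
    """Return the wall-clock seconds taken by `iterations` calls of func."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    return time.perf_counter() - start


def percent_slower(baseline: float, other: float) -> float:
    """How much slower `other` is than `baseline`, in percent."""
    return ((other / baseline) - 1) * 100


# The two timings quoted above, fed through the same formula as the benchmark.
reported = percent_slower(4.5769994258880615, 4.8619983196258545)
print(reported)
```

On Windows you would pass the two title functions to time_calls instead of hard-coding the timings.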

Either way... enjoy!

Mitch McMabers
  • Using perf_counter (without any precompilation), this function was consistently 30-50% slower than the win32gui library – David Jun 16 '20 at 08:34
  • @David That is false. My code is 6% faster than win32gui (which has come back to life and is renamed to pywin32). I have updated my post with a benchmark to compare so that you can see for yourself. Perhaps you forgot that my code does two things (getting active window and then getting the title of that window), whereas that library only does 1 action at a time and you need to call two functions in it to achieve what my code does. – Mitch McMabers Nov 16 '20 at 18:12
  • This works pretty well though. – Mujtaba Dec 11 '22 at 14:54
24

The following script should work on Linux, Windows and Mac. So far it has only been tested on Linux (Ubuntu MATE 15.10).

Prerequisites

For Linux:

Install wnck (sudo apt-get install python-wnck on Ubuntu, see libwnck.)

For Windows:

Make sure win32gui is available

For Mac:

Make sure AppKit is available

The script

#!/usr/bin/env python

"""Find the currently active window."""

import logging
import sys

logging.basicConfig(format='%(asctime)s %(levelname)s %(message)s',
                    level=logging.DEBUG,
                    stream=sys.stdout)


def get_active_window():
    """
    Get the currently active window.

    Returns
    -------
    string :
        Name of the currently active window.
    """
    active_window_name = None
    if sys.platform in ['linux', 'linux2']:
        # Alternatives: https://unix.stackexchange.com/q/38867/4784
        try:
            import wnck
        except ImportError:
            logging.info("wnck not installed")
            wnck = None
        if wnck is not None:
            screen = wnck.screen_get_default()
            screen.force_update()
            window = screen.get_active_window()
            if window is not None:
                pid = window.get_pid()
                with open("/proc/{pid}/cmdline".format(pid=pid)) as f:
                    active_window_name = f.read()
        else:
            try:
                from gi.repository import Gtk, Wnck
                gi = "Installed"
            except ImportError:
                logging.info("gi.repository not installed")
                gi = None
            if gi is not None:
                Gtk.init([])  # necessary if not using a Gtk.main() loop
                screen = Wnck.Screen.get_default()
                screen.force_update()  # recommended per Wnck documentation
                active_window = screen.get_active_window()
                # get_active_window() can return None, as handled in the wnck branch
                if active_window is not None:
                    pid = active_window.get_pid()
                    with open("/proc/{pid}/cmdline".format(pid=pid)) as f:
                        active_window_name = f.read()
    elif sys.platform in ['Windows', 'win32', 'cygwin']:
        # https://stackoverflow.com/a/608814/562769
        import win32gui
        window = win32gui.GetForegroundWindow()
        active_window_name = win32gui.GetWindowText(window)
    elif sys.platform in ['Mac', 'darwin']:
        # https://stackoverflow.com/a/373310/562769
        from AppKit import NSWorkspace
        active_window_name = (NSWorkspace.sharedWorkspace()
                              .activeApplication()['NSApplicationName'])
    else:
        print("sys.platform={platform} is unknown. Please report."
              .format(platform=sys.platform))
        print(sys.version)
    return active_window_name

print("Active window: %s" % str(get_active_window()))
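One thing to be aware of in the Linux branches: /proc/<pid>/cmdline separates the arguments with NUL bytes, so the string read from it contains embedded "\0" characters. A small helper can turn it into a normal argument list. This is only a sketch; the name split_cmdline and the sample contents are made up for illustration, and empty arguments are dropped for simplicity.

```python
def split_cmdline(raw: str) -> list:
    """Split the contents of /proc/<pid>/cmdline into a list of arguments."""
    # Arguments are NUL-separated and the data usually ends with a NUL.
    return [part for part in raw.split("\0") if part]


sample = "/usr/bin/python3\0script.py\0--verbose\0"  # hypothetical contents
args = split_cmdline(sample)

# The first element is the program that owns the active window.
print(args[0] if args else None)
```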
user202729
Martin Thoma
  • I only tested it for Linux (Ubuntu Mate). Please add a comment if you have another system and let me know of the output. – Martin Thoma Apr 05 '16 at 07:24
  • On Lubuntu, 14.04 LTS, I get 7 of the warning `Wnck-WARNING **: Unhandled action type _OB_WM_ACTION_UNDECORATE`, apparently on the screen.force_update() line, but also get a proper result in the end. Additionally, that block seems to return the window name (e.g. __x-terminal-emulator__), by contrast from memory I would expect the Win32 to return the window caption, which is the typically dynamically changing text. – brezniczky Jun 18 '16 at 00:48
  • For wnck in Python3 see: https://stackoverflow.com/a/43349245 – jl005 Jan 29 '21 at 17:42
  • tested on Windows 11 , works fine. e.g. In PyCharm opened a project file to run it , it shows sth like ` - ` e.g. `Test - test.py` – Paul Lam Dec 29 '21 at 16:57
13

For Linux users: all the answers provided required additional modules like "wx" that produced numerous errors when installing ("pip" failed on build), but I was able to modify this solution quite easily (-> original source). There were bugs in the original (a Python TypeError on the regex).

import subprocess
import re

def get_active_window_title():
    # Ask the root window which window is currently active.
    root = subprocess.Popen(['xprop', '-root', '_NET_ACTIVE_WINDOW'], stdout=subprocess.PIPE)
    stdout, stderr = root.communicate()

    # Raw bytes patterns avoid invalid-escape warnings for \w and \(.
    m = re.search(rb'^_NET_ACTIVE_WINDOW.* ([\w]+)$', stdout)
    if m is not None:
        window_id = m.group(1)
        window = subprocess.Popen(['xprop', '-id', window_id, 'WM_NAME'], stdout=subprocess.PIPE)
        stdout, stderr = window.communicate()
    else:
        return None

    match = re.match(rb"WM_NAME\(\w+\) = (?P<name>.+)$", stdout)
    if match is not None:
        # The title is returned as bytes, with the surrounding quotes removed.
        return match.group("name").strip(b'"')

    return None

if __name__ == "__main__":
    print(get_active_window_title())

The advantage is that it works without additional modules. If you want it to work across multiple platforms, it's just a matter of changing the command and regex strings to get the data you want on each platform (using the standard sys.platform if/else detection shown in the answer above).
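Keeping the parsing separate from the subprocess calls makes it possible to check the regexes without an X server. The sketch below assumes that xprop prints lines of the form shown in the two sample strings; those samples are written by hand for illustration, not captured from a real session, and the function names are made up.

```python
import re
from typing import Optional

ACTIVE_RE = re.compile(rb"^_NET_ACTIVE_WINDOW.* (\w+)$", re.MULTILINE)
NAME_RE = re.compile(rb'^WM_NAME\(\w+\) = "(?P<name>.*)"$', re.MULTILINE)


def parse_active_window_id(xprop_output: bytes) -> Optional[bytes]:
    """Extract the window id from `xprop -root _NET_ACTIVE_WINDOW` output."""
    m = ACTIVE_RE.search(xprop_output)
    return m.group(1) if m else None


def parse_wm_name(xprop_output: bytes) -> Optional[str]:
    """Extract the title from `xprop -id <id> WM_NAME` output as text."""
    m = NAME_RE.search(xprop_output)
    return m.group("name").decode("utf-8", errors="replace") if m else None


# Hand-written samples of the assumed output format.
root_sample = b"_NET_ACTIVE_WINDOW(WINDOW): window id # 0x4c00003\n"
name_sample = b'WM_NAME(STRING) = "/tmp/a.pdf"\n'

print(parse_active_window_id(root_sample), parse_wm_name(name_sample))
```

Decoding as UTF-8 with errors="replace" is a design choice for this sketch; the encoding xprop actually uses can depend on the property type and locale.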

On a side note: import wnck only works with Python 2.x when installed with "sudo apt-get install python-wnck". Since I was using Python 3.x, the only option was pypie, which I have not tested. Hope this helps someone else.

karuhanga
James Nelson
13

There's really no need to import any external dependency for tasks like this. Python comes with a pretty neat foreign function interface - ctypes, which allows for calling C shared libraries natively. It even includes specific bindings for the most common Win32 DLLs.

E.g. to get the PID of the foreground window:

import ctypes
from ctypes import wintypes

user32 = ctypes.windll.user32

h_wnd = user32.GetForegroundWindow()
pid = wintypes.DWORD()
user32.GetWindowThreadProcessId(h_wnd, ctypes.byref(pid))
print(pid.value)
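A slightly more defensive variant is sketched below. It is not part of the snippet above: the function name get_foreground_pid is made up, the Windows-only imports are done lazily so the module can still be imported on other platforms, and the argument and return types are declared explicitly so ctypes does not fall back to plain C ints.

```python
import sys
from typing import Optional


def get_foreground_pid() -> Optional[int]:
    """Return the PID that owns the foreground window, or None off Windows."""
    if sys.platform != "win32":
        return None

    import ctypes
    from ctypes import wintypes  # imported lazily; only needed on Windows

    # A private handle to user32, so the shared windll.user32 is not modified.
    user32 = ctypes.WinDLL("user32")
    user32.GetForegroundWindow.restype = wintypes.HWND
    user32.GetWindowThreadProcessId.argtypes = [
        wintypes.HWND, ctypes.POINTER(wintypes.DWORD)]
    user32.GetWindowThreadProcessId.restype = wintypes.DWORD

    h_wnd = user32.GetForegroundWindow()
    if not h_wnd:
        return None  # there may be no foreground window at this moment

    pid = wintypes.DWORD()
    user32.GetWindowThreadProcessId(h_wnd, ctypes.byref(pid))
    return pid.value


print(get_foreground_pid())
```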
Nuno André
  • **Thanks a lot, this is way neater than the dead win32gui library (last active in 2017).** With your code I get the same functions directly via ctypes! I'm going to post a separate answer with my own code for getting the window title of the foreground window. But credit goes to you for sharing the general method! – Mitch McMabers Oct 12 '19 at 14:03
  • @MitchMcMabers You may want to check [this implementation of some WinAPI methods](https://github.com/enthought/pywin32-ctypes). Also, if you want to call some COM objects with `ctypes`, [this library](https://github.com/enthought/comtypes) is well worth a look. – Nuno André Oct 13 '19 at 00:48
  • That `pywin32-ctypes` library is very nice and tiny and only focused on loading DLLs and resource files, which means it's not bloated. And `comtypes` is famous and vital for COM work. Good tips for people! :-) – Mitch McMabers Oct 13 '19 at 02:30
  • On Ubuntu 16.04 `ctypes` imports OK, but then using: `from ctypes import wintypes` generates error lines: `File "/usr/lib/python2.7/ctypes/wintypes.py", line 19, in ` followed by: `class VARIANT_BOOL(_SimpleCData):` and finally by: `ValueError: _type_ 'v' not supported`. I'll use the answer I just upvoted from 17 to 18 votes. – WinEunuuchs2Unix May 01 '21 at 16:25
  • can this be used to turn it into a better format(not pid number but a real name like outlook- mail)? – Augurkenplukker12 May 23 '21 at 20:35
1

In Linux under X11:

import os

xdo_window_id = os.popen('xdotool getactivewindow').read()
print('xdo_window_id:', xdo_window_id)

will print the active window ID in decimal format:

xdo_window_id: 67113707

Note xdotool must be installed first:

sudo apt install xdotool

Note wmctrl uses hexadecimal format for window ID.
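If you need to match the two, the decimal ID can be converted in Python. A minimal sketch follows; the helper names are made up, and the zero-padded eight-digit form is an assumption about how wmctrl -l prints IDs, so comparing the numeric values is the safer option.

```python
def xdotool_id_to_hex(window_id: str) -> str:
    """Convert xdotool's decimal window ID to a 0x-prefixed, 8-digit hex string."""
    return format(int(window_id.strip()), "#010x")


def same_window(xdotool_id: str, wmctrl_id: str) -> bool:
    """Compare a decimal xdotool ID with a hexadecimal wmctrl ID numerically."""
    return int(xdotool_id.strip()) == int(wmctrl_id.strip(), 16)


# The ID printed in the example above, including the trailing newline from read().
print(xdotool_id_to_hex("67113707\n"))
```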

WinEunuuchs2Unix
1

This only works on Windows:

import psutil
import win32gui
import win32process

def get_active_executable_name():
    try:
        # GetWindowThreadProcessId returns (thread_id, process_id)
        process_id = win32process.GetWindowThreadProcessId(
            win32gui.GetForegroundWindow()
        )
        # Drop the file extension, e.g. "chrome.exe" -> "chrome"
        return ".".join(psutil.Process(process_id[-1]).name().split(".")[:-1])
    except Exception as exception:
        return None

I recommend checking out this answer to make it work on Linux, Mac and Windows.

1

In Linux:

If you already have xdotool installed, you can just use:

from subprocess import run
def get__focused_window():
    return run(['xdotool', 'getwindowfocus', 'getwindowpid', 'getwindowname'], capture_output=True).stdout.decode('utf-8').split()
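Note that .split() also breaks the window title apart at every space. If you would rather keep the PID and the title as two values, you can parse the output line by line. This is only a sketch: it assumes xdotool prints the PID on the first line and the title on the second (worth confirming on your system), and the function name and sample output are made up.

```python
from typing import Optional, Tuple


def parse_pid_and_name(output: str) -> Optional[Tuple[int, str]]:
    """Split xdotool output into (pid, window title), or None if malformed."""
    lines = output.splitlines()
    if len(lines) < 2 or not lines[0].strip().isdigit():
        return None
    return int(lines[0]), lines[1]


sample = "4242\nMozilla Firefox\n"  # hypothetical xdotool output
print(parse_pid_and_name(sample))
```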

While I was writing this answer I've realised that there were also:

So, I've decided to mention them here, too.

Giorgos Xou
1

Using python-xlib (X on Linux only)

from Xlib import X, display
display = display.Display()
window = display.get_input_focus().focus
print(window.get_wm_name())

There's a caveat.

Using xwininfo -tree -root when there's a Zathura window open shows the following:

0x4c00003 "/tmp/a.pdf": ("org.pwmt.zathura" "Zathura")  958x352+960+18  +960+18
    1 child:
    0x4c00004 (has no name): ()  1x1+-1+-1  +960+18

When the Zathura window has just been opened in my window manager (xmonad), the inner window is the one that is focused.

It may be necessary to recursively traverse the parents to search for the first one with nontrivial content as follows:

window = display.get_input_focus().focus
while window!=0:
    print(window.get_wm_name())
    print(window.get_wm_class())
    window=window.query_tree().parent
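If you only want the first ancestor that actually has a name, the loop can be wrapped in a function. The sketch below is duck-typed: it only relies on get_wm_name() and query_tree().parent, the two calls used above, and stops as soon as it reaches something that is not a window object (such as the 0 reported as the parent of the root window). The function name is made up.

```python
def first_named_ancestor(window):
    """Walk up from `window`; return (window, name) for the first named one."""
    # Stop at anything that is not a window object (e.g. the integer 0).
    while hasattr(window, "get_wm_name"):
        name = window.get_wm_name()
        if name:
            return window, name
        window = window.query_tree().parent
    return None, None


# Usage with python-xlib (requires a running X server):
#   from Xlib import display
#   focus = display.Display().get_input_focus().focus
#   window, name = first_named_ancestor(focus)
```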

Using ewmh (X on Linux only)

Same as above, but add:

from ewmh import EWMH
ewmh = EWMH(display)
print(ewmh.getWmName(window))

Personally I find this a bit more reliable than the above, but same caveat applies (you may need to traverse the parents).

Using pywinctl (cross-platform)

import pywinctl
print(pywinctl.getActiveWindowTitle())

If it does not work, you may want to find out how to enable EWMH. For example for XMonad: https://github.com/Kalmat/PyWinCtl/issues/64

I guess get_input_focus is equivalent to xdotool getwindowfocus and getActiveWindow is equivalent to xdotool getactivewindow:

$ xdotool getwindowfocus
60817420
$ xdotool getactivewindow
Your windowmanager claims not to support _NET_ACTIVE_WINDOW, so the attempt to query the active window aborted.
xdo_get_active_window reported an error

Using pyautogui (Windows only)

Same API as above.

Refer to https://stackoverflow.com/a/71357737/5267751

Note that internally this imports the function from pygetwindow anyway, and at the moment it only supports Windows.

Using pygetwindow (Windows only)

Also mostly same API as above.

import pygetwindow as gw
gw.getActiveWindow().title
user202729
0

I faced the same problem on Linux (Lubuntu 20). My solution is to use wmctrl, executed as a shell command from Python.

First, install wmctrl: sudo apt install wmctrl

Then add this code:

import os
os.system('wmctrl -a "Mozilla Firefox"')
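The same call can also be made with subprocess, which avoids shell quoting issues and lets you check whether wmctrl is installed and whether the command succeeded. This is only a sketch, and the function names are made up for illustration.

```python
import shutil
import subprocess


def build_wmctrl_command(title: str) -> list:
    """Build the argument list for activating a window by (part of) its title."""
    return ["wmctrl", "-a", title]


def activate_window(title: str) -> bool:
    """Try to bring the window to the front; False if wmctrl is missing or fails."""
    if shutil.which("wmctrl") is None:
        return False
    result = subprocess.run(build_wmctrl_command(title), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    print(activate_window("Mozilla Firefox"))
```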

Reference for wmctrl: https://askubuntu.com/questions/21262/shell-command-to-bring-a-program-window-in-front-of-another

-1

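Just wanted to add this in case it helps. I have a simple function of a few lines in my program (software for my PC's lighting):

# Assumes the pywin32 functions used in the win32gui answer above
from win32gui import GetWindowText, GetForegroundWindow

def isRunning(process_name):
   # True if the given name appears in the foreground window's title
   foregroundWindow = GetWindowText(GetForegroundWindow())
   return process_name in foregroundWindow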
Amy Gamble
-10

Try using wxPython:

import wx
wx.GetActiveWindow()
Maehler
絢瀬絵里
  • I don't think that's what the OP is asking for. According to the wxPython docs, GetActiveWindow will "Get the currently active window of this application". Thus, it's only going to return other wxPython windows. – Bryan Oakley Apr 22 '12 at 15:45