4

Getting the current mouse pointer position in Python is trivial, using the Windows API through ctypes. However, going from the mouse pointer's screen position (x,y) to the text cursor position of the current terminal window seems to be hugely difficult.

To make matters worse, the programmer community keeps confusing the mouse pointer position with the text cursor position. Historically there never was a mouse "cursor", so when people say "cursor" they should mean the text cursor, and not the other way around. Because of this error, Stack Overflow is full of questions and answers relating to "cursor", but seemingly none relate to getting the terminal shell's current character position. [The cursed cursor!]

To get the mouse pointer position (in screen coordinates):

from ctypes import windll, wintypes, byref
def get_cursor_pos():
    cursor = wintypes.POINT()
    windll.user32.GetCursorPos(byref(cursor))
    return (cursor.x, cursor.y)

while(1): print('{}\t\t\r'.format(get_cursor_pos()), end='')

I want a function that gives me the last position in terms of character row and column. Perhaps something like this:

def cpos(): 
    xy = here_be_magic()
    return xy

# Move the cursor to the top-left corner (ESC[H homes the cursor, it does not clear the screen):
print('\x1b[H', end='')
print('12345', end='', flush=True); xy=cpos(); print('( {},{})'.format(xy[0],xy[1]),end='', flush=True)

# 12345 (1,5)  # written on top of blank screen

How do I get the text cursor position in (row,column) inside my terminal?
(And without making any assumptions and having to write my own window manager?)

Ultimately I hope to use this to find the last cursor position in any terminal window (and possibly from any program?).
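For comparison, terminals that process VT escape sequences usually answer the "Device Status Report" query `ESC[6n` with a reply of the form `ESC[row;colR` (1-based). The following is only a minimal sketch of that alternative, not something from this thread: `parse_cpr` and `query_cursor` are hypothetical helper names, the reader assumes a Windows console with VT processing enabled, and it will block if the terminal never replies.

```python
import re
import sys

# Reply format of a cursor position report: ESC [ <row> ; <col> R
_CPR = re.compile(r"\x1b\[(\d+);(\d+)R")

def parse_cpr(reply):
    """Parse a cursor position report into a 1-based (row, col) tuple."""
    match = _CPR.search(reply)
    if match is None:
        raise ValueError("not a cursor position report: {!r}".format(reply))
    return int(match.group(1)), int(match.group(2))

def query_cursor():
    """Ask the terminal for the cursor position (Windows-only sketch)."""
    import msvcrt  # only available on Windows
    sys.stdout.write("\x1b[6n")
    sys.stdout.flush()
    reply = ""
    # Assumption: the terminal replies; otherwise this loop blocks.
    while not reply.endswith("R"):
        reply += msvcrt.getwch()
    return parse_cpr(reply)

if __name__ == "__main__" and sys.platform == "win32":
    print("12345", end="", flush=True)
    print(" {}".format(query_cursor()))
```

Note that this reports the position relative to the visible screen, not to the scroll-back buffer.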


Possibly related (but not useful) SO Questions:


UPDATE (2022-01-17)

Looking through the MS documentation, I am now convinced it should be possible to get this from the (older, non-VT-based) API call GetConsoleScreenBufferInfo, which is declared like this:

BOOL WINAPI GetConsoleScreenBufferInfo(
  _In_  HANDLE                      hConsoleOutput,            # A handle to the console screen buffer. The handle must have the GENERIC_READ access right. 
  _Out_ PCONSOLE_SCREEN_BUFFER_INFO lpConsoleScreenBufferInfo  # A pointer to a CONSOLE_SCREEN_BUFFER_INFO structure that receives the console screen buffer information.
);

typedef struct _CONSOLE_SCREEN_BUFFER_INFO {
  COORD      dwSize;                # contains the size of the console screen buffer, in character columns and rows.
  COORD      dwCursorPosition;      # contains the column and row coordinates of the cursor in the console screen buffer.
  WORD       wAttributes;           # Character attributes (divided into two classes: color and DBCS)
  SMALL_RECT srWindow;              # A SMALL_RECT structure that contains the console screen buffer coordinates of the upper-left and lower-right corners of the display window.
  COORD      dwMaximumWindowSize;   # A COORD structure that contains the maximum size of the console window, in character columns and rows, given the current screen buffer size and font and the screen size.
} CONSOLE_SCREEN_BUFFER_INFO;       # 

# Defines the coordinates of a character cell in a console screen buffer. 
# The origin of the coordinate system (0,0) is at the top, left cell of the buffer.

typedef struct _COORD {
  SHORT X;              # The horizontal coordinate or column value. The units depend on the function call.
  SHORT Y;              # The vertical coordinate or row value. The units depend on the function call.
} COORD, *PCOORD;


typedef struct _SMALL_RECT {
  SHORT Left;
  SHORT Top;
  SHORT Right;
  SHORT Bottom;
} SMALL_RECT;

So in light of this, I was thinking the following would work.

cls='\x1b[H'
from ctypes import windll, wintypes, byref
def cpos():
    cursor = wintypes._COORD(ctypes.c_short)
    windll.kernel32.GetConsoleScreenBufferInfo(byref(cursor))
    return (cursor.X, cursor.Y)

cpos()

# TypeError: '_ctypes.PyCSimpleType' object cannot be interpreted as an integer
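The `TypeError` above most likely has two causes: `wintypes._COORD(ctypes.c_short)` passes a ctypes *type* as the initial value of the `X` field, and `GetConsoleScreenBufferInfo` expects a console handle plus a pointer to a whole `CONSOLE_SCREEN_BUFFER_INFO`, not a bare `COORD`. A minimal sketch of the corrected call, assuming the structure layout quoted above and reusing the `_COORD` and `SMALL_RECT` types that `ctypes.wintypes` already provides (the error handling is my own addition):

```python
import sys
import ctypes
from ctypes import Structure, byref, wintypes

class CONSOLE_SCREEN_BUFFER_INFO(Structure):
    """Mirror of the C struct shown above."""
    _fields_ = [
        ("dwSize", wintypes._COORD),
        ("dwCursorPosition", wintypes._COORD),
        ("wAttributes", wintypes.WORD),
        ("srWindow", wintypes.SMALL_RECT),
        ("dwMaximumWindowSize", wintypes._COORD),
    ]

STD_OUTPUT_HANDLE = -11

def cpos():
    """Return the text cursor as (column, row) in screen-buffer coordinates."""
    if sys.platform != "win32":
        raise OSError("GetConsoleScreenBufferInfo is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.GetConsoleScreenBufferInfo.argtypes = [
        wintypes.HANDLE, ctypes.POINTER(CONSOLE_SCREEN_BUFFER_INFO)]
    kernel32.GetConsoleScreenBufferInfo.restype = wintypes.BOOL
    handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    csbi = CONSOLE_SCREEN_BUFFER_INFO()  # no constructor arguments needed
    if not kernel32.GetConsoleScreenBufferInfo(handle, byref(csbi)):
        raise ctypes.WinError(ctypes.get_last_error())
    return (csbi.dwCursorPosition.X, csbi.dwCursorPosition.Y)

if sys.platform == "win32":
    print(cpos())
```

The call fails (and raises) when standard output is redirected to a file or pipe, since the handle then no longer refers to a console screen buffer.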
not2qubit
  • Not sure how to do that with python but on Windows you can use UI Automation, like this: https://stackoverflow.com/questions/4665045/how-to-get-the-word-under-the-cursor-in-windows and with the text pattern, you can get current selection and bounding rects from text pattern: https://learn.microsoft.com/en-us/dotnet/api/system.windows.automation.text.textpatternrange.getboundingrectangles – Simon Mourier Jan 16 '22 at 18:33
  • Yeah, as always the MS docs are pretty useless for any practical purposes. I don't understand why they don't show pictures, when they talk about graphical entities. Just too much guesswork for my taste. So, sorry, was not very helpful, unless someone can come up with a working example. – not2qubit Jan 16 '22 at 18:53
  • My first guess, was that I could try to calculate where is the origin of the window and then perhaps use the font-size to figure out where things are. – not2qubit Jan 16 '22 at 18:58
  • `dwCursorPosition` is the text cursor position (not mouse cursor). – ssbssa Jan 17 '22 at 11:53

2 Answers

1

The problem was to locate the various Structure definitions. After having experimented significantly, I've got the following working solution.

#!/usr/bin/env python -u
# -*- coding: UTF-8 -*-
#------------------------------------------------------------------------------
from ctypes import windll, wintypes, Structure, c_short, c_ushort, byref, c_ulong
from readline import console

#------------------------------------------------
# Win32 API 
#------------------------------------------------
SHORT   = c_short
WORD    = c_ushort
DWORD   = c_ulong

STD_OUTPUT_HANDLE   = DWORD(-11)    # $CONOUT

# These are already defined, so no need to redefine.
COORD = wintypes._COORD
SMALL_RECT = wintypes.SMALL_RECT
CONSOLE_SCREEN_BUFFER_INFO = console.CONSOLE_SCREEN_BUFFER_INFO

#------------------------------------------------
# Main
#------------------------------------------------
wk32 = windll.kernel32

hSo = wk32.GetStdHandle(STD_OUTPUT_HANDLE)
GetCSBI = wk32.GetConsoleScreenBufferInfo

def cpos():
    csbi = CONSOLE_SCREEN_BUFFER_INFO()
    GetCSBI(hSo, byref(csbi))
    xy = csbi.dwCursorPosition
    return '({},{})'.format(xy.X,xy.Y)

cls='\x1b[H'
print('\n'*61)
print(cls+'12345', end='', flush=True); print(' {}'.format(cpos()), flush=True)

# 12345 (5,503)
not2qubit
  • BTW. [Here](https://stackoverflow.com/q/17993814/1147688) is an interesting way of using `unpack` for the structure. – not2qubit Jan 19 '22 at 16:18
1

The OP's solution fails in (my) Windows environment:

Python 3.8.6 (tags/v3.8.6:db45529, Sep 23 2020, 15:52:53) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import readline
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'readline'
>>>

and pip install readline returns error: this module is not meant to work on Windows (truncated).

I have found another solution for a pure Windows environment (without the GNU readline interface). The required structure definitions are borrowed from programtalk.com, see lines 1..91:

# from winbase.h
STDOUT = -11
STDERR = -12

from ctypes import (windll, byref, Structure, c_char, c_short, c_uint32,
  c_ushort, ArgumentError, WinError)

handles = {
    STDOUT: windll.kernel32.GetStdHandle(STDOUT),
    STDERR: windll.kernel32.GetStdHandle(STDERR),
}

SHORT = c_short
WORD = c_ushort
DWORD = c_uint32
TCHAR = c_char

class COORD(Structure):
    """struct in wincon.h"""
    _fields_ = [
        ('X', SHORT),
        ('Y', SHORT),
    ]

class  SMALL_RECT(Structure):
    """struct in wincon.h."""
    _fields_ = [
        ("Left", SHORT),
        ("Top", SHORT),
        ("Right", SHORT),
        ("Bottom", SHORT),
    ]

class CONSOLE_SCREEN_BUFFER_INFO(Structure):
    """struct in wincon.h."""
    _fields_ = [
        ("dwSize", COORD),
        ("dwCursorPosition", COORD),
        ("wAttributes", WORD),
        ("srWindow", SMALL_RECT),
        ("dwMaximumWindowSize", COORD),
    ]
    def __str__(self):
        """Get string representation of console screen buffer info."""
        return '(%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d)' % (
            self.dwSize.Y, self.dwSize.X
            , self.dwCursorPosition.Y, self.dwCursorPosition.X
            , self.wAttributes
            , self.srWindow.Top, self.srWindow.Left, self.srWindow.Bottom, self.srWindow.Right
            , self.dwMaximumWindowSize.Y, self.dwMaximumWindowSize.X
        )

def GetConsoleScreenBufferInfo(stream_id=STDOUT):
    """Get console screen buffer info object."""
    handle = handles[stream_id]
    csbi = CONSOLE_SCREEN_BUFFER_INFO()
    success = windll.kernel32.GetConsoleScreenBufferInfo(
        handle, byref(csbi))
    if not success:
        raise WinError()
    return csbi

### end of https://programtalk.com/vs4/python/14134/dosage/dosagelib/colorama.py/

clrcur = '\x1b[H'  # move cursor to the top left corner
clrscr = '\x1b[2J' # clear entire screen (? moving cursor ?)
# '\x1b[H\x1b[2J'
from ctypes import windll, wintypes, byref
def get_cursor_pos():
    cursor = wintypes.POINT()
    aux = windll.user32.GetCursorPos(byref(cursor))
    return (cursor.x, cursor.y)

mouse_pos = get_cursor_pos()
# print('mouse at {}'.format(mouse_pos))

def cpos():
    csbi = GetConsoleScreenBufferInfo()
    return '({},{})'.format(csbi.dwCursorPosition.X, csbi.dwCursorPosition.Y)
    
print('12345', end='', flush=True)
print(' {}'.format(cpos()), flush=True)

# an attempt to resolve discrepancy between buffer and screen size
# in the following code snippet:
import sys
if len(sys.argv) > 1 and len(sys.argv[1]) > 0:
    csbi = GetConsoleScreenBufferInfo()
    keybd_pos = (csbi.dwCursorPosition.X, csbi.dwCursorPosition.Y)
    print('\nkbd buffer at {}'.format(keybd_pos))
    import os
    screensize = os.get_terminal_size()
    keybd_poss = ( csbi.dwCursorPosition.X,  
                   min( csbi.dwSize.Y,
                        csbi.dwCursorPosition.Y,
                        csbi.dwMaximumWindowSize.Y,
                        screensize.lines))
    # screen line number is incorrectly computed if terminal is scroll-forwarded
    print('kbd screen at {} (disputable? derived from the following data:)'
          .format(keybd_poss))
    print( 'csbi.dwSize   ', (csbi.dwSize.X, csbi.dwSize.Y))
    print( 'terminal_size ', (screensize.columns, screensize.lines))
    print( 'csbi.dwMaxSize', (csbi.dwMaximumWindowSize.X, csbi.dwMaximumWindowSize.Y))
    print( 'csbi.dwCurPos ', (csbi.dwCursorPosition.X, csbi.dwCursorPosition.Y))

Output: .\SO\70732748.py

12345 (5,526)

From line 113, there is an attempt to resolve the discrepancy between buffer and screen size (unsuccessful: the absolute screen line number is incorrectly computed, at least if the terminal is scroll-forwarded). The difference does not appear in Windows Terminal, where buffer height == window height always holds, and all those calculations are needless…

Example: .\SO\70732748.py x

12345 (5,529)

kbd buffer at (0, 530)
kbd screen at (0, 36) (disputable? derived from the following data:)
csbi.dwSize    (89, 1152)
terminal_size  (88, 36)
csbi.dwMaxSize (89, 37)
csbi.dwCurPos  (0, 530)
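One way the buffer-versus-screen offset might be resolved is through `srWindow`, which according to the structure description holds the buffer coordinates of the visible window's upper-left and lower-right corners. Subtracting its `Left`/`Top` from `dwCursorPosition` should then give the position relative to the visible window. This is a sketch only: `window_pos` is a hypothetical helper, and the values below are made-up stand-ins rather than output captured from a real console.

```python
from types import SimpleNamespace

def window_pos(csbi):
    """Convert the buffer-relative cursor position to a window-relative one.

    `csbi` is any object with the CONSOLE_SCREEN_BUFFER_INFO field names,
    for example the structure returned by GetConsoleScreenBufferInfo().
    Returns a 0-based (column, row) inside the visible window.
    """
    return (csbi.dwCursorPosition.X - csbi.srWindow.Left,
            csbi.dwCursorPosition.Y - csbi.srWindow.Top)

# Hypothetical stand-in: cursor at buffer row 530, window scrolled to row 494.
fake_csbi = SimpleNamespace(
    dwCursorPosition=SimpleNamespace(X=0, Y=530),
    srWindow=SimpleNamespace(Left=0, Top=494, Right=88, Bottom=530),
)
print(window_pos(fake_csbi))  # (0, 36)
```

If the window has been scrolled back so that the cursor is outside the visible region, the result falls outside the window's row range, which is at least detectable by comparing against `srWindow.Bottom - srWindow.Top`.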
not2qubit
JosefZ
  • You don't install `readline` because it's built-in. You just import it. – not2qubit Jan 20 '22 at 23:37
  • Also there is no discrepancy, as the buffer size is independent from screen size, in powershell (old & *core*). However, in *Windows Terminal* it is now slightly different and matching, so that `buffer height = window height`, as explained [here](https://github.com/PowerShell/PowerShell/issues/16732). – not2qubit Jan 20 '22 at 23:44
  • @not2qubit: answer updated; I knew different behaviour `cmd` vs `wt` already; thanks iac. for the link… – JosefZ Jan 21 '22 at 15:33
  • Yeah, that is a `Py3.8` issue. There you need to use `pyreadline` like this: `from pyreadline import Readline; readline = Readline()`. In `py3.10`, it just works! – not2qubit Jan 21 '22 at 18:09
  • PS. I removed that silly copyright message from the code snippet, because it is simply not valid, as the code is basically just MS Windows API examples. – not2qubit Jan 21 '22 at 18:12