
Introduction

Currently, I'm working for a customer who wants to automate some actions inside their accounting application.

Problem

I searched for a crate that can read the screen of another window and post messages to it, such as key presses or clicks, but I didn't find anything.

Question

Does anyone know a crate for interacting with another window in Rust? For the interaction I need to read the window's screen contents, post key messages, and post click messages to this window.

ThrowsError

2 Answers


There are various crates that let you simulate user input (e.g. mouse and keyboard input), even in a cross-platform fashion, and there are crates for taking screenshots.

Apart from that, there is also autopilot, which lets you do both.

Here is an example of capturing the main screen using autopilot and image (for actually storing the image):

use image::{GenericImageView, png::PNGEncoder};

fn main() {
    let bitmap = autopilot::bitmap::capture_screen().expect("Failed to capture main screen.");

    let mut buf = Vec::new();
    let encoder = PNGEncoder::new(
        &mut buf
    );
    encoder
        .encode(
            &bitmap.image.as_rgb8().unwrap(),
            bitmap.image.width(),
            bitmap.image.height(),
            image::ColorType::RGB(8),
        )
        .expect("Failed to encode png.");

    std::fs::write("test.png", buf).expect("Failed to write screenshot to disk.");
}

and here is an example for mouse input (move cursor in a circle):

const MARGIN: f64 = 10.0;
const MILLIS: u64 = 10;

struct Center {
    x: f64,
    y: f64,
}

fn main() {
    circle_mouse().expect("Unable to move mouse");
}

fn circle_mouse() -> Result<(), autopilot::mouse::MouseError> {
    let screen_size = autopilot::screen::size();
    let scoped_height = screen_size.height / 2.0 - MARGIN;
    let scoped_width = screen_size.width / 2.0 - MARGIN;
    // Use the smaller half-extent as the radius so the circle stays on screen.
    let scoped_radius = scoped_height.min(scoped_width);

    let center = Center { x: scoped_width, y: scoped_height };

    for i in 0..360 {
        let x = (i as f64 / 180.0 * std::f64::consts::PI).cos() * scoped_radius;
        let y = (i as f64 / 180.0 * std::f64::consts::PI).sin() * scoped_radius;
        autopilot::mouse::move_to(autopilot::geometry::Point::new(
            center.x + x,
            center.y + y,
        ))?;
        std::thread::sleep(std::time::Duration::from_millis(MILLIS));
    }

    Ok(())
}

If you only want to capture the area of the window, you can do so by taking a screenshot of the full desktop and then cropping it to the window. On Windows, you can get the rect of a given window using GetWindowRect. Here is a snippet for getting the window rect using its ID or name.
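
The cropping step can be sketched without any platform API. This is a minimal sketch, assuming the screenshot is a tightly packed RGBA8 buffer in row-major order. The `PixelRect` type and the `crop_rgba` function are hypothetical names introduced for illustration and are not part of autopilot or image. The window rect is clamped to the screen, because the part of a window that lies off-screen is not contained in the screenshot.

```rust
/// Hypothetical rectangle in screen pixels (left/top inclusive, right/bottom
/// exclusive), mirroring the fields of the Win32 RECT filled by GetWindowRect.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PixelRect {
    left: i32,
    top: i32,
    right: i32,
    bottom: i32,
}

/// Crops a tightly packed RGBA8 screenshot (4 bytes per pixel, row-major)
/// to the part of `window` that lies on screen.
/// Returns the cropped bytes plus the cropped width and height in pixels,
/// or `None` if the window does not overlap the screenshot at all.
fn crop_rgba(
    screenshot: &[u8],
    screen_width: usize,
    screen_height: usize,
    window: PixelRect,
) -> Option<(Vec<u8>, usize, usize)> {
    assert_eq!(screenshot.len(), screen_width * screen_height * 4);

    // Clamp the window rect to the screen bounds.
    let left = window.left.max(0) as usize;
    let top = window.top.max(0) as usize;
    let right = (window.right.max(0) as usize).min(screen_width);
    let bottom = (window.bottom.max(0) as usize).min(screen_height);
    if left >= right || top >= bottom {
        return None;
    }

    let (width, height) = (right - left, bottom - top);
    let mut out = Vec::with_capacity(width * height * 4);
    for row in top..bottom {
        // Copy one row of the window out of the full-screen buffer.
        let start = (row * screen_width + left) * 4;
        out.extend_from_slice(&screenshot[start..start + width * 4]);
    }
    Some((out, width, height))
}
```

The returned width and height can differ from the window size when the window is partly off-screen, so they should be used when encoding the cropped bytes.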

Update due to request in comment

Here is a sample showing how to capture only the specific portion of the screen that contains a given window (this works only on Windows, and the window must be fully visible on screen):

use autopilot::geometry::{Point, Rect, Size};
use std::{ffi::OsString, iter::once, os::windows::prelude::OsStrExt, ptr::null};
use windows_sys::Win32::{
    Foundation::{HWND, RECT},
    UI::WindowsAndMessaging::{FindWindowW, GetWindowRect},
};

fn main() {
    // The title of the window (is shown when hovering over the window in the taskbar):
    let window_name = OsString::from("autopilot");
    let window_name: Vec<u16> = window_name
        .as_os_str()
        .encode_wide()
        .chain(once(0))
        .collect();
    let id: HWND = unsafe { FindWindowW(null(), window_name.as_ptr()) };
    let mut rect = RECT {
        left: 0,
        top: 0,
        right: 0,
        bottom: 0,
    };
    if id != 0 && unsafe { GetWindowRect(id, &mut rect) } != 0 {
        /* println!(
            "HWND: {}\nLocation: {} {}\nSize: {} {}",
            id,
            rect.left,
            rect.top,
            rect.right - rect.left,
            rect.bottom - rect.top
        ); */

        let bitmap = autopilot::bitmap::capture_screen_portion(Rect::new(
            Point::new(rect.left as f64, rect.top as f64),
            Size::new(
                (rect.right - rect.left) as f64,
                (rect.bottom - rect.top) as f64,
            ),
        ))
        .expect("Failed to capture screen portion.");
        bitmap
            .image
            .save("screen_portion.png")
            .expect("Failed to write image to disk.");
    }
}

This sample has the following dependencies:

autopilot = "0.4.0"
windows-sys = { version = "0.36.1", features = ["Win32_Foundation", "Win32_UI_WindowsAndMessaging"] }
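
FindWindowW expects the window title as a null-terminated UTF-16 string, which is what the `encode_wide().chain(once(0))` chain in the sample builds. As a sketch, the same conversion can be written as a small helper on a `&str` using only `str::encode_utf16`, which also compiles on non-Windows targets. The name `to_wide` is hypothetical and not a windows-sys function.

```rust
use std::iter::once;

/// Converts a Rust string to a null-terminated UTF-16 buffer, the form
/// expected by wide-character Win32 functions such as FindWindowW.
/// `to_wide` is a hypothetical helper name used for illustration.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(once(0)).collect()
}
```

Keep the returned `Vec<u16>` alive for as long as the pointer obtained from `as_ptr()` is in use.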

Note that this includes the invisible window borders, which you probably do not want. You can use DwmGetWindowAttribute to correct for the visual offset like this:

use autopilot::geometry::{Point, Rect, Size};
use std::{
    ffi::{c_void, OsString},
    iter::once,
    mem::size_of,
    os::windows::prelude::OsStrExt,
    ptr::null,
};
use windows_sys::Win32::{
    Foundation::{HWND, RECT},
    Graphics::Dwm::{DwmGetWindowAttribute, DWMWA_EXTENDED_FRAME_BOUNDS},
    UI::WindowsAndMessaging::{FindWindowW, GetWindowRect},
};

fn main() {
    // The title of the window (is shown when hovering over the window in the taskbar):
    let window_name = OsString::from("autopilot");
    let window_name: Vec<u16> = window_name
        .as_os_str()
        .encode_wide()
        .chain(once(0))
        .collect();
    let id: HWND = unsafe { FindWindowW(null(), window_name.as_ptr()) };
    let mut rect = RECT {
        left: 0,
        top: 0,
        right: 0,
        bottom: 0,
    };
    if id != 0 && unsafe { GetWindowRect(id, &mut rect) } != 0 {
        /* println!(
            "Window:\nHWND: {}\nLocation: {} {}\nSize: {} {}",
            id,
            rect.left,
            rect.top,
            rect.right - rect.left,
            rect.bottom - rect.top
        ); */

        let frame = Box::new(RECT {
            left: 0,
            top: 0,
            right: 0,
            bottom: 0,
        });
        let frame_ptr = Box::into_raw(frame);
        let _res = unsafe {
            DwmGetWindowAttribute(
                id,
                DWMWA_EXTENDED_FRAME_BOUNDS,
                frame_ptr as *mut c_void,
                size_of::<RECT>() as u32,
            )
        };

        let frame = unsafe { Box::from_raw(frame_ptr) };

        let border = RECT {
            left: frame.left - rect.left,
            top: frame.top - rect.top,
            right: rect.right - frame.right,
            bottom: rect.bottom - frame.bottom,
        };

        let adjusted_rect = RECT {
            left: rect.left + border.left,
            top: rect.top + border.top,
            right: rect.right - border.right,
            bottom: rect.bottom - border.bottom,
        };

        // Window must be fully on screen for capture.
        if rect.left >= 0 && rect.top >= 0 {
            let bitmap = autopilot::bitmap::capture_screen_portion(Rect::new(
                Point::new(adjusted_rect.left as f64, adjusted_rect.top as f64),
                Size::new(
                    (adjusted_rect.right - adjusted_rect.left) as f64,
                    (adjusted_rect.bottom - adjusted_rect.top) as f64,
                ),
            ))
            .expect("Failed to capture screen portion.");
            bitmap
                .image
                .save("screen_portion.png")
                .expect("Failed to write image to disk.");
        }
    }
}

This sample uses the following dependencies:

autopilot = "0.4.0"
windows-sys = { version = "0.36.1", features = ["Win32_Foundation", "Win32_Graphics_Dwm", "Win32_UI_WindowsAndMessaging"] }
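
The border correction in this sample is plain rectangle arithmetic. The following sketch isolates it in two functions so that it can be checked without calling the Win32 API. `Bounds`, `border_widths` and `strip_borders` are hypothetical names introduced for illustration.

```rust
/// Hypothetical stand-in for the Win32 RECT, in screen coordinates.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    left: i32,
    top: i32,
    right: i32,
    bottom: i32,
}

/// Width of the invisible border on each side, given the outer rect
/// (as filled by GetWindowRect) and the visible frame
/// (as filled by DwmGetWindowAttribute with DWMWA_EXTENDED_FRAME_BOUNDS).
fn border_widths(window: Bounds, frame: Bounds) -> Bounds {
    Bounds {
        left: frame.left - window.left,
        top: frame.top - window.top,
        right: window.right - frame.right,
        bottom: window.bottom - frame.bottom,
    }
}

/// Shrinks the outer window rect by the given border widths.
fn strip_borders(window: Bounds, border: Bounds) -> Bounds {
    Bounds {
        left: window.left + border.left,
        top: window.top + border.top,
        right: window.right - border.right,
        bottom: window.bottom - border.bottom,
    }
}
```

Stripping borders computed this way always yields the frame rect itself, so the rect filled in by DwmGetWindowAttribute could also be used directly as the capture area. The explicit border widths are still useful if the offsets are needed elsewhere, for example to translate click positions.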

Update due to another comment

Yes, you can use the PostMessageW function from the WinAPI in Rust, too. Here is a simple sample that contains the basic idea of the linked sample:

use std::{ffi::OsString, iter::once, os::windows::prelude::OsStrExt, ptr::null};
use windows_sys::Win32::{
    Foundation::HWND,
    UI::{
        Input::KeyboardAndMouse::VK_LEFT,
        WindowsAndMessaging::{FindWindowW, PostMessageW},
    },
};

// Window message identifiers from the Win32 API.
const KEY_DOWN: u32 = 256; // WM_KEYDOWN (0x0100)
const KEY_UP: u32 = 257; // WM_KEYUP (0x0101)

fn main() {
    // The title of the window (is shown when hovering over the window in the taskbar):
    let window_name = OsString::from("autopilot");
    let window_name: Vec<u16> = window_name
        .as_os_str()
        .encode_wide()
        .chain(once(0))
        .collect();
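    let id: HWND = unsafe { FindWindowW(null(), window_name.as_ptr()) };
    // FindWindowW returns 0 (a null handle) if no window with this title exists.
    if id == 0 {
        panic!("Window not found.");
    }

    let res = unsafe { PostMessageW(id, KEY_DOWN, VK_LEFT.into(), 0) };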
    if res == 0 {
        panic!("Failed to post message to window.");
    }

    std::thread::sleep(std::time::Duration::from_millis(500));
    let res = unsafe { PostMessageW(id, KEY_UP, VK_LEFT.into(), 0) };
    if res == 0 {
        panic!("Failed to post message to window.");
    }
}

It depends on:

windows-sys = { version = "0.36.1", features = ["Win32_Foundation", "Win32_UI_Input_KeyboardAndMouse", "Win32_UI_WindowsAndMessaging"] }
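
The question also asks about click messages. A click can be posted in the same way using the WM_LBUTTONDOWN and WM_LBUTTONUP messages, where lParam packs the cursor position in client coordinates of the window (x in the low 16 bits, y in the high 16 bits). The following is a sketch of that packing only. The helper names `make_lparam` and `click_messages` are hypothetical, and whether the target application reacts to posted mouse messages depends on the application.

```rust
// Window message identifiers and key-state flag from the Win32 API.
const WM_LBUTTONDOWN: u32 = 0x0201;
const WM_LBUTTONUP: u32 = 0x0202;
const MK_LBUTTON: usize = 0x0001;

/// Packs client-area coordinates into an lParam value:
/// x goes into the low 16 bits, y into the high 16 bits.
/// Hypothetical helper, equivalent in spirit to the MAKELPARAM macro.
fn make_lparam(x: i16, y: i16) -> isize {
    ((((y as u16) as u32) << 16) | ((x as u16) as u32)) as isize
}

/// Builds the (message, wParam, lParam) triples for a left click
/// at the given client coordinates: button down, then button up.
fn click_messages(x: i16, y: i16) -> [(u32, usize, isize); 2] {
    let lparam = make_lparam(x, y);
    [
        // While the button is held, wParam reports MK_LBUTTON.
        (WM_LBUTTONDOWN, MK_LBUTTON, lparam),
        // On release, the left button is no longer down.
        (WM_LBUTTONUP, 0, lparam),
    ]
}
```

Each triple maps to the last three arguments of `PostMessageW(id, msg, wparam, lparam)` as used in the keyboard sample.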

If you want to detect certain UI elements on the screen and get their position, you would probably need to implement this yourself with pattern matching or computer vision, using something like opencv with the screenshot taken beforehand as input.
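
To illustrate the idea, here is a deliberately naive template search in plain Rust. It slides a small grayscale template over a grayscale screenshot and returns the position with the smallest sum of absolute differences. This is a sketch and not how opencv implements matching; the `Gray` type and the `find_template` function are hypothetical, and real UI detection would need something more robust against scaling and rendering differences.

```rust
/// Hypothetical 8-bit grayscale image, stored row-major.
struct Gray {
    width: usize,
    height: usize,
    pixels: Vec<u8>,
}

/// Returns (x, y, score) of the best match of `template` inside `image`,
/// where score is the sum of absolute pixel differences (0 = exact match).
/// Returns `None` if the template is empty or larger than the image.
fn find_template(image: &Gray, template: &Gray) -> Option<(usize, usize, u64)> {
    if template.width == 0
        || template.height == 0
        || template.width > image.width
        || template.height > image.height
    {
        return None;
    }

    let mut best: Option<(usize, usize, u64)> = None;
    for y in 0..=(image.height - template.height) {
        for x in 0..=(image.width - template.width) {
            // Compare the template against the image patch at (x, y).
            let mut score = 0u64;
            for ty in 0..template.height {
                for tx in 0..template.width {
                    let a = image.pixels[(y + ty) * image.width + x + tx] as i64;
                    let b = template.pixels[ty * template.width + tx] as i64;
                    score += (a - b).abs() as u64;
                }
            }
            // Keep the first position with the lowest score seen so far.
            if best.map_or(true, |(_, _, s)| score < s) {
                best = Some((x, y, score));
            }
        }
    }
    best
}
```

The returned position is in screenshot coordinates, so it would still have to be offset by the window origin before being used for input.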

frankenapps
  • Thank you for your answer! Do you know if it is possible to take a screenshot of only one window, rather than of the whole screen? And do you know if it is possible to check a single pixel on the screen without taking a screenshot? – ThrowsError Aug 02 '22 at 13:32
  • For screenshotting only the window, I already laid out the path for how to do it in my original answer, but I have now added a code sample, too. See the updated answer. If you only want to get the color of a single pixel on screen, you can use `autopilot`'s [`get_color`](https://docs.rs/autopilot/latest/autopilot/screen/fn.get_color.html). – frankenapps Aug 03 '22 at 06:36
  • I really appreciate your update and help! I see that the keystroke message is sent to the computer and not to a single window. I found code that does what I was looking for, but it is written in C++: https://gist.github.com/MathieuSoysal/d9f18cf2232477b28c7a51b402dc8eb5#file-pos_example-cpp-L59-L60 Do you think it is possible to do the same thing in Rust? And do you know if it is possible to get a pixel color or the window contents for a window that is not displayed on the screen? – ThrowsError Aug 04 '22 at 02:59
  • This is a different question from the one you asked initially, but I'll do my best to answer it. I do not know how to retrieve the color of a pixel or a screenshot of a window outside of the screen bounds, and I am not sure whether there is any way to do it. Regarding the linked sample, I added a sample that translates the basic idea to Rust. – frankenapps Aug 04 '22 at 06:33
  • Thank you for everything! I found a solution that captures only the window directly (and works with a reduced window). But this solution is written in Python: https://github.com/May2Beez/NosGame_by_May2Bee/blob/322aa9eab3ce3408e71148903b6de8bd2efe7c31/WindowCapture.py#L46-L81 Do you think it is possible to do this in Rust? – ThrowsError Aug 04 '22 at 15:58
  • As far as I can tell, this should only work if you control the GUI yourself. Did you test it on the program in question? – frankenapps Aug 04 '22 at 17:28

This is not a very easy thing to do... doing this might take a lot of time.

If you go for @frankenapps' answer, I would recommend that you use something like opencv or some kind of AI to recognize the UI and then click/do any action depending on it.

There is also another way to do this kind of task. You can use Frida, which lets you attach to the program's functions and change values. You would have to do some analysis on the binary to understand it, though.

Here is a nice example to understand how it works. And this is the Rust binding.

Both ways are going to take a while but I just wanted to add another solution. Have fun!

Ricardo