I want to call the native Windows CopyFileEx function from within a Rust program but I am having trouble getting a working example of LPPROGRESS_ROUTINE. I am a Java programmer learning Rust and the lack of OOP paradigms is a challenge for me. In Java, I would use a class that implemented an interface as my callback. However, it looks like the lpprogressroutine parameter is a pointer to a function, rather than a polymorphic object. But this should be fine; I can just declare a function and create a pointer to it.

I want the callback function to call a different function with additional arguments, so I created a wrapper struct to contain those additional arguments and the callback function:

use jni::objects::{JObject, JString, JValueGen};
use jni::strings::JavaStr;
use jni::sys::jint;
use jni::JNIEnv;
use windows::core::*;
use windows::Win32::Foundation::HANDLE;
use windows::Win32::Storage::FileSystem::{
    CopyFileExA, LPPROGRESS_ROUTINE, LPPROGRESS_ROUTINE_CALLBACK_REASON,
};

struct Callback<'a> {
    env: JNIEnv<'a>,
    ext_callback: JObject<'a>,
}

impl<'a> Callback<'a> {
    fn new(env: JNIEnv<'a>, ext_callback: JObject<'a>) -> Self {
        Callback {
            env: env,
            ext_callback: ext_callback,
        }
    }

    unsafe extern "system" fn invoke(
        &mut self,
        totalfilesize: i64,
        totalbytestransferred: i64,
        streamsize: i64,
        streambytestransferred: i64,
        _dwstreamnumber: u32,
        _dwcallbackreason: LPPROGRESS_ROUTINE_CALLBACK_REASON,
        _hsourcefile: HANDLE,
        _hdestinationfile: HANDLE,
        _lpdata: *const ::core::ffi::c_void,
    ) -> u32 {
        let arr = [
            JValueGen::Long(totalfilesize),
            JValueGen::Long(totalbytestransferred),
            JValueGen::Long(streamsize),
            JValueGen::Long(streambytestransferred),
        ];
        self.env
            .call_method(&self.ext_callback, "onProgressEvent", "(JJJJ)V", &arr)
            .expect("Java callback failed");
        return 0;
    }
}

Now, in order to access the env and ext_callback fields that I defined on the struct, I had to add the &mut self parameter to my invoke function. I suspect this already changes the function signature so that it won't work as an LPPROGRESS_ROUTINE, but perhaps not.

Continuing the endeavor, I create a method which will construct my Callback implementation and invoke the CopyFileExA function with a pointer to my method. This is where I am having trouble. I cannot figure out how to create a pointer to the callback method, since it is not static:

pub extern "system" fn Java_com_nhbb_util_natives_WindowsCopy_copy<'local>(
    mut env: JNIEnv<'local>,
    _object: JObject<'local>,
    source: JString<'local>,
    dest: JString<'local>,
    flags: jint,
    ext_callback: JObject<'local>,
) {
    let source_jstr: JavaStr = env.get_string(&source).expect("Invalid source string");
    let dest_jstr: JavaStr = env.get_string(&dest).expect("Invalid dest string");

    let source_arr = source_jstr.get_raw();
    let dest_arr = dest_jstr.get_raw();

    let source = source_arr as *const u8;
    let dest = dest_arr as *const u8;

    let flags: u32 = flags.try_into().unwrap();

    let callback = Callback::new(env, ext_callback);

    unsafe {
        CopyFileExA(
            PCSTR(source),
            PCSTR(dest),
            LPPROGRESS_ROUTINE::Some(callback::invoke),
    //                               ^^^^^^^^ use of undeclared crate or module `callback`
            None,
            None,
            flags,
        );
    }
}

I think I am just struggling due to a lack of experience with this new language. Am I taking the correct approach by using a struct? Is there a better way to do this?

Botje
Cardinal System
  • Pass `self` as `lpData` and then get it from there in the callback. – Solomon Ucko Apr 06 '23 at 20:14
  • It appears that this is a Rust program within a Java program or something. I'd recommend removing the JNI stuff from your question. Currently it does not contain a [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example). – PitaJ Apr 06 '23 at 20:15
  • Seems like CopyFileExA allows user-submitted data through the `lpData` argument, meaning that this is possible. Take a look at https://stackoverflow.com/questions/32270030/how-do-i-convert-a-rust-closure-to-a-c-style-callback – virchau13 Apr 07 '23 at 03:34

1 Answer

The important thing to understand here is that Callback::invoke is not an ordinary function that is already bound to a Callback object. Rust has no implicit binding of a method to a struct instance; Callback::invoke takes the Callback object as its first argument. This means it does not, and never will, work with LPPROGRESS_ROUTINE directly. LPPROGRESS_ROUTINE expects a static function with exactly the documented parameter list.
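As a rough illustration of that point, here is a minimal sketch using a hypothetical `Counter` type (it is not part of the question's code). When a method is turned into a function pointer, the receiver shows up as an ordinary first parameter, so the pointer type has one more argument than a C API would expect:

```rust
// Hypothetical type, used only to illustrate how methods look as pointers.
struct Counter {
    count: u32,
}

impl Counter {
    fn add(&mut self, amount: u32) -> u32 {
        self.count += amount;
        self.count
    }
}

fn demo() -> u32 {
    // The receiver is part of the pointer type; nothing is bound implicitly.
    let add_fn: fn(&mut Counter, u32) -> u32 = Counter::add;

    let mut counter = Counter { count: 1 };
    add_fn(&mut counter, 2); // equivalent to counter.add(2)
    counter.add(3)
}
```

Here `demo()` ends with a count of 6. There is no way to obtain a two-argument-free `fn(u32) -> u32` from `Counter::add` that remembers a particular `counter`, which is why a C-style callback needs a separate user-data pointer.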

So the solution is to create a static function that receives the Callback object through its lpdata argument. That is exactly what the lpdata argument is for.

Note that because the lpdata argument gets passed to C and back, it will be a raw pointer and you will require unsafe to use it.

Here is a demonstration of how this would work. Note that I stripped out all the JNI stuff, but the same principle should work with JNI as well.

use std::ffi::c_void;

use encoding_rs::WINDOWS_1252;
use windows::core::*;
use windows::Win32::Foundation::HANDLE;
use windows::Win32::Storage::FileSystem::{
    CopyFileExA, LPPROGRESS_ROUTINE, LPPROGRESS_ROUTINE_CALLBACK_REASON,
};

struct Callback<'a> {
    ext_callback: &'a mut dyn FnMut(i64, i64, i64, i64) -> u32,
}

impl<'a> Callback<'a> {
    fn new(ext_callback: &'a mut dyn FnMut(i64, i64, i64, i64) -> u32) -> Self {
        Callback { ext_callback }
    }

    unsafe extern "system" fn invoke(
        totalfilesize: i64,
        totalbytestransferred: i64,
        streamsize: i64,
        streambytestransferred: i64,
        _dwstreamnumber: u32,
        _dwcallbackreason: LPPROGRESS_ROUTINE_CALLBACK_REASON,
        _hsourcefile: HANDLE,
        _hdestinationfile: HANDLE,
        lpdata: *const c_void,
    ) -> u32 {
        let this_ptr: *mut Self = lpdata.cast_mut().cast();
        let this = this_ptr.as_mut().unwrap();

        (this.ext_callback)(
            totalfilesize,
            totalbytestransferred,
            streamsize,
            streambytestransferred,
        )
    }
}

fn main() {
    let mut ext_callback =
        |totalfilesize, totalbytestransferred, streamsize, streambytestransferred| {
            println!(
                "Progress: File: {}/{}, Stream Size: {}, Stream bytes transferred: {}",
                totalbytestransferred, totalfilesize, streamsize, streambytestransferred,
            );
            0
        };

    let mut callback = Callback::new(&mut ext_callback);

    // IMPORTANT: Rust strings are UTF-8, but `CopyFileExA` requires an ANSI/Windows-1252 string!
    // Use `encoding_rs` to convert between the two.
    let mut source = WINDOWS_1252.encode("file_a.txt").0.into_owned();
    let mut dest = WINDOWS_1252.encode("file_b.txt").0.into_owned();
    // Add null termination
    source.push(0);
    dest.push(0);

    unsafe {
        CopyFileExA(
            PCSTR::from_raw(source.as_ptr()),
            PCSTR::from_raw(dest.as_ptr()),
            LPPROGRESS_ROUTINE::Some(Callback::invoke),
            Some(((&mut callback) as *mut Callback).cast()),
            None,
            0,
        );
    }
}
Progress: File: 0/23, Stream Size: 23, Stream bytes transferred: 0
Progress: File: 23/23, Stream Size: 23, Stream bytes transferred: 23
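
The same pattern can be tried without the Windows API. The following is only a sketch: `fake_copy` and `ProgressRoutine` are made-up stand-ins for `CopyFileExA` and `LPPROGRESS_ROUTINE`, with a shortened parameter list, so that the user-data round trip can be run on any platform.

```rust
use std::ffi::c_void;

// Hypothetical stand-in for LPPROGRESS_ROUTINE: a plain function pointer
// that receives an opaque user-data pointer on every call.
type ProgressRoutine = unsafe extern "C" fn(done: i64, total: i64, data: *mut c_void) -> u32;

// Hypothetical stand-in for CopyFileExA: reports progress three times and
// hands `data` back to `routine` untouched.
fn fake_copy(total: i64, routine: ProgressRoutine, data: *mut c_void) {
    for done in [0, total / 2, total] {
        // SAFETY: the caller promises that `data` is what `routine` expects.
        unsafe { routine(done, total, data) };
    }
}

struct Callback<'a> {
    ext_callback: &'a mut dyn FnMut(i64, i64) -> u32,
}

impl<'a> Callback<'a> {
    // No `self` parameter, so the signature matches `ProgressRoutine`.
    unsafe extern "C" fn invoke(done: i64, total: i64, data: *mut c_void) -> u32 {
        // SAFETY: `data` was created from `&mut Callback` in `run` below and
        // is still alive while `fake_copy` is executing.
        let this = unsafe { &mut *data.cast::<Self>() };
        (this.ext_callback)(done, total)
    }
}

fn run() -> Vec<(i64, i64)> {
    let mut seen = Vec::new();
    let mut record = |done: i64, total: i64| -> u32 {
        seen.push((done, total));
        0
    };
    let mut callback = Callback { ext_callback: &mut record };
    fake_copy(10, Callback::invoke, (&mut callback as *mut Callback).cast());
    seen
}
```

Calling `run()` collects the three progress reports `(0, 10)`, `(5, 10)` and `(10, 10)`. The important constraint carries over to the real API: the `Callback` value must outlive the call that receives the pointer.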

FURTHER REMARKS, IMPORTANT:

The encoding of PCSTR depends on the locale, and should therefore not be used as shown previously. While most western locales do indeed use Windows-1252, many others do not.

It is therefore recommended to use PCWSTR instead, which is globally unambiguous.

According to the Windows API docs, a PCWSTR is encoded as UTF-16LE.

This is how the previous code would be rewritten for it:

use std::ffi::c_void;
use std::iter;

use windows::core::*;
use windows::Win32::Foundation::HANDLE;
use windows::Win32::Storage::FileSystem::{
    CopyFileExW, LPPROGRESS_ROUTINE, LPPROGRESS_ROUTINE_CALLBACK_REASON,
};

struct Callback<'a> {
    ext_callback: &'a mut dyn FnMut(i64, i64, i64, i64) -> u32,
}

impl<'a> Callback<'a> {
    fn new(ext_callback: &'a mut dyn FnMut(i64, i64, i64, i64) -> u32) -> Self {
        Callback { ext_callback }
    }

    unsafe extern "system" fn invoke(
        totalfilesize: i64,
        totalbytestransferred: i64,
        streamsize: i64,
        streambytestransferred: i64,
        _dwstreamnumber: u32,
        _dwcallbackreason: LPPROGRESS_ROUTINE_CALLBACK_REASON,
        _hsourcefile: HANDLE,
        _hdestinationfile: HANDLE,
        lpdata: *const c_void,
    ) -> u32 {
        let this_ptr: *mut Self = lpdata.cast_mut().cast();
        let this = this_ptr.as_mut().unwrap();

        (this.ext_callback)(
            totalfilesize,
            totalbytestransferred,
            streamsize,
            streambytestransferred,
        )
    }
}

fn main() {
    let mut ext_callback =
        |totalfilesize, totalbytestransferred, streamsize, streambytestransferred| {
            println!(
                "Progress: File: {}/{}, Stream Size: {}, Stream bytes transferred: {}",
                totalbytestransferred, totalfilesize, streamsize, streambytestransferred,
            );
            0
        };

    let mut callback = Callback::new(&mut ext_callback);

    // IMPORTANT: Rust strings are UTF-8, but `CopyFileExW`
    // requires UTF-16 with null termination.
    let source: Vec<u16> = "file__a.txt"
        .encode_utf16()
        .chain(iter::once(0)) // null termination
        .collect();
    let dest: Vec<u16> = "file__b.txt"
        .encode_utf16()
        .chain(iter::once(0)) // null termination
        .collect();

    unsafe {
        CopyFileExW(
            PCWSTR::from_raw(source.as_ptr()),
            PCWSTR::from_raw(dest.as_ptr()),
            LPPROGRESS_ROUTINE::Some(Callback::invoke),
            Some(((&mut callback) as *mut Callback).cast()),
            None,
            0,
        );
    }
}
Progress: File: 0/30, Stream Size: 30, Stream bytes transferred: 0
Progress: File: 30/30, Stream Size: 30, Stream bytes transferred: 30
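
If the UTF-16 conversion is needed in several places, it could be factored into a small helper. This is just a sketch; `to_wide` is a name I made up, not something provided by the `windows` crate:

```rust
/// Encodes `s` as UTF-16 code units followed by a terminating 0,
/// which is the layout a `PCWSTR` parameter points at.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16()
        .chain(std::iter::once(0)) // null termination
        .collect()
}
```

The returned `Vec<u16>` has to stay alive for as long as the pointer obtained from `as_ptr()` is in use. Note that a string containing an interior NUL character would be cut off at that point by the API, so inputs from untrusted sources may need an extra check.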
Finomnis
  • ANSI isn't necessarily Windows-1252, that's just the most common configuration. Either [only use ASCII (which should always work the same on all of them)](https://learn.microsoft.com/en-us/windows/win32/intl/code-pages); check what the current code page is using [`AreFileApisANSI`](https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-arefileapisansi), [`GetACP`](https://learn.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getacp), and [`GetOEMCP`](https://learn.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getoemcp); ... – Solomon Ucko Apr 07 '23 at 11:18
  • ... [use a manifest to request UTF-8](https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page), or use the `W` versions of most functions (except some internet stuff, for which `A` is the primary). – Solomon Ucko Apr 07 '23 at 11:22
  • For literals, use [`w!`](https://microsoft.github.io/windows-docs-rs/doc/windows/macro.w.html). For any `&str`, you can use [`HSTRING::from`](https://microsoft.github.io/windows-docs-rs/doc/windows/core/struct.HSTRING.html#impl-From%3C%26str%3E-for-HSTRING), or [`encode_utf16`](https://doc.rust-lang.org/std/primitive.str.html#method.encode_utf16) + [`Iterator::chain`](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.chain)`(std::iter::once(0))` + [`Iterator::collect`](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.collect) into Vec, or similar. – Solomon Ucko Apr 07 '23 at 21:53
  • Isn't `PCWSTR` just the type of a pointer to a null-terminated UTF-16 string that can contain unpaired surrogates? – Solomon Ucko Apr 07 '23 at 21:54
  • Vice-versa, I think: UTF-16 is a subset of `PWSTR`. https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/a66edeb1-52a0-4d64-a93b-2f5c833d7d92#gt_fd33af2e-e1ce-4f8e-a706-f9fb8123f9b0 defines "Unicode character" as "a 16-bit UTF-16 code unit". [Unicode 5.0.0](https://www.unicode.org/versions/Unicode5.0.0/), [section 2.7 "Unicode Strings"](http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf#page=31) says: "a Unicode 16-bit string is an ordered sequence of 16-bit code units [...] Unicode 16-bit strings [...] are not necessarily well-formed UTF-16 sequences" – Solomon Ucko Apr 08 '23 at 21:08
  • @SolomonUcko I think this is the important part: `Unless otherwise specified, all Unicode strings follow the UTF-16LE encoding scheme with no Byte Order Mark (BOM)` (https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/a66edeb1-52a0-4d64-a93b-2f5c833d7d92#gt_fd33af2e-e1ce-4f8e-a706-f9fb8123f9b0) – Finomnis Apr 09 '23 at 06:58
  • @SolomonUcko Updated my answer to include a section about `PCWSTR`. – Finomnis Apr 09 '23 at 13:07