
This appears to partially work, but I cannot get the string value to print:

pub fn test() {
    let mut buf: Vec<u16> = vec![0; 64];
    let mut sz: DWORD = 0;
    unsafe {
        advapi32::GetUserNameW(buf.as_mut_ptr(), &mut sz);
    }
    let str1 = OsString::from_wide(&buf).into_string().unwrap();
    println!("Here: {} {}", sz, str1);
}

Prints:

Here: 10

But I expect it to also print the username:

Here: 10 <username>

As a test, the C version

TCHAR buf[100];
DWORD sz;
GetUserName(buf, &sz);

seems to populate buf fine.

Shepmaster
Delta_Fore

1 Answer


GetUserName

You should re-read the API documentation for GetUserName to recall how the arguments work:

lpnSize [in, out]

On input, this variable specifies the size of the lpBuffer buffer, in TCHARs. On output, the variable receives the number of TCHARs copied to the buffer, including the terminating null character. If lpBuffer is too small, the function fails and GetLastError returns ERROR_INSUFFICIENT_BUFFER. This parameter receives the required buffer size, including the terminating null character.

TL;DR:

  • On input: caller tells the API how many spaces the buffer has.
  • On success: API tells the caller how many spaces were used.
  • On failure: API tells the caller how many spaces were needed.
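To make the three cases concrete, here is a minimal sketch that imitates the lpnSize contract in plain Rust so it runs on any platform. `fake_get_user_name` and `FAKE_NAME` are hypothetical stand-ins invented for illustration; they are not part of advapi32 or any real API, and the real function reports failure details through GetLastError rather than a bool.

```rust
// Hypothetical stand-in for GetUserNameW so the contract can be run anywhere.
// `FAKE_NAME` and `fake_get_user_name` are assumptions, not a real API.
const FAKE_NAME: &str = "IEUser";

/// Mimics the `lpnSize` contract: `size` is the capacity on input and the
/// used (or required) length, including the trailing NUL, on output.
fn fake_get_user_name(buf: &mut [u16], size: &mut u32) -> bool {
    let mut wide: Vec<u16> = FAKE_NAME.encode_utf16().collect();
    wide.push(0); // the terminating NUL counts toward the size
    let needed = wide.len();
    // Never trust the caller's claim beyond the slice we were actually given.
    let capacity = (*size as usize).min(buf.len());
    *size = needed as u32;
    if capacity < needed {
        return false; // failure: `size` now holds the required length
    }
    buf[..needed].copy_from_slice(&wide);
    true // success: `size` now holds the number of units written
}

fn main() {
    let mut buf = vec![0u16; 64];

    // The mistake from the question: claiming the buffer has zero slots.
    let mut size = 0;
    assert!(!fake_get_user_name(&mut buf, &mut size));
    println!("failed, needs {} units", size); // failed, needs 7 units

    // Reporting the real capacity on input lets the call succeed.
    let mut size = buf.len() as u32;
    assert!(fake_get_user_name(&mut buf, &mut size));
    println!("ok, wrote {} units", size); // ok, wrote 7 units
}
```

Note that the failing call leaves the buffer untouched and only updates the size, which matches the symptom in the question.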

C version

This has a fixed-size stack-allocated array of 100 TCHARs.

This code is broken and unsafe because sz is uninitialized. Whatever garbage value it happens to hold is what the API takes as the buffer size, so the API may write far more characters than the 100 the buffer can actually hold. If the username is over 100 characters, you've just introduced a security hole into your program.

Rust version

The Rust code is broken in a much better way. sz is set to zero, which means "you may write zero entries of data", so the API writes nothing. Thus, the Vec buffer stays full of zeros and the resulting string consists only of NUL characters, which print as nothing. Because the buffer is reported as too small to receive the username, GetUserNameW fails and sets sz to the number of characters that the buffer needs to have allocated.
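Strictly speaking, a zeroed buffer does not decode to a zero-length string; it decodes to a run of NUL characters that simply do not show up when printed. A small sketch of that, using the portable `String::from_utf16` in place of the Windows-only `OsString::from_wide` (the helper name `decode_whole_buffer` is made up for this example):

```rust
/// Decodes a UTF-16 buffer the way the question does: the whole buffer,
/// regardless of how much of it was actually written.
fn decode_whole_buffer(buf: &[u16]) -> String {
    String::from_utf16(buf).expect("buffer is not valid UTF-16")
}

fn main() {
    // Same shape as the question's buffer: 64 zeroed code units, never written.
    let buf = vec![0u16; 64];
    let s = decode_whole_buffer(&buf);

    println!("length: {}", s.len()); // length: 64
    println!("all NUL: {}", s.chars().all(|c| c == '\0')); // all NUL: true
    println!("visible text: {:?}", s.trim_end_matches('\0')); // visible text: ""
}
```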

What to do

One "fix" would be to set sz to the length of your array. However, any length picked in advance is likely to either over- or under-allocate the buffer.

If you are ok with a truncated string (and I'm not sure if TCHAR strings can be split arbitrarily, I know UTF-8 cannot), then it would be better to use a fixed-size array like the C code.
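On the question of splitting: when the data is UTF-16, cutting at an arbitrary code unit can land in the middle of a surrogate pair. The sketch below shows this with the standard library only; `truncate_utf16` is a hypothetical helper written for this example, not something WinAPI or the winapi crates provide.

```rust
use std::string::FromUtf16Error;

/// Decodes only the first `n` UTF-16 code units of `text`,
/// the way a naive truncation of a wide buffer would.
fn truncate_utf16(text: &str, n: usize) -> Result<String, FromUtf16Error> {
    let units: Vec<u16> = text.encode_utf16().collect();
    String::from_utf16(&units[..n.min(units.len())])
}

fn main() {
    // U+1F600 lies outside the Basic Multilingual Plane,
    // so it is encoded as two UTF-16 code units.
    let name = "ab\u{1F600}";
    assert_eq!(name.encode_utf16().count(), 4);

    println!("{:?}", truncate_utf16(name, 2)); // Ok("ab")
    println!("{}", truncate_utf16(name, 3).is_err()); // true (lone high surrogate)
    println!("{}", truncate_utf16(name, 4).is_ok()); // true
}
```

If truncation is acceptable, `String::from_utf16_lossy` would replace the dangling half with U+FFFD instead of returning an error.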

If you want to allocate memory more appropriately when calling this type of WinAPI function, see "What is the right way to allocate data to pass to an FFI call?". Applied to GetUserNameW, that looks like this:

extern crate advapi32;
extern crate winapi;

use std::ptr;

fn get_user_name() -> String {
    unsafe {
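        // First call: a null buffer of size 0 must fail, and the API
        // stores the required length (including the trailing NUL) in `size`.
        let mut size = 0;
        let retval = advapi32::GetUserNameW(ptr::null_mut(), &mut size);
        assert_eq!(retval, 0, "Should have failed");

        // Second call: allocate exactly that much and let the API fill it.
        let mut username = Vec::with_capacity(size as usize);
        let retval = advapi32::GetUserNameW(username.as_mut_ptr(), &mut size);
        assert_ne!(retval, 0, "Perform better error handling");
        assert!((size as usize) <= username.capacity());
        // Only sound because the API initialized `size` elements above.
        username.set_len(size as usize);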

        // Beware: This leaves the trailing NUL character in the final string,
        // you may want to remove it!
        String::from_utf16(&username).unwrap()
    }
}

fn main() {
    println!("{:?}", get_user_name()); // "IEUser\u{0}"
}
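If the trailing NUL is unwanted, one option is to drop the last code unit before decoding. A minimal sketch, assuming the buffer and size come from a successful call like the one above; `wide_to_string` is a hypothetical helper name, and the buffer in `main` is built by hand so the example runs without Windows:

```rust
/// Converts a buffer filled by a WinAPI-style call into a `String`,
/// dropping the terminating NUL that `size` counts.
/// Returns `None` if `size` exceeds the buffer or the data is not valid UTF-16.
fn wide_to_string(buf: &[u16], size: usize) -> Option<String> {
    let written = buf.get(..size)?;
    // Strip exactly one trailing NUL if present; leave the data alone otherwise.
    let without_nul = match written.split_last() {
        Some((&0, rest)) => rest,
        _ => written,
    };
    String::from_utf16(without_nul).ok()
}

fn main() {
    // Hand-built stand-in for what the API would have written.
    let mut buf: Vec<u16> = "IEUser".encode_utf16().collect();
    buf.push(0);
    let size = buf.len();

    println!("{:?}", wide_to_string(&buf, size)); // Some("IEUser")
}
```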
Shepmaster
  • "*I'm not sure if TCHAR strings can be split arbitrarily, I know UTF-8 cannot*" - `TCHAR` is the same way. If compiling for Unicode, `TCHAR` is `WCHAR` and uses UTF-16. Otherwise, it is `CHAR` and uses ANSI/MBCS. Either way, the data is just as variable-length as UTF-8 is, and can potentially split Unicode characters across multiple `TCHAR`s. – Remy Lebeau Jun 23 '17 at 00:40
  • Wide strings in Windows aren't *really* UTF-16; rather, they're *potentially* UTF-16, but they can also be arbitrary binary data that won't decode as Unicode. A `WCHAR` string *can* be split anywhere, because WinAPI more or less doesn't care if the result is valid or not, and when you're writing code against WinAPI, you're not allowed to assume the strings you get are valid. Such is life. – DK. Jun 23 '17 at 02:18
  • @DK.: Any references for those claims? They sound, quite frankly, pretty much made up. – IInspectable Jun 24 '17 at 00:09
  • To be pedantic, wide-character strings in Windows, consisting of 16-bit type wchar_t characters, are UCS-2, not UTF-16. (UTF-16 can have escapes for characters over 65535, just like UTF-8 can have escapes for characters over 255.) – Dan Korn Jun 24 '17 at 01:12
  • @IInspectable: Well, there's [the Rust issue about changing how filesystem paths are handled](https://github.com/rust-lang/rust/issues/12056) because of this. I don't have an exhaustive list of places in the Win32 API where strings may or may not be valid Unicode, because as far as I know there is no such list. – DK. Jun 24 '17 at 08:51
  • @DK.: There is no such list, because Windows uses UTF-16 throughout. The file handling APIs may be special, in that they can pass pathnames unchecked downstream the driver stack. In case of NTFS, any sequence of UTF-16 code units is legal. Those need not form valid code points. That's specific to the filesystem, though, not the Windows API. – IInspectable Jun 24 '17 at 11:28
  • @DanKorn: Not being pedantic, but Windows uses **true** UTF-16 ([since Windows 2000](https://en.wikipedia.org/wiki/Unicode_in_Microsoft_Windows)). It supported UCS-2 only, at the time Windows NT was in development. Because UTF-16 hadn't been invented yet. Don't keep spreading the myth. – IInspectable Jun 24 '17 at 11:34
  • From the documentation: "lpBuffer: A pointer to the buffer to receive the user's logon name. If this buffer is not large enough to contain the entire user name, the function fails. A buffer size of (UNLEN + 1) characters will hold the maximum length user name including the terminating null character. UNLEN is defined in Lmcons.h." In other words, consider using `UNLEN+1` to make the static initialization safe, i.e. `TCHAR buf[UNLEN+1];` – GaspardP Oct 08 '19 at 02:16