
I found a lot of information about passing ANSI strings from a Rust DLL to C#, but none of it solves the problem for UTF-8 encoded strings.

For example, "Brötchen", once read in C#, comes out as "BrÃ¶tchen".

Rust

use std::os::raw::c_char;
use std::ffi::CString;

#[no_mangle]
pub extern fn string_test() -> *mut c_char {
    let c_to_print = CString::new("Brötchen")
        .expect("CString::new failed!");
    let r = c_to_print;
    r.into_raw()  
}

C#

[DllImport(@"C:\Users\User\source\repos\testlib\target\debug\testlib.dll")]
private static extern IntPtr string_test();

public static void run()
{
    var s = string_test();
    var res = Marshal.PtrToStringAnsi(s);
    // var res = Marshal.PtrToStringUni(s);
    // var res = Marshal.PtrToStringAuto(s);
    // Both of the above result in: ????n
    Console.WriteLine(res); // prints "BrÃ¶tchen", expected "Brötchen"
}

How do I get the desired result?

I do not think this is a duplicate of How can I transform string to UTF-8 in C#? because its answers produce the same result as Marshal.PtrToStringAuto(s) and Marshal.PtrToStringUni(s).

Shepmaster
valerius21
  • 4
    The [Rust FFI Omnibus](http://jakegoulding.com/rust-ffi-omnibus/string_return/) also recommends performing the conversion on the C# side. – E_net4 Feb 15 '19 at 10:48
  • 3
    There is also [`Marshal.PtrToStringUTF8`](https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal.ptrtostringutf8?view=netcore-2.2). (I have no idea about C# though!) (Brötchen <3) – Lukas Kalbertodt Feb 15 '19 at 10:48
  • 1
    I don't get your edit. Can you explain what's wrong with the answer and why it doesn't suit your needs? – hellow Feb 15 '19 at 10:54
  • 1
    I don't think this question is a dupe of the linked one. This one has to go from a pointer to a string whereas the linked question already has a string. So some information is still missing and thus it's not a dupe IMO. – Lukas Kalbertodt Feb 15 '19 at 11:01
  • I think there is a very relevant comment by @Shepmaster in a deleted answer. The setup in Rust makes a satisfactory solution unnecessarily challenging. "Who is responsible for freeing the pointer returned by the Rust function? In the OP's example, ownership of the pointer is transferred from Rust to C# and it needs to be transferred back in order to properly deallocate it". – Tom Blodget Feb 18 '19 at 00:01
  • @TomBlodget I have a method that does free the string; I did not include it because I think it's irrelevant, since freeing the string is not the issue. – valerius21 Feb 18 '19 at 10:08

2 Answers


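The answer is to decode the pointer with Marshal.PtrToStringUTF8 on the C# side (available in .NET Core and later; .NET Framework does not have it). The Rust function can be simplified to: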

use std::ffi::CString;
use std::os::raw::c_char;

#[no_mangle]
pub extern "C" fn string_test() -> *mut c_char {
    let s = CString::new("Brötchen").expect("CString::new failed!");
    s.into_raw()
}

Then C#

[DllImport(RUSTLIB)] static extern IntPtr string_test();
//...        
var encodeText = string_test();
var text = Marshal.PtrToStringUTF8(encodeText);

Console.WriteLine("Decode String : {0}", text);
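The snippet above never hands the pointer back to Rust, so the string is leaked. As a rough sketch of how the Rust side could offer a matching release function (the name `string_test_free` is made up for illustration and is not part of the original answer), assuming the C# caller invokes it exactly once after `Marshal.PtrToStringUTF8` has copied the text:

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hands ownership of a NUL-terminated UTF-8 string to the caller.
#[no_mangle]
pub extern "C" fn string_test() -> *mut c_char {
    CString::new("Brötchen")
        .expect("CString::new failed!")
        .into_raw()
}

/// Hypothetical companion function: takes ownership back and frees the string.
/// It must only receive pointers produced by `string_test`, and only once.
#[no_mangle]
pub extern "C" fn string_test_free(s: *mut c_char) {
    if s.is_null() {
        return;
    }
    // SAFETY: the caller promises `s` came from `CString::into_raw`.
    unsafe { drop(CString::from_raw(s)) };
}
```

On the C# side this would need a second DllImport declaration taking an IntPtr, called after the managed string has been created.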
Sith2021

Thanks to @E_net4's comment recommending to read the Rust FFI Omnibus, I came to an answer that is rather complicated but works.

I figured that I have to rewrite the classes I am using. Furthermore, I am using the libc library and CString.

Cargo.toml

[package]
name = "testlib"
version = "0.1.0"
authors = ["John Doe <jdoe@doe.com>"]
edition = "2018"

[lib]
crate-type = ["cdylib"]

[dependencies]
libc = "0.2.48"

src/lib.rs

extern crate libc;

use libc::c_char;
use std::ffi::{CStr, CString};

// Takes foreign C# string as input, converts it to Rust String
fn mkstr(s: *const c_char) -> String {
    let c_str = unsafe {
        assert!(!s.is_null());

        CStr::from_ptr(s)
    };

    let r_str = c_str.to_str()
        .expect("Could not successfully convert string from foreign code!");

    String::from(r_str)
}


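// Frees a string previously returned by this library; takes the raw pointer as input.
#[no_mangle]
pub extern fn free_string(s: *mut c_char) {
    if s.is_null() { return }
    unsafe {
        // Rebuild the CString so that dropping it releases the allocation.
        drop(CString::from_raw(s));
    }
}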

// Takes the foreign C# string as input, converts it to a Rust String,
// and returns it as a raw CString that the caller must release with free_string.
#[no_mangle]
pub extern fn result(istr: *const c_char) -> *mut c_char {
    let s = mkstr(istr);
    let cex = CString::new(s)
        .expect("Failed to create CString!");

    cex.into_raw()
}
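One caveat with the functions above: `mkstr` panics when the incoming bytes are not valid UTF-8 (for example, when the C# side marshals the string with an ANSI code page), and a panic inside an `extern` function does not turn into a C# exception. A possible alternative, sketched here under the assumption that the C# caller checks for a null pointer, is to return null instead. The name `result_checked` is hypothetical; it uses `std::os::raw::c_char` only to keep the sketch free of external crates.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

/// Hypothetical variant of `result`: returns null instead of panicking
/// when the input is null or not valid UTF-8.
#[no_mangle]
pub extern "C" fn result_checked(istr: *const c_char) -> *mut c_char {
    if istr.is_null() {
        return ptr::null_mut();
    }
    // SAFETY: the caller must pass a valid, NUL-terminated string.
    let c_str = unsafe { CStr::from_ptr(istr) };
    match c_str.to_str() {
        // A &str taken from a CStr has no interior NUL bytes, so
        // CString::new should not fail here; map a failure to null anyway.
        Ok(s) => CString::new(s).map_or(ptr::null_mut(), CString::into_raw),
        Err(_) => ptr::null_mut(),
    }
}
```

A non-null return value is still owned by Rust and would be released with `free_string` as before.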

C# Class

using System;
using System.Text;
using System.Runtime.InteropServices;


namespace Testclass
{
    internal class Native
    {
        [DllImport("testlib.dll")]
        internal static extern void free_string(IntPtr str);

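        // Marshal the argument as UTF-8; the default (ANSI) marshalling
        // would encode it with the system code page on Windows.
        [DllImport("testlib.dll")]
        internal static extern StringHandle result(
            [MarshalAs(UnmanagedType.LPUTF8Str)] string inputstr);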
    }

    internal class StringHandle : SafeHandle
    {
        public StringHandle() : base(IntPtr.Zero, true) { }

        public override bool IsInvalid
        {
            // A null pointer means there is nothing to read or free.
            get { return handle == IntPtr.Zero; }
        }

        public string AsString()
        {
            int len = 0;
            while (Marshal.ReadByte(handle,len) != 0) { ++len; }
            byte[] buffer = new byte[len];
            Marshal.Copy(handle, buffer, 0, buffer.Length);
            return Encoding.UTF8.GetString(buffer);
        }

        protected override bool ReleaseHandle()
        {
            Native.free_string(handle);
            return true;
        }
    }

    internal class StringTesting: IDisposable
    {
        private StringHandle str;
        private string resString;
        public StringTesting(string word)
        {
            str = Native.result(word);
        }
        public override string ToString()
        {
            if (resString == null)
            {
                resString = str.AsString();
            }
            return resString;
        }
        public void Dispose()
        {
            str.Dispose();
        }
    }

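    class Testclass
    {
        // Sends the string to Rust and back, then returns the decoded copy.
        // (A member cannot share the name of its enclosing type, hence "Roundtrip".)
        public static string Roundtrip(string inputstr)
        {
            // Dispose the wrapper so the Rust-allocated string is freed promptly.
            using (var wrapper = new StringTesting(inputstr))
            {
                return wrapper.ToString();
            }
        }

        public static void Main()
        {
            Console.WriteLine(Roundtrip("Brötchen")); // output: Brötchen
        }
    }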
}

While this achieves the desired result, I am still unsure what causes the wrong decoding in the code provided in the question.

Shepmaster
valerius21
  • Does this answer do anything *different* from what the Omnibus shows? – Shepmaster Feb 18 '19 at 15:04
  • 1
    *I am still unsure what causes the wrong decoding* — Rust strings are UTF-8. An "ANSI" string is not UTF-8 (ANSI may be ASCII, or it might be one of a number of code pages that varies depending on system locale). In Windows world, a "Unicode" string usually means something encoded with UCS-2 or UTF-16 (similar but different), neither of which is UTF-8. Namely, they use 16-bit values. – Shepmaster Feb 18 '19 at 15:06
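
To make the comment above concrete, here is a small Rust sketch of what happens to the bytes. It assumes the "ANSI" code page in question is Windows-1252, which agrees with ISO-8859-1 (Latin-1) for the two bytes involved; the helper names `decode_latin1` and `utf16_units` are made up for illustration.

```rust
/// Decodes bytes as ISO-8859-1 (Latin-1), where every byte maps to the
/// Unicode code point of the same value. Windows-1252 agrees with
/// Latin-1 for the two bytes that matter here (0xC3 and 0xB6).
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Returns the string as UTF-16 code units, the encoding that
/// "Unicode" refers to in Windows APIs.
fn utf16_units(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}
```

In UTF-8, "ö" is the two bytes 0xC3 0xB6, so `decode_latin1("Brötchen".as_bytes())` yields "BrÃ¶tchen", which matches the output in the question. `utf16_units("Brötchen")` has eight 16-bit units with "ö" as the single unit 0x00F6, so reading the nine UTF-8 bytes as if they were UTF-16 pairs up unrelated bytes and produces garbage.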