104

I'm trying to get a C string returned by a C library and convert it to a Rust string via FFI.

mylib.c

const char* hello(){
    return "Hello World!";
}

main.rs

#![feature(link_args)]

extern crate libc;
use libc::c_char;

#[link_args = "-L . -I . -lmylib"]
extern {
    fn hello() -> *const c_char;
}

fn main() {
    //how do I get a str representation of hello() here?
}
John Kugelman
Dirk

2 Answers

160

The best way to work with C strings in Rust is to use structures from the std::ffi module, namely CStr and CString.

CStr is a dynamically sized type, so it can only be used behind a pointer, much like the regular str type. You can construct a &CStr from a *const c_char with the unsafe associated function CStr::from_ptr. It is unsafe because nothing guarantees that the raw pointer you pass is valid, that it really points to a NUL-terminated C string, or that the string lives as long as the returned reference.

You can get a &str from a &CStr using its to_str() method. It returns a Result, because the conversion fails if the bytes are not valid UTF-8.

Here is an example:

extern crate libc;

use libc::c_char;
use std::ffi::CStr;

extern {
    fn hello() -> *const c_char;
}

fn main() {
    let c_buf: *const c_char = unsafe { hello() };
    let c_str: &CStr = unsafe { CStr::from_ptr(c_buf) };
    let str_slice: &str = c_str.to_str().unwrap();
    let str_buf: String = str_slice.to_owned();  // if necessary
}
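
The example above unwraps the result of to_str(), which panics if the C side hands back bytes that are not valid UTF-8. Here is a minimal sketch of two gentler options. The helpers c_to_str and c_to_string_lossy are hypothetical names, not part of any library, and static byte literals stand in for pointers returned by C:

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper: borrow a C string as `&str`.
/// Returns `None` for a null pointer or for bytes that are not valid UTF-8.
///
/// Safety: `ptr` must be null or point to a NUL-terminated string that
/// stays alive and unmodified for the lifetime `'a`.
unsafe fn c_to_str<'a>(ptr: *const c_char) -> Option<&'a str> {
    if ptr.is_null() {
        return None;
    }
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_str().ok()
}

/// Hypothetical helper: copy a C string into an owned `String`,
/// replacing invalid UTF-8 sequences with U+FFFD.
///
/// Safety: same requirements as `c_to_str`, but only for the duration of the call.
unsafe fn c_to_string_lossy(ptr: *const c_char) -> String {
    if ptr.is_null() {
        return String::new();
    }
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_string_lossy().into_owned()
}

fn main() {
    // Stand-ins for pointers coming from C: static, NUL-terminated byte literals.
    let valid = b"Hello World!\0".as_ptr() as *const c_char;
    let invalid = b"caf\xE9\0".as_ptr() as *const c_char; // 0xE9 alone is not valid UTF-8

    unsafe {
        println!("{:?}", c_to_str(valid));
        println!("{:?}", c_to_str(invalid));
        println!("{}", c_to_string_lossy(invalid));
    }
}
```

The lossy variant always allocates here because it returns a String; if you only need to read the text, to_string_lossy() by itself returns a Cow that borrows when the input is already valid UTF-8.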

You need to take into account the lifetime of your *const c_char pointers and who owns them. Depending on the C API, you may need to call a special deallocation function on the string. You also need to arrange conversions carefully so that the slices never outlive the pointer. The fact that CStr::from_ptr returns a &CStr with an arbitrary lifetime helps here (though it is dangerous by itself); for example, you can encapsulate your C string in a structure and provide a Deref conversion so you can use your struct as if it were a string slice:

extern crate libc;

use libc::c_char;
use std::ops::Deref;
use std::ffi::CStr;

extern "C" {
    fn hello() -> *const c_char;
    fn goodbye(s: *const c_char);
}

struct Greeting {
    message: *const c_char,
}

impl Drop for Greeting {
    fn drop(&mut self) {
        unsafe {
            goodbye(self.message);
        }
    }
}

impl Greeting {
    fn new() -> Greeting {
        Greeting { message: unsafe { hello() } }
    }
}

impl Deref for Greeting {
    type Target = str;

    fn deref<'a>(&'a self) -> &'a str {
        let c_str = unsafe { CStr::from_ptr(self.message) };
        c_str.to_str().unwrap()
    }
}

There is another type in this module, CString. It relates to CStr the way String relates to str: CString is the owned version of CStr. It "holds" the handle to the allocation of the byte data, so dropping a CString frees the memory it owns (essentially, CString wraps Vec<u8>, and it's the latter that will be dropped). Consequently, it is useful when you want to expose data allocated in Rust as a C string.

Unfortunately, C strings always end with a zero byte and can't contain one inside them, while Rust's &[u8] and Vec<u8> are the exact opposite: they do not end with a zero byte and can contain any number of them inside. This means that going from Vec<u8> to CString is neither error-free nor allocation-free. The CString constructor checks for zeros inside the data you provide, returning an error if it finds any, and appends a zero byte to the end of the byte vector, which may require a reallocation.

Like String, which implements Deref<Target = str>, CString implements Deref<Target = CStr>, so you can call methods defined on CStr directly on CString. This is important because the as_ptr() method that returns the *const c_char necessary for C interoperation is defined on CStr. You can call this method directly on CString values, which is convenient.

CString can be created from anything that can be converted to Vec<u8>. String, &str, Vec<u8> and &[u8] are all valid arguments for the constructor function, CString::new(). Naturally, if you pass a byte slice or a string slice, a new allocation is made, while a Vec<u8> or String is consumed.

extern crate libc;

use libc::c_char;
use std::ffi::CString;

fn main() {
    let c_str_1 = CString::new("hello").unwrap(); // from a &str, creates a new allocation
    let c_str_2 = CString::new(b"world" as &[u8]).unwrap(); // from a &[u8], creates a new allocation
    let data: Vec<u8> = b"12345678".to_vec(); // from a Vec<u8>, consumes it
    let c_str_3 = CString::new(data).unwrap();

    // and now you can obtain a pointer to a valid zero-terminated string
    // make sure you don't use it after c_str_2 is dropped
    let c_ptr: *const c_char = c_str_2.as_ptr();

    // the following will print an error message because the source data
    // contains zero bytes
    let data: Vec<u8> = vec![1, 2, 3, 0, 4, 5, 0, 6];
    match CString::new(data) {
        Ok(c_str_4) => println!("Got a C string: {:p}", c_str_4.as_ptr()),
        Err(e) => println!("Error getting a C string: {}", e),
    }  
}
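
To illustrate the Deref relationship described above, here is a small sketch that calls CStr methods on a CString and reports where an interior zero byte was found. The helper c_len_with_nul is a hypothetical name introduced only for this illustration:

```rust
use std::ffi::{CStr, CString};

/// Hypothetical helper: on success, the byte length including the trailing NUL;
/// on failure, the position of the first interior zero byte.
fn c_len_with_nul(input: &str) -> Result<usize, usize> {
    match CString::new(input) {
        Ok(owned) => {
            // Deref coercion: a &CString can be used where a &CStr is expected.
            let borrowed: &CStr = &owned;
            Ok(borrowed.to_bytes_with_nul().len())
        }
        // NulError remembers where the offending zero byte was.
        Err(e) => Err(e.nul_position()),
    }
}

fn main() {
    let owned = CString::new("hello").unwrap();

    // to_bytes and to_bytes_with_nul are defined on CStr and reached through Deref.
    assert_eq!(owned.to_bytes(), b"hello");
    assert_eq!(owned.to_bytes_with_nul(), b"hello\0");

    println!("{:?}", c_len_with_nul("hello"));
    println!("{:?}", c_len_with_nul("he\0llo"));
}
```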

If you need to transfer ownership of the CString to C code, you can call CString::into_raw. The C code must not free that pointer itself, because the Rust allocator is unlikely to be the same as the allocator used by malloc and free. Instead, the pointer has to be handed back to Rust eventually: call CString::from_raw on it and then allow the resulting string to be dropped normally.
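
A minimal sketch of that round trip, assuming a stand-in function c_side_len written in Rust to play the role of the C code. All three function names are hypothetical; in a real program the pointer would cross an actual FFI boundary and come back through some release callback:

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Stand-in for a C function that only borrows the string (hypothetical).
///
/// Safety: `s` must point to a valid NUL-terminated string.
unsafe extern "C" fn c_side_len(s: *const c_char) -> usize {
    let c_str = unsafe { CStr::from_ptr(s) };
    c_str.to_bytes().len()
}

/// Hand ownership of a Rust-allocated string to the C side.
/// The returned pointer must come back through `take_back`, never through free().
fn give_to_c(s: &str) -> *mut c_char {
    CString::new(s).expect("no interior NUL bytes").into_raw()
}

/// Reclaim ownership so the memory is released by the Rust allocator.
///
/// Safety: `ptr` must come from `give_to_c` and must be passed here exactly once.
unsafe fn take_back(ptr: *mut c_char) -> String {
    let owned = unsafe { CString::from_raw(ptr) };
    owned.into_string().expect("valid UTF-8")
}

fn main() {
    let raw = give_to_c("owned by C for a while");
    let len = unsafe { c_side_len(raw) };
    let back = unsafe { take_back(raw) };
    // `raw` is dangling from here on and must not be used again.
    println!("{} bytes: {}", len, back);
}
```

Converting back to a String is only for demonstration; simply dropping the value returned by CString::from_raw is enough to free the memory.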

unixia
Vladimir Matveev
  • Great answer, this helped me big time. Does the unsafety in lifetime of the cstr still exist when interfacing with a GC lang like c#? – scape Feb 17 '17 at 20:24
  • @scape yes, of course, it does. I'd say it is even more important there, because garbage collection may run at any time, especially if it is concurrent. If you do not take care to keep the string on the GC side rooted somewhere, you may suddenly access a freed piece of memory on the Rust side. – Vladimir Matveev Feb 18 '17 at 11:00
7

In addition to what @vladimir-matveev has said, you can also convert between them without the aid of CStr or CString:

#![feature(link_args)]

extern crate libc;
use libc::{c_char, puts, strlen};
use std::{slice, str};

#[link_args = "-L . -I . -lmylib"]
extern "C" {
    fn hello() -> *const c_char;
}

fn main() {
    //converting a C string into a Rust string:
    let s = unsafe {
        let c_s = hello();
        str::from_utf8_unchecked(slice::from_raw_parts(c_s as *const u8, strlen(c_s)+1))
    };
    println!("s == {:?}", s);
    //and back:
    unsafe {
        puts(s.as_ptr() as *const c_char);
    }
}

Just make sure that when converting from a &str to a C string, your &str ends with '\0'. Notice that in the code above I use strlen(c_s)+1 instead of strlen(c_s), so s is "Hello World!\0", not just "Hello World!".
(Of course, in this particular case it works even with just strlen(c_s), because the terminating zero byte still sits in the C buffer right after the slice. But with a fresh &str you couldn't guarantee that the resulting C string would terminate where expected.)
Here's the result of running the code:

s == "Hello World!\u{0}"
Hello World!
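
For comparison, here is a sketch of the same two directions with the checks left in. It needs no libc, validates UTF-8 instead of assuming it, and refuses to treat a &str as a C string unless it really ends with a single '\0'. The helper names borrow_c_str and as_c_str are hypothetical, and a static byte literal stands in for the pointer returned by hello():

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::str::Utf8Error;

/// Hypothetical helper, C to Rust: borrow the bytes up to (not including)
/// the terminator and validate them as UTF-8.
///
/// Safety: `ptr` must point to a NUL-terminated string that outlives `'a`.
unsafe fn borrow_c_str<'a>(ptr: *const c_char) -> Result<&'a str, Utf8Error> {
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_str()
}

/// Hypothetical helper, Rust to C without allocating: succeeds only if `s`
/// ends with exactly one zero byte and contains no other zero bytes.
fn as_c_str(s: &str) -> Option<&CStr> {
    CStr::from_bytes_with_nul(s.as_bytes()).ok()
}

fn main() {
    // Stand-in for the pointer returned by the C function.
    let from_c = b"Hello World!\0".as_ptr() as *const c_char;

    let s = unsafe { borrow_c_str(from_c) }.unwrap();
    println!("s == {:?}", s);

    // `s` has no terminator, so it is rejected as a C string.
    println!("{}", as_c_str(s).is_some());
    // A string slice that carries its own terminator is accepted.
    println!("{}", as_c_str("Hello World!\0").is_some());
}
```

When as_c_str returns Some, calling as_ptr() on the result gives a pointer that is safe to pass to functions such as puts for as long as the original slice is alive.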
Des Nerger
  • 1
    You can convert *from* without `CStr`, but avoiding it has no reason. Your converting back is *incorrect* as a Rust `&str` is not NUL-terminated, thus isn't a valid C string. – Shepmaster Jan 19 '18 at 14:31
  • @Shepmaster, Yes, a Rust &str is generally not NUL-terminated but since this one was made from a C string, it works fine when you do `s.as_ptr()`. To make it more clear I've now corrected `strlen(c_s)` to `strlen(c_s)+1`. – Des Nerger Jan 19 '18 at 14:58
  • 1
    So now you've replicated functionality from the standard library? Please [edit] your question to explain to future readers why they should pick this solution as opposed to the existing answer. – Shepmaster Jan 19 '18 at 15:00
  • 4
    One reason to do so is that you're developing in a no_std environment. – Myk Melez Aug 02 '19 at 16:59
  • 1
    If you need CStr in a no_std environment, the https://github.com/Amanieu/cstr_core crate is a good choice. The only shortcoming is that it depends on cty which has an open merge request to fix AVR support. – Mutant Bob Feb 07 '22 at 15:48
  • This is the simplest solution for converting a C string into a Rust str. – Sunding Wei Jul 29 '23 at 14:33