I want to return a vector in a pub extern "C" fn. Since a vector has an arbitrary length, I guess I need to return a struct with

  1. the pointer to the vector, and

  2. the number of elements in the vector

My current code is:

extern crate libc;
use self::libc::{size_t, int32_t, int64_t};

// struct to represent an array and its size
#[repr(C)]
pub struct array_and_size {
    values: int64_t, // this is probably not how you denote a pointer, right?
    size: int32_t,
}

// The vector I want to return the address of is already in a Boxed struct, 
// which I have a pointer to, so I guess the vector is on the heap already. 
// Dunno if this changes/simplifies anything?
#[no_mangle]
pub extern "C" fn rle_show_values(ptr: *mut Rle) -> array_and_size {
    let rle = unsafe {
        assert!(!ptr.is_null());
        &mut *ptr
    };

    // this is the Vec<i32> I want to return 
    // the address and length of
    let values = rle.values; 
    let length = values.len();

    array_and_size {
       values: Box::into_raw(Box::new(values)),
       size: length as i32,
       }
}

#[derive(Debug, PartialEq)]
pub struct Rle {
    pub values: Vec<i32>,
}

The error I get is

$ cargo test
   Compiling ranges v0.1.0 (file:///Users/users/havpryd/code/rust-ranges)
error[E0308]: mismatched types
  --> src/rle.rs:52:17
   |
52 |         values: Box::into_raw(Box::new(values)),
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected i64, found *-ptr
   |
   = note: expected type `i64`
   = note:    found type `*mut std::vec::Vec<i32>`

error: aborting due to previous error

error: Could not compile `ranges`.

To learn more, run the command again with --verbose.
-> exit code: 101

I posted the whole thing because I could not find an example of returning arrays/vectors in the eminently useful Rust FFI Omnibus.

Is this the best way to return a vector of unknown size from Rust? How do I fix my remaining compile error? Thanks!

Bonus q: if the fact that my vector is in a struct changes the answer, perhaps you could also show how to do this if the vector was not in a Boxed struct already (which I think means the vector it owns is on the heap too)? I guess many people looking up this q will not have their vectors boxed already.

Bonus q2: I only return the vector to view its values (in Python), but I do not want to let the calling code change the vector. But I guess there is no way to make the memory read-only and ensure the calling code does not fudge with the vector? const is just for showing intent, right?

Ps: I do not know C or Rust well, so my attempt might be completely WTF.

Shepmaster
The Unfun Cat
  • 2
    Vec to Array conversion is [covered here](http://stackoverflow.com/a/37682288/147192), so you might want to focus on the first question (vector of unknown size instead). However this question does not match the title, so it's confusing. – Matthieu M. Oct 20 '16 at 14:01
  • 1
    I'm not an expert on FFI, but if you can go with `values: Box<[i32]>` in your `array_and_size`, you could just convert the relevant `Vec`tors to it using `into_boxed_slice()`. – ljedrz Oct 20 '16 at 14:03
  • The compiler did not complain when I set `values: Box<[i32]>` but I guess Rust must convert this into a valid C struct somehow, because C does not use Boxes I guess. – The Unfun Cat Oct 20 '16 at 14:06
  • 1
    C doesn't **have** fixed length arrays, that's why it's not covered in the Omnibus. – Shepmaster Oct 20 '16 at 14:06
  • 3
    @Shepmaster: Note that the OP is using a fixed-length array *within a `struct`*, which is perfectly valid C. What C does not allow is receiving arrays as parameters or returning them from functions. – Matthieu M. Oct 20 '16 at 14:09
  • 2
    @ljedrz: C does not have `Box`, but from `Box` you can further convert to a raw pointer. Of course freeing that memory back afterward is another story altogether. – Matthieu M. Oct 20 '16 at 14:10
  • @MatthieuM. hmm, interesting point! I don't know if I'd trust that it correctly crosses FFI, as a Rust array would have the size baked-in (right?) and the C declaration would be closer to "allocate a sequential blob of memory" without the size... experimentation is needed! – Shepmaster Oct 20 '16 at 14:14
  • 1
    @Shepmaster: `[i32; 3]` has the size baked in as part of its type (much like in C), so there is no overhead. You are thinking of `[i32]` which is slightly different (though still named array, I think). – Matthieu M. Oct 20 '16 at 14:15
  • @TheUnfunCat: You don't have to delete the question, but maybe some editing to focus on what you really need (the root problem) would let people more leeway for the best answer rather than artificially constraining it. – Matthieu M. Oct 20 '16 at 14:18
  • @MatthieuM. I was mostly thinking of how the slice can be built for free from an array, thinking there must be a size in there. – Shepmaster Oct 20 '16 at 14:32
  • I think I have simplified my question so that it is much easier to answer and more useful for others to look up. Thanks for the feedback! Edit: Made further changes. – The Unfun Cat Oct 21 '16 at 06:30

2 Answers

pub struct array_and_size {
    values: int64_t, // this is probably not how you denote a pointer, right?
    size: int32_t,
}

First of all, you're correct. The type you want for values is *mut int32_t.

In general, and note that there are a variety of C coding styles, C often doesn't "like" returning ad-hoc sized array structs like this. The more common C API would be

int32_t rle_values_size(RLE *rle);
int32_t *rle_values(RLE *rle);

(Note: many internal programs do in fact use sized array structs, but this is by far the most common for user-facing libraries because it's automatically compatible with the most basic way of representing arrays in C).

In Rust, this would translate to:

extern "C" fn rle_values_size(rle: *mut RLE) -> int32_t
extern "C" fn rle_values(rle: *mut RLE) -> *mut int32_t

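The size function is straightforward. To return the array, simply do

extern "C" fn rle_values(rle: *mut RLE) -> *mut int32_t {
    let rle = unsafe { &mut *rle };
    // as_mut_ptr() is also fine on an empty Vec, where indexing values[0] would panic
    rle.values.as_mut_ptr()
}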

This gives a raw pointer to the first element of the Vec's underlying buffer, which is all C-style arrays really are.
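As a rough, self-contained sketch of this two-function style: the `Rle` struct below is a stand-in for the one in the question, and the use of `*const` and `usize` (instead of `*mut` and `int32_t`) is my own assumption, chosen to signal read-only intent and to avoid the `libc` dependency.

```rust
// Stand-in for the struct from the question.
pub struct Rle {
    pub values: Vec<i32>,
}

// To export these from a cdylib, each function also needs #[no_mangle]
// (spelled #[unsafe(no_mangle)] on the 2024 edition).

/// Number of elements in `values`, or 0 for a null pointer.
pub extern "C" fn rle_values_size(rle: *const Rle) -> usize {
    if rle.is_null() {
        return 0;
    }
    let rle = unsafe { &*rle };
    rle.values.len()
}

/// Read-only pointer to the first element. It stays valid only while the
/// Rle is alive and its vector is not resized.
pub extern "C" fn rle_values(rle: *const Rle) -> *const i32 {
    if rle.is_null() {
        return std::ptr::null();
    }
    let rle = unsafe { &*rle };
    rle.values.as_ptr()
}
```

The caller pairs the two results itself: it reads `rle_values_size(rle)` elements starting at `rle_values(rle)`. As in C, the `const` only documents intent; nothing stops the caller from casting it away.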

If, instead of giving C a reference to your data, you want to give C the data itself, the most common option is to let the user pass in a buffer that you copy the data into:

extern "C" fn rle_values_buf(rle: *mut RLE, buf: *mut int32_t, len: int32_t) {
    use std::{cmp, ptr};
    unsafe {
        let values = &(*rle).values;
        // Make sure we don't overrun the caller's buffer or our own vector;
        // a negative len copies nothing
        let count = cmp::min(cmp::max(len, 0) as usize, values.len());
        ptr::copy_nonoverlapping(values.as_ptr(), buf, count);
    }
}

Which, from C, looks like

void rle_values_buf(RLE *rle, int32_t *buf, int32_t len);

This (shallowly) copies your data into the presumably C-allocated buffer, which the C user is then responsible for destroying. It also prevents multiple mutable copies of your array from floating around at the same time (assuming you don't implement the version that returns a pointer).
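One thing the caller usually wants to know is how many elements were actually written. A minimal variant that reports this could look like the following; the name `rle_values_copy`, the `usize` length, and the null checks are my own assumptions, and `Rle` again stands in for the struct from the question.

```rust
// Stand-in for the struct from the question.
pub struct Rle {
    pub values: Vec<i32>,
}

/// Copies at most `len` elements into `buf` and returns how many were copied.
/// (Add #[no_mangle] to export it from a cdylib.)
pub extern "C" fn rle_values_copy(rle: *const Rle, buf: *mut i32, len: usize) -> usize {
    if rle.is_null() || buf.is_null() {
        return 0;
    }
    let values = unsafe { &(*rle).values };
    // Never copy more than either side can hold.
    let count = values.len().min(len);
    unsafe { std::ptr::copy_nonoverlapping(values.as_ptr(), buf, count) };
    count
}
```

Elements of `buf` past the returned count are left untouched, so the caller should only read the first `count` values.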

Note that you could sort of "move" the array into C as well, but it's not particularly recommended: it involves the use of mem::forget, expects the C user to explicitly call a destruction function, and requires both you and the user to obey some discipline that may be difficult to structure the program around.

If you want to receive an array from C, you essentially just ask for both a *mut i32 and an i32 corresponding to the buffer start and length. You can assemble these into a slice using the from_raw_parts function, and then use the to_vec function to create an owned Vec containing the values, allocated on the Rust side. If you don't plan on needing to own the values, you can simply pass around the slice you produced via from_raw_parts.
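A small sketch of that receiving direction, assuming the C side hands over an initialized buffer. The functions `rle_from_values` and `rle_free` are hypothetical names of mine, and I use `*const i32` with a `usize` length rather than the `*mut i32` / `i32` pair mentioned above.

```rust
// Stand-in for the struct from the question.
pub struct Rle {
    pub values: Vec<i32>,
}

/// Builds a heap-allocated Rle from a C array by copying its contents.
/// (Add #[no_mangle] to export it from a cdylib.)
pub extern "C" fn rle_from_values(data: *const i32, len: usize) -> *mut Rle {
    let values = if data.is_null() || len == 0 {
        Vec::new()
    } else {
        // Borrow the C memory as a slice, then copy it into an owned Vec.
        let slice = unsafe { std::slice::from_raw_parts(data, len) };
        slice.to_vec()
    };
    Box::into_raw(Box::new(Rle { values }))
}

/// Releases an Rle previously returned by rle_from_values.
pub extern "C" fn rle_free(rle: *mut Rle) {
    if !rle.is_null() {
        unsafe { drop(Box::from_raw(rle)) };
    }
}
```

Because `to_vec` copies, the C caller stays free to release or reuse its own buffer as soon as `rle_from_values` returns.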

However, it is imperative that all values be initialized from either side, typically to zero. Otherwise you invoke legitimately undefined behavior which often results in segmentation faults (which tend to frustratingly disappear when inspected with GDB).

Linear

There are multiple ways to pass an array to C.


First of all, while C has the concept of fixed-size arrays (int a[5] has type int[5] and sizeof(a) will return 5 * sizeof(int)), it is not possible to directly pass an array to a function or return an array from it.

On the other hand, it is possible to wrap a fixed size array in a struct and return that struct.

Furthermore, when using an array, all elements must be initialized, otherwise a memcpy technically has undefined behavior (as it is reading from undefined values) and valgrind will definitely report the issue.


Using a dynamic array

A dynamic array is an array whose length is unknown at compile-time.

One may choose to return a dynamic array if no reasonable upper bound is known, or if this bound is deemed too large for passing by value.

There are two ways to handle this situation:

  • ask C to pass a suitably sized buffer
  • allocate a buffer and return it to C

They differ in who allocates the memory: the former is simpler, but may require either a way to hint at a suitable size or the ability to "rewind" if the size proves unsuitable.

Ask C to pass a suitably sized buffer

// file.h
int rust_func(int32_t* buffer, size_t buffer_length);

// file.rs
#[no_mangle]
pub extern fn rust_func(buffer: *mut libc::int32_t, buffer_length: libc::size_t) -> libc::c_int {
    // your code here
}

Note the existence of std::slice::from_raw_parts_mut to transform this pointer + length into a mutable slice (do initialize it with 0s before making it a slice or ask the client to).
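Here is one possible shape for the body, as a sketch only. It assumes an snprintf-like convention that is my own choice, not something the signature above dictates: the function always returns the number of elements it needs, and writes only when the buffer is large enough, which gives the caller a way to ask for a size hint. `DATA` and `rust_fill` are made-up names, and the caller is assumed to have initialized the buffer.

```rust
// Hypothetical data the Rust side wants to hand over.
const DATA: [i32; 5] = [1, 1, 2, 3, 5];

/// Returns the number of elements required. The buffer is written only if it
/// is non-null and at least that long. (Add #[no_mangle] to export it.)
pub extern "C" fn rust_fill(buffer: *mut i32, buffer_length: usize) -> usize {
    let needed = DATA.len();
    if buffer.is_null() || buffer_length < needed {
        // Too small: report the size so the caller can retry.
        return needed;
    }
    // Assumes the caller initialized all buffer_length elements.
    let out = unsafe { std::slice::from_raw_parts_mut(buffer, buffer_length) };
    out[..needed].copy_from_slice(&DATA);
    needed
}
```

A C caller would typically call it once with a null buffer to learn the size, allocate and zero that many elements, and then call it again.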

Allocate a buffer and return it to C

// file.h
struct DynArray {
    int32_t* array;
    size_t length;
};

struct DynArray rust_alloc();
void rust_free(struct DynArray);

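// file.rs
#[repr(C)]
pub struct DynArray {
    array: *mut libc::int32_t,
    length: libc::size_t,
}

#[no_mangle]
pub extern fn rust_alloc() -> DynArray {
    let v: Vec<i32> = vec!(...);

    // A boxed slice has no spare capacity, so pointer + length
    // is enough to rebuild and free it later
    let mut boxed = v.into_boxed_slice();

    let result = DynArray {
        array: boxed.as_mut_ptr(),
        length: boxed.len() as _,
    };

    // Ownership now belongs to the caller; do not free it here
    std::mem::forget(boxed);

    result
}

#[no_mangle]
pub extern fn rust_free(array: DynArray) {
    if !array.array.is_null() {
        // Rebuild the boxed slice so the whole buffer is freed, not just one i32
        unsafe { drop(Box::from_raw(std::slice::from_raw_parts_mut(array.array, array.length))); }
    }
}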
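Applied to the question's case, where the vector lives inside an `Rle` behind a pointer, a sketch might look like this. The names `rle_values_alloc` and `dyn_array_free` are hypothetical, I use `i32`/`usize` directly instead of the `libc` aliases, and I assume C should get its own copy so that the `Rle` keeps its vector.

```rust
// Stand-in for the struct from the question.
pub struct Rle {
    pub values: Vec<i32>,
}

#[repr(C)]
pub struct DynArray {
    pub array: *mut i32,
    pub length: usize,
}

/// Hands C an owned copy of the values. (Add #[no_mangle] to export it.)
pub extern "C" fn rle_values_alloc(rle: *const Rle) -> DynArray {
    if rle.is_null() {
        return DynArray { array: std::ptr::null_mut(), length: 0 };
    }
    let values = unsafe { &(*rle).values };
    // Clone so the Rle keeps its data; a boxed slice has length == capacity.
    let boxed: Box<[i32]> = values.clone().into_boxed_slice();
    let length = boxed.len();
    // Box::into_raw is an associated function, not a method.
    let array = Box::into_raw(boxed) as *mut i32;
    DynArray { array, length }
}

/// Frees a DynArray produced by rle_values_alloc.
pub extern "C" fn dyn_array_free(a: DynArray) {
    if !a.array.is_null() {
        let slice = std::ptr::slice_from_raw_parts_mut(a.array, a.length);
        unsafe { drop(Box::from_raw(slice)) };
    }
}
```

The caller must return the struct unmodified to `dyn_array_free`: the length is needed to rebuild the allocation, and freeing the pointer with C's `free` would be incorrect.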

Using a fixed-size array

Similarly, a struct containing a fixed size array can be used. Note that both in Rust and C all elements should be initialized, even if unused; zeroing them works well.

Similarly to the dynamic case, it can be either passed by mutable pointer or returned by value.

// file.h
struct FixedArray {
    int32_t array[32];
};

// file.rs
#[repr(C)]
struct FixedArray {
    array: [libc::int32_t; 32],
}
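A sketch of returning such a struct by value. The extra `used` field and the function `rust_fixed` are my own additions (the struct above only has the array); some count like this is usually needed so the caller knows how many of the 32 slots are meaningful.

```rust
#[repr(C)]
pub struct FixedArray {
    pub array: [i32; 32],
    // Assumption: number of meaningful elements, not part of the struct above.
    pub used: usize,
}

/// Returns a zero-initialized array with the first few slots filled in.
/// (Add #[no_mangle] to export it from a cdylib.)
pub extern "C" fn rust_fixed() -> FixedArray {
    let source = [3, 1, 4]; // hypothetical data
    // Every element is initialized to 0, including the unused ones.
    let mut result = FixedArray { array: [0; 32], used: source.len() };
    result.array[..source.len()].copy_from_slice(&source);
    result
}
```

The matching C declaration would need a `size_t used;` member after the array so that both sides agree on the layout.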
Matthieu M.
  • I updated your rust_alloc to take a pointer to my `Rle` struct, from which I fetch the `Rle.values` vector. Problem is, I get an error in the line `array: values.into_boxed_slice().into_raw() as *mut _,`. I think this has to do with the fact that my Rle is boxed, so that the vector it contains also is. Anyways, this is the error: `error: no method named `into_raw` found for type `Box<[i32]>` in the current scope`. This might be a completely different problem: should I ask a new q about this, or is it trivial to fix? – The Unfun Cat Oct 21 '16 at 12:49
  • @TheUnfunCat: Hitting the same error... I did not expect it given [the signature](https://doc.rust-lang.org/src/alloc/up/src/liballoc/boxed.rs.html#266) and the fact that `T: ?Sized` in the bounds. I must be missing something... – Matthieu M. Oct 21 '16 at 13:04
  • Okay, so it has nothing to do with my vector being in a Boxed struct then, since you are just using a regular vector. I'm sure it is just a minor tweak that is needed :) – The Unfun Cat Oct 21 '16 at 13:06
  • Perhaps add a warning to the top that this isn't working just yet? Or perhaps ask a q about why? No shame in asking... – The Unfun Cat Oct 21 '16 at 13:28
  • @TheUnfunCat: I am going to ask, I reduced the problem as much as I could and it still is persisting... and I have no idea why. In the mean time I switched to teasing the vector apart, which compiles (at least), so you don't have to wait for it ;) – Matthieu M. Oct 21 '16 at 13:41
  • @TheUnfunCat: Found it! One must call Box::into_raw(...) because into_raw was NOT defined with self as a parameter. (Why? no idea) – Matthieu M. Oct 21 '16 at 13:53