55

I want to send my struct via a TcpStream. I can send a String or u8 values, but I cannot send an arbitrary struct. For example:

struct MyStruct {
    id: u8,
    data: [u8; 1024],
}

let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = convert_struct(my_struct); // how??
tcp_stream.write(bytes);

After receiving the data, I want to convert &[u8] back to MyStruct. How can I convert between these two representations?

I know Rust has a JSON module for serializing data, but I don't want to use JSON because I want to send the data as fast and as small as possible, so I want no or very little overhead.

Shepmaster
agatana

3 Answers

60

A correctly sized struct can be viewed as bytes without copying, using only the standard library and a generic function.

The example below uses a reusable function called any_as_u8_slice instead of convert_struct, since it is a utility that wraps the cast and the slice creation.

Note that although the question asks about converting, this example creates a read-only slice, which has the advantage of not needing to copy the memory.

Here's a working example based on the question:

unsafe fn any_as_u8_slice<T: Sized>(p: &T) -> &[u8] {
    ::core::slice::from_raw_parts(
        (p as *const T) as *const u8,
        ::core::mem::size_of::<T>(),
    )
}

fn main() {
    struct MyStruct {
        id: u8,
        data: [u8; 1024],
    }
    let my_struct = MyStruct { id: 0, data: [1; 1024] };
    let bytes: &[u8] = unsafe { any_as_u8_slice(&my_struct) };
    // tcp_stream.write(bytes);
    println!("{:?}", bytes);
}
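The question also asks about the reverse direction, from &[u8] back to MyStruct. The following is a minimal sketch of a round trip, not part of the original answer. It assumes the struct is marked #[repr(C)] and contains only u8 fields, so it has no padding and every bit pattern is valid. The helper names my_struct_as_bytes and my_struct_from_bytes are hypothetical, and a Vec<u8> stands in for the TcpStream.

```rust
use std::io::{Read, Write};
use std::mem::size_of;

#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct MyStruct {
    id: u8,
    data: [u8; 1024],
}

/// Hypothetical helper: borrow the struct's memory as bytes.
fn my_struct_as_bytes(s: &MyStruct) -> &[u8] {
    // SAFETY: MyStruct is repr(C) and made only of u8 fields, so it has no
    // padding; the slice covers exactly the struct and borrows from it.
    unsafe {
        std::slice::from_raw_parts(s as *const MyStruct as *const u8, size_of::<MyStruct>())
    }
}

/// Hypothetical helper: rebuild a MyStruct by copying out of a byte slice.
fn my_struct_from_bytes(bytes: &[u8]) -> Option<MyStruct> {
    if bytes.len() != size_of::<MyStruct>() {
        return None;
    }
    // SAFETY: the length was checked, every bit pattern is a valid MyStruct,
    // and read_unaligned does not require the source to be aligned.
    Some(unsafe { std::ptr::read_unaligned(bytes.as_ptr() as *const MyStruct) })
}

fn main() -> std::io::Result<()> {
    let sent = MyStruct { id: 7, data: [1; 1024] };

    // A Vec<u8> stands in for the TcpStream; both implement Write.
    let mut wire: Vec<u8> = Vec::new();
    wire.write_all(my_struct_as_bytes(&sent))?;

    // Receiving side: read exactly one struct's worth of bytes.
    let mut reader: &[u8] = &wire;
    let mut buf = [0u8; size_of::<MyStruct>()];
    reader.read_exact(&mut buf)?;

    let received = my_struct_from_bytes(&buf).expect("buffer has the struct's size");
    assert_eq!(sent, received);
    Ok(())
}
```

Using write_all and read_exact rather than write and read matters on a real socket, because a single write or read call may transfer only part of the buffer. This approach would not be sound for structs with fields such as bool, references, or enums, where some bit patterns are invalid.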

Note 1) Even though third-party crates might be better in some cases, this is such a primitive operation that it's useful to know how to do it in Rust.

Note 2) At the time of writing (Rust 1.15), there is no support for const functions. Once there is, it will be possible to cast into a fixed-size array instead of a slice.

Note 3) The any_as_u8_slice function is marked unsafe because any padding bytes in the struct may be uninitialized memory (giving undefined behavior). If there were a way to ensure that the input arguments were only structs marked #[repr(packed)], then it could be safe.

Otherwise, the function is fairly safe: the output is read-only, has a fixed number of bytes, and has its lifetime bound to the input, which prevents buffer overruns.
A version that returned a &mut [u8] would be quite dangerous, since modifying the bytes could easily create inconsistent or corrupt data.
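To make the padding concern from Note 3 concrete, here is a small sketch that compares the size of the same two fields with and without packing. The struct names Padded and Packed are made up for illustration. The padded size depends on the target's alignment rules, so only the packed size is stated as a fixed number.

```rust
use std::mem::size_of;

// With repr(C), the compiler inserts padding after `tag` so that `value`
// sits at an offset that matches its alignment.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

// With repr(C, packed), fields are laid out back to back with no padding.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

fn main() {
    // Typically 8 on targets where u32 has 4-byte alignment.
    println!("Padded: {} bytes", size_of::<Padded>());
    // 1 + 4 = 5 bytes, with nothing in between.
    println!("Packed: {} bytes", size_of::<Packed>());
}
```

Viewing a Padded value as a byte slice would expose the padding bytes between tag and value, which is the case Note 3 warns about. A Packed value has no such bytes, at the cost of possibly unaligned field access.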

antonok
ideasman42
  • What's the difference between this solution and one which uses transmutes? Is there a reason to prefer one over the other? https://doc.rust-lang.org/nomicon/transmutes.html – Jason Dreyzehner Jun 21 '18 at 02:27
  • As stated in the answer: *"it prevents buffer over-run since the output is read-only, fixed number of bytes, and its lifetime is bound to the input."* – ideasman42 Jun 21 '18 at 09:44
  • Is there a way to go in the opposite direction, i.e. convert the bytes back to the struct? – Lev Dec 06 '18 at 07:10
  • @Lev `let s: MyStruct = unsafe { std::mem::transmute(*bytes) };` – d9ngle Sep 09 '21 at 13:48
  • Amazing answer, thanks! Only one comment on Note 3: the function is unsafe because you are using an arbitrary piece of memory to build a slice (that's why from_raw_parts is unsafe), not because of the padding bytes. When you have the byte array, padding bytes are just normal bytes and when you have the struct, they are not accessible at all. – ASLLOP Nov 08 '21 at 12:38
  • `let s: MyStruct = unsafe { std::mem::transmute(*bytes) };` works, but I think `let p: *const [u8; std::mem::size_of::<MyStruct>()] = bytes as *const [u8; std::mem::size_of::<MyStruct>()]; let s: MyStruct = unsafe { std::mem::transmute(*p) };` is also good. – lechat Feb 12 '22 at 07:57
  • Going the other way (from bytes to struct) without a transmute: `unsafe fn u8_slice_as_any<T>(p: &[u8]) -> &T { assert_eq!(p.len(), ::core::mem::size_of::<T>()); &*(p.as_ptr() as *const T) }` – Jasha Feb 15 '23 at 14:18
29

(Shamelessly stolen and adapted from Renato Zannon's comment on a similar question)

Perhaps a solution like bincode would suit your case? Here's a working excerpt:

Cargo.toml

[package]
name = "foo"
version = "0.1.0"
authors = ["An Devloper <an.devloper@example.com>"]
edition = "2018"

[dependencies]
bincode = "1.0"
serde = { version = "1.0", features = ["derive"] }

main.rs

use serde::{Deserialize, Serialize};
use std::fs::File;

#[derive(Serialize, Deserialize)]
struct A {
    id: i8,
    key: i16,
    name: String,
    values: Vec<String>,
}

fn main() {
    let a = A {
        id: 42,
        key: 1337,
        name: "Hello world".to_string(),
        values: vec!["alpha".to_string(), "beta".to_string()],
    };

    // Encode to something implementing `Write`
    let mut f = File::create("/tmp/output.bin").unwrap();
    bincode::serialize_into(&mut f, &a).unwrap();

    // Or just to a buffer
    let bytes = bincode::serialize(&a).unwrap();
    println!("{:?}", bytes);
}

You would then be able to send the bytes wherever you want. I assume you are already aware of the issues with naively sending bytes around (like potential endianness issues or versioning), but I'll mention them just in case ^_^.
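As an illustration of the endianness issue mentioned above, here is a minimal sketch that fixes the byte order by hand with the standard library's to_le_bytes and from_le_bytes. The Header struct and the encode_header and decode_header functions are hypothetical names for a made-up 6-byte wire format. This is not how bincode is implemented.

```rust
/// Hypothetical fixed-size header: 2 bytes of key, then 4 bytes of count,
/// both little-endian regardless of the host's byte order.
struct Header {
    key: i16,
    count: u32,
}

fn encode_header(h: &Header) -> [u8; 6] {
    let mut out = [0u8; 6];
    out[..2].copy_from_slice(&h.key.to_le_bytes());
    out[2..].copy_from_slice(&h.count.to_le_bytes());
    out
}

fn decode_header(bytes: &[u8; 6]) -> Header {
    Header {
        key: i16::from_le_bytes([bytes[0], bytes[1]]),
        count: u32::from_le_bytes([bytes[2], bytes[3], bytes[4], bytes[5]]),
    }
}

fn main() {
    let h = Header { key: 1337, count: 2 };
    let wire = encode_header(&h);
    // 1337 is 0x0539, so the low byte 0x39 comes first.
    assert_eq!(wire, [0x39, 0x05, 0x02, 0x00, 0x00, 0x00]);

    let back = decode_header(&wire);
    assert_eq!((back.key, back.count), (1337, 2));
}
```

Because the byte order is chosen explicitly, a sender and a receiver with different native endianness still agree on the layout, which a raw memory view of the struct does not guarantee.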

Shepmaster
  • Worth noting that this isn't a direct conversion, while the encoding/decoding uses a binary format, this is not simply accessing the struct's memory (which may be seen as both a good and a bad thing) depending on what you want, it's performing some conversions. Bincode also does endian conversion for example. – ideasman42 Feb 19 '17 at 00:56
  • Use `bincode` when you don't care about performance. https://www.reddit.com/r/rust/comments/eg9cfm/it_seems_bincode_is_surprisingly_slow/ – محمد جعفر نعمة Oct 12 '22 at 10:52
  • This requires you to have your struct derive the Serialize trait, so it's not always possible for structs not defined in your crate. – jaques-sam Jun 27 '23 at 10:37
3

You can use the bytemuck crate to do it safely:

#[derive(bytemuck::NoUninit, Clone, Copy)]
#[repr(C)]
struct MyStruct {
    id: u8,
    data: [u8; 1024],
}

let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = bytemuck::bytes_of(&my_struct);
tcp_stream.write(bytes);

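Note that this requires the struct to be Copy and either #[repr(C)] or #[repr(transparent)]. The derive macro also needs bytemuck's derive feature enabled, and it rejects structs that contain padding bytes.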

Chayim Friedman