I'm having trouble writing Vec<u16> content to a file:

use std::fs::File;
use std::io::{Write, BufWriter};
use std::mem;

#[derive(Debug, Copy, Clone, PartialEq)]
pub enum ImageFormat {
    GrayScale,
    Rgb32,
}

#[derive(Debug, Copy, Clone, PartialEq)]
pub struct ImageHeader {
    pub width: usize,
    pub height: usize,
    pub format: ImageFormat,
}

pub struct Image {
    pub header: ImageHeader,
    pub data: Vec<u16>,
}

fn write_to_file(path: &str, img: &Image) -> std::io::Result<()> {
    let f = File::create(path)?;
    let mut bw = BufWriter::new(f);
    let slice = &img.data[..];
    println!("before length: {}", slice.len());
    // Reinterpret the u16 slice as bytes; the length is carried over unchanged.
    let sl: &[u8] = unsafe { mem::transmute::<&[u16], &[u8]>(slice) };
    println!("after length: {}", sl.len());
    bw.write_all(sl)?;
    Ok(())
}

fn main() {}

Since write_all() asks for a &[u8], I'm doing an unsafe conversion of &[u16] to &[u8]. Because the conversion does not change the slice length (slice.len() and sl.len() have the same values), only half of the image data is output to the file.
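The length mismatch described above can be checked without any unsafe code. The following is a minimal sketch; `byte_len` is a hypothetical helper name, not something from the original code. It shows that the number of bytes behind a `&[u16]` is twice its element count, which is why a byte slice that keeps the element count covers only half of the data.

```rust
use std::mem;

/// Number of bytes occupied by the elements of a `u16` slice
/// (hypothetical helper, for illustration only).
fn byte_len(slice: &[u16]) -> usize {
    // size_of_val on a slice returns element count * element size.
    mem::size_of_val(slice)
}

fn main() {
    let data: Vec<u16> = vec![1, 2, 3, 4];
    let slice: &[u16] = &data[..];

    println!("elements: {}", slice.len());
    println!("bytes:    {}", byte_len(slice));

    // Each u16 takes two bytes, so the byte count is double the element count.
    assert_eq!(byte_len(slice), slice.len() * mem::size_of::<u16>());
}
```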

Ideally, I'd like to avoid both the unsafe conversion and any copying.

Shepmaster
rillomas

2 Answers

To do it directly you'd want to use std::slice::from_raw_parts():

use std::{mem, slice};

fn main() {
    let slice_u16: &[u16] = &[1, 2, 3, 4, 5, 6];
    println!("u16s: {:?}", slice_u16);

    let slice_u8: &[u8] = unsafe {
        slice::from_raw_parts(
            slice_u16.as_ptr() as *const u8,
            slice_u16.len() * mem::size_of::<u16>(),
        )
    };

    println!("u8s: {:?}", slice_u8);
}

It does require unsafe because from_raw_parts() can't guarantee that you passed a valid pointer to it, and it can also create slices with arbitrary lifetimes.
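One way to contain the unsafety is to hide the `from_raw_parts()` call behind a small function whose signature ties the lifetime of the byte slice to the input. The sketch below assumes this is acceptable for the image-writing case in the question; `as_native_bytes` and `write_native` are hypothetical names, and the bytes come out in the machine's native byte order.

```rust
use std::io::{self, Write};
use std::{mem, slice};

/// Views a `u16` slice as raw bytes in native byte order
/// (hypothetical helper, not part of std).
fn as_native_bytes(data: &[u16]) -> &[u8] {
    // SAFETY: the pointer and length come from a valid slice, `u8` has
    // alignment 1 and no invalid bit patterns, and the returned lifetime
    // is tied to `data` by the function signature.
    unsafe { slice::from_raw_parts(data.as_ptr() as *const u8, mem::size_of_val(data)) }
}

/// Writes the samples to any writer without copying them first.
fn write_native<W: Write>(mut w: W, data: &[u16]) -> io::Result<()> {
    w.write_all(as_native_bytes(data))
}

fn main() -> io::Result<()> {
    let data: Vec<u16> = vec![1, 2, 3];
    // A Vec<u8> stands in for a BufWriter<File> here.
    let mut out: Vec<u8> = Vec::new();
    write_native(&mut out, &data)?;
    println!("wrote {} bytes", out.len());
    Ok(())
}
```

Because the function takes `&[u16]` and returns `&[u8]` with the same elided lifetime, callers cannot end up with a byte slice that outlives the original data.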

This approach is not only potentially unsafe, it is also not portable. When you work with integers larger than one byte, endianness issues immediately arise. If you write a file this way on a little-endian machine (such as x86), you would read garbage on a big-endian machine. The proper way is to use a library like byteorder, which allows you to specify endianness explicitly:

use byteorder::{LittleEndian, WriteBytesExt}; // 1.3.4

fn main() {
    let slice_u16: &[u16] = &[1, 2, 3, 4, 5, 6];
    println!("u16s: {:?}", slice_u16);

    let mut result: Vec<u8> = Vec::new();
    for &n in slice_u16 {
        let _ = result.write_u16::<LittleEndian>(n);
    }

    println!("u8s: {:?}", result);
}

Note that I've used a Vec<u8> here because it implements Write. write_u16() and the other methods from the WriteBytesExt trait are defined on any Write, so you could use them directly on a BufWriter, for example.
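The same idea can be sketched with the standard library alone, using the integer method `to_le_bytes()` instead of the byteorder crate. This is an illustrative variant, not the code from the answer; `write_u16s_le` is a hypothetical helper name. It streams each value into the writer, so no intermediate byte vector is needed.

```rust
use std::io::{self, BufWriter, Write};

/// Writes each sample as two little-endian bytes
/// (hypothetical helper, std only).
fn write_u16s_le<W: Write>(w: &mut W, data: &[u16]) -> io::Result<()> {
    for &n in data {
        // to_le_bytes() yields [low byte, high byte] on every platform.
        w.write_all(&n.to_le_bytes())?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let data: Vec<u16> = vec![1, 2, 0x0304];
    // A Vec<u8> stands in for a File; BufWriter batches the small writes.
    let mut bw = BufWriter::new(Vec::new());
    write_u16s_le(&mut bw, &data)?;
    bw.flush()?;
    println!("{:?}", bw.get_ref());
    Ok(())
}
```

Wrapping the destination in a BufWriter keeps the per-element `write_all` calls cheap, since they only append to an in-memory buffer until it fills.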

Once written, you can use methods from the ReadBytesExt trait to read the data back.
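Reading the data back follows the same pattern in reverse. The sketch below again uses only the standard library (`u16::from_le_bytes`) rather than ReadBytesExt, and assumes the input consists solely of little-endian u16 values; `read_u16s_le` is a hypothetical helper name.

```rust
use std::io::{self, Read};

/// Reads little-endian u16 samples until the reader is exhausted
/// (hypothetical helper, std only).
fn read_u16s_le<R: Read>(r: &mut R) -> io::Result<Vec<u16>> {
    let mut bytes = Vec::new();
    r.read_to_end(&mut bytes)?;
    // A trailing single byte cannot form a u16, so treat it as corrupt input.
    if bytes.len() % 2 != 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "odd number of bytes",
        ));
    }
    Ok(bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect())
}

fn main() -> io::Result<()> {
    let stored: &[u8] = &[1, 0, 2, 0, 4, 3];
    // &[u8] implements Read, so it can stand in for a File.
    let mut reader = stored;
    let values = read_u16s_le(&mut reader)?;
    println!("{:?}", values);
    Ok(())
}
```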

While this may be slightly less efficient than reinterpreting a piece of memory, it is safe and portable.

Shepmaster
Vladimir Matveev
  • Thanks! I tried your method of using `slice::from_raw_parts()` and it worked perfectly. – rillomas Jun 16 '15 at 00:14
  • @rillomas Please consider using the `byteorder` method! ARM machines are very common nowadays: not only phones, but the new Macs too. If you are using `from_raw_parts`, files saved on an x86 machine (most Windows and desktop Linux systems) will not be readable from a phone or new Mac. – This company is turning evil. Dec 04 '20 at 20:06
  • (And I do realize this question is 5 years old, just commenting for future passersby) – This company is turning evil. Dec 04 '20 at 20:18
  • I am new to Rust. I like the first way. It is more similar to C `fwrite`. Although it might not be portable and is unsafe, it is a quicker way to dump or slurp in a large temporary file when memory is limited. The second method is more portable and safe, but it seems we have to do a new memory allocation and convert to bytes element by element. Thanks for the great answer. – gbinux Jun 29 '22 at 20:37
I recommend using existing libraries for serialization such as serde and bincode:

// Assumes serde 1.x (with the "derive" feature) and bincode 1.x.
use serde::{Deserialize, Serialize};
use std::error::Error;

#[derive(Serialize, Deserialize, Debug, Copy, Clone, PartialEq)]
pub enum ImageFormat {
    GrayScale,
    Rgb32,
}

#[derive(Serialize, Deserialize, Debug, Copy, Clone, PartialEq)]
pub struct ImageHeader {
    pub width: usize,
    pub height: usize,
    pub format: ImageFormat,
}

#[derive(Serialize, Deserialize)]
pub struct Image {
    pub header: ImageHeader,
    pub data: Vec<u16>,
}

impl Image {
    fn save_to_disk(&self, path: &str) -> Result<(), Box<dyn Error>> {
        use std::fs::File;
        use std::io::Write;
        // Encode the whole struct (header and pixel data) into a byte buffer.
        let bytes: Vec<u8> = bincode::serialize(self)?;
        let mut file = File::create(path)?;
        file.write_all(&bytes)?;
        Ok(())
    }
}

fn main() {
    let image = Image {
        header: ImageHeader {
            width: 2,
            height: 2,
            format: ImageFormat::GrayScale,
        },
        data: vec![0, 0, 0, 0],
    };

    match image.save_to_disk("image") {
        Ok(_) => println!("Saved image to disk"),
        Err(e) => println!("Something went wrong: {}", e),
    }
}
Shepmaster
A.B.
  • Thanks! I also tried your method of using bincode + rustc-serialize, and personally I prefer this method more than dumping raw data to a file (no `unsafe`, no need to worry about endianness). I'm accepting Vladimir's answer because it was more on topic, but like you said using serialization is probably the better way. – rillomas Jun 16 '15 at 00:26