
I have a Vec<u8> of bytes read from a file - the bytes are a text format (probably UTF-16 or some other silly 2 byte format) and I want to turn it into UTF-8.

let title = Vec::from_iter(bytes.take(title_length));
// Some Vec<u8> to &[u16] magic
let title = String::from_utf16_lossy(title);

Currently I'm using this rather dirty code:

let title: &[u16] = unsafe { std::slice::from_raw_parts(title_data.as_ptr(), title_data.len()) };

While this should work, I'm getting errors, probably due to the take() call:

error: mismatched types:
 expected `*const u16`,
    found `*const core::result::Result<u8, std::io::error::Error>`
(expected u16,
    found enum `core::result::Result`) [E0308]

Should I map the take iterator or something?

Shepmaster
J V
  • That would've been a very good question if it didn't look incomplete. ;) – E_net4 Mar 27 '16 at 20:12
  • That's weird. It was giving me issues while saving it too. Hang on a second :/ – J V Mar 27 '16 at 20:21
  • Please provide an [MCVE](/help/mcve) of your problem. One line of your code references `title_data`, but that's never defined anywhere. Ideally, provide code that reproduces the error on [the Rust Playground](https://play.rust-lang.org/) – Shepmaster Mar 27 '16 at 21:13

3 Answers

With Safe Code

Just in case you need to do it safely,

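// `bytes` yields io::Result<u8>, so unwrap (or propagate) each item first.
let title: Vec<u8> = bytes.take(title_length).map(|b| b.unwrap()).collect();
// Combine each pair of bytes into one u16. from_ne_bytes assumes the file uses
// the machine's byte order; use from_le_bytes or from_be_bytes if it is known.
let title: Vec<u16> = title
    .chunks_exact(2)
    .map(|a| u16::from_ne_bytes([a[0], a[1]]))
    .collect();
let title = String::from_utf16_lossy(&title);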

Note that this will allocate memory, and do an extra copy (which the unsafe alternatives don't do).
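If the byte order of the file is known, the intermediate Vec<u16> can be skipped by feeding the code units straight into char::decode_utf16. The following is only a sketch: the function name and the little_endian flag are made up for illustration, and a trailing odd byte is silently ignored.

```rust
/// Decode bytes holding UTF-16 text into a `String`.
/// `decode_utf16_bytes` and `little_endian` are hypothetical names for this sketch.
fn decode_utf16_bytes(bytes: &[u8], little_endian: bool) -> String {
    // Pair up the bytes lazily; `chunks_exact(2)` drops a trailing odd byte.
    let units = bytes.chunks_exact(2).map(|pair| {
        let pair = [pair[0], pair[1]];
        if little_endian {
            u16::from_le_bytes(pair)
        } else {
            u16::from_be_bytes(pair)
        }
    });
    // Unpaired surrogates become U+FFFD, matching `from_utf16_lossy`.
    char::decode_utf16(units)
        .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect()
}

fn main() {
    let le = [0x48, 0x00, 0x69, 0x00];
    println!("{}", decode_utf16_bytes(&le, true));
}
```

This still allocates the resulting String, but not a second buffer for the u16 values.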

AldaronLau

In the end I mapped unwrap over the iterator, though I'm still confused as to why an iterator needs to consist of results.

let title_data = Vec::from_iter(bytes.take(title_length).map(|x| x.unwrap()));
let title: &[u16] = unsafe {
    std::slice::from_raw_parts(title_data.as_ptr() as *const u16, title_data.len() / 2)
};
let title = String::from_utf16_lossy(title);
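
One caveat with the cast above: a Vec<u8> is not guaranteed to be 2-byte aligned, and reading u16 values through a misaligned pointer is undefined behaviour. A possible way around that, sketched here with a hypothetical helper name, is slice::align_to, which only hands back the correctly aligned part; the sketch falls back to copying when the buffer is not aligned.

```rust
use std::borrow::Cow;

/// Hypothetical helper: view bytes as native-endian u16s, borrowing when the
/// buffer happens to be aligned and copying otherwise.
fn as_u16_units(bytes: &[u8]) -> Cow<'_, [u16]> {
    // Drop a trailing odd byte so the length is a whole number of u16s.
    let bytes = &bytes[..bytes.len() / 2 * 2];
    // SAFETY: every bit pattern is a valid u16, and `align_to` only places
    // correctly aligned elements in the middle slice.
    let (prefix, middle, suffix) = unsafe { bytes.align_to::<u16>() };
    if prefix.is_empty() && suffix.is_empty() {
        Cow::Borrowed(middle)
    } else {
        // Unaligned buffer: fall back to the safe, copying conversion.
        Cow::Owned(
            bytes
                .chunks_exact(2)
                .map(|p| u16::from_ne_bytes([p[0], p[1]]))
                .collect(),
        )
    }
}

fn main() {
    let title_data: Vec<u8> = "Hi".encode_utf16().flat_map(|u| u.to_ne_bytes()).collect();
    let title = String::from_utf16_lossy(&as_u16_units(&title_data));
    println!("{}", title);
}
```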
J V
  • because it's an IO iterator and IO operations can fail. Would you prefer your application to quit with a panic just because some file could no longer be read? – llogiq Mar 28 '16 at 12:46
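
To illustrate the comment above: instead of unwrapping each byte, the iterator of io::Result<u8> can be collected into a single io::Result<Vec<u8>>, so a failed read is returned to the caller rather than causing a panic. This is a minimal sketch; read_title_bytes is a made-up name, and an in-memory byte slice stands in for the file.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read up to `len` bytes, surfacing the first I/O error.
fn read_title_bytes<R: Read>(reader: R, len: usize) -> io::Result<Vec<u8>> {
    // Collecting an iterator of `Result<u8, E>` into `Result<Vec<u8>, E>`
    // stops at the first error instead of panicking.
    reader.bytes().take(len).collect()
}

fn main() -> io::Result<()> {
    // `&[u8]` implements `Read`, so it can stand in for a file here.
    let data: &[u8] = &[0x48, 0x00, 0x69, 0x00, 0xFF];
    let title_data = read_title_bytes(data, 4)?;
    println!("{:?}", title_data);
    Ok(())
}
```

With a real file it is probably worth wrapping it in a std::io::BufReader first, since bytes() on an unbuffered reader tends to be slow.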

There are two errors. First, you need to .unwrap() your Result (of from_raw_parts(..), I presume), second the length is too large, because a u16 takes twice the space of a u8, so you need to divide by 2.

llogiq
  • Unfortunately, I can't unwrap the result - `from_raw_parts` is giving me a slice *containing* results: `error: no method named 'unwrap' found for type '&[core::result::Result<u8, std::io::error::Error>]' in the current scope` – J V Mar 28 '16 at 08:23
  • See, that's why you should have written a complete example. I can only guess that `bytes` returns `Option<Result<u8>>`s. – llogiq Mar 28 '16 at 12:24
  • It doesn't. [It returns `std::iter::Take`](https://doc.rust-lang.org/std/io/struct.Bytes.html#method.take) – J V Mar 28 '16 at 12:35