I have a u8 slice that I would like to convert into a string, treating each u8 as the literal Unicode code point (that is, U+0000 to U+00FF).

The closest thing I've found is `from_utf8`, but that interprets the slice as UTF-8. I don't want UTF-8 decoding; I want each byte taken as a literal code point.

How can I do this?

  • Rust's [`char`s](https://doc.rust-lang.org/std/primitive.char.html) are [Unicode scalar values](http://www.unicode.org/glossary/#unicode_scalar_value); if that's good enough, you can just cast each byte to `char`. – ljedrz Dec 29 '17 at 16:17
  • Possible duplicate of [What are the options to convert ISO-8859-1 / Latin-1 to a String (UTF-8)?](https://stackoverflow.com/questions/28169745/what-are-the-options-to-convert-iso-8859-1-latin-1-to-a-string-utf-8) – trent Dec 29 '17 at 20:49
  • Sounds like you're asking for exactly a decoding of Latin-1 encoded text to a `String`. – trent Dec 29 '17 at 20:49
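
A minimal sketch of the difference the comments above point at, assuming the input is the byte sequence for "café" in Latin-1. The helper name `bytes_to_chars` is made up for illustration. `from_utf8` rejects the lone `0xE9` byte, while casting each byte to `char` maps it directly to U+00E9:

```rust
/// Illustrative helper (hypothetical name): treats each byte as the
/// code point with the same numeric value, U+0000..=U+00FF.
fn bytes_to_chars(bytes: &[u8]) -> Vec<char> {
    // A `u8 as char` cast is always valid; every value 0..=255 is a scalar value.
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // 'c', 'a', 'f', then 0xE9, which is 'é' in Latin-1.
    let bytes: [u8; 4] = [0x63, 0x61, 0x66, 0xE9];

    // As UTF-8 this is invalid: 0xE9 begins a multi-byte sequence that never completes.
    assert!(std::str::from_utf8(&bytes).is_err());

    // Byte-to-code-point mapping succeeds for every possible input.
    let chars = bytes_to_chars(&bytes);
    assert_eq!(chars, vec!['c', 'a', 'f', 'é']);
    println!("{:?}", chars);
}
```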

1 Answer

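    fn main() {
        // Example input: the bytes of "test", each treated as a code point.
        let codepoint_array: Vec<u8> = "test".into();
        // `char::from(u8)` maps a byte to the code point with the same value.
        let codepoints: Vec<char> = codepoint_array.into_iter().map(char::from).collect();
        println!("{:?}", codepoints);
    }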

(I have no idea why you'd want to do this, since it only gives you Basic Latin (ASCII), the C1 control characters, and the Latin-1 Supplement, and nothing beyond U+00FF...)

notriddle
  • Oh, it's simple. The data I'm receiving is defined exactly like this. –  Dec 29 '17 at 17:15
  • You can also [collect into `String`](https://play.rust-lang.org/?gist=a2bfc4c1ce64658d0fec5382fa114960&version=stable). – Masaki Hara Dec 30 '17 at 01:45
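
A sketch of what collecting directly into a `String` could look like. This is not the code from the linked playground, and the function name `latin1_to_string` is a hypothetical one chosen for illustration. It relies on `String` implementing `FromIterator<char>`:

```rust
/// Hypothetical helper: decodes bytes as code points U+0000..=U+00FF
/// (equivalent to Latin-1 decoding) into an owned `String`.
fn latin1_to_string(bytes: &[u8]) -> String {
    // `char::from(u8)` is infallible, and an iterator of `char`
    // can be collected straight into a `String`.
    bytes.iter().map(|&b| char::from(b)).collect()
}

fn main() {
    // "test" followed by 0xFF, which becomes U+00FF ('ÿ').
    let s = latin1_to_string(&[0x74, 0x65, 0x73, 0x74, 0xFF]);
    assert_eq!(s, "testÿ");

    // The `String` is stored as UTF-8, so U+00FF occupies two bytes.
    assert_eq!(s.len(), 6);
    assert_eq!(s.chars().count(), 5);
    println!("{}", s);
}
```

Note that the resulting `String` can be longer in bytes than the input, because every code point from U+0080 to U+00FF takes two bytes in UTF-8.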