3

What I did

  • Stored a UUID as BINARY(16) in NodeJS using
const uuid = Buffer.from('myEditedUuid');

(A followup to How do I fetch binary columns from MySQL in Rust?)

What I want to do

I want to fetch said UUID using the Rust mysql crate (https://docs.rs/mysql/20.0.0/mysql/). I am currently using Vec<u8> to read the UUID:

#[derive(Debug, PartialEq, Eq, Serialize)]
pub struct Policy {
    sub: String,
    contents: Option<String>,
}

#[derive(Debug, PartialEq, Eq, Serialize)]
pub struct RawPolicy {
    sub: Option<Vec<u8>>,
    contents: Option<String>,
}

// fetch policies themselves
let policies: Vec<RawPolicy> = connection.query_map("SELECT sub, contents FROM policy", |(sub, contents)| {
    RawPolicy { sub, contents }
},)?;

// convert uuid to string
let processed = policies.into_iter().map(|policy| {
    let sub = policy.sub.unwrap();
    let sub_string = String::from_utf8(sub).unwrap().to_string();
    Policy {
        sub: sub_string,
        contents: policy.contents,
    }
}).collect();

What my problem is

In Node, I would receive a Buffer from said database and use something like uuidBuffer.toString('utf8'). So in Rust, I tried String::from_utf8(), but said Vec does not seem to contain valid UTF-8:

panicked at 'called `Result::unwrap()` on an `Err` value: FromUtf8Error { bytes: [17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71], error: Utf8Error { valid_up_to: 1, error_len: Some(1) } }'

My question is

Is using Vec the correct way of fetching BINARY columns, and if so, how do I convert them back to a string?

Edit1:

Node seems to display the bytes of a Buffer in base 16 when a string is converted to a Buffer (Buffer.from('abcd') => <Buffer 61 62 63 64>).
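To make the base-16 observation concrete, here is a minimal Rust sketch. It assumes the Buffer printout shows hexadecimal byte values, so 0x61 and 97 are the same byte. The helper name `decode_utf8` is made up for this illustration. Plain ASCII text such as 'abcd' decodes fine, while the 16 bytes from the error message above do not.

```rust
/// Hypothetical helper: decode bytes as UTF-8, returning None when they are invalid.
fn decode_utf8(bytes: &[u8]) -> Option<String> {
    String::from_utf8(bytes.to_vec()).ok()
}

fn main() {
    // The same four bytes, once in hexadecimal and once in decimal notation.
    let hex_notation: [u8; 4] = [0x61, 0x62, 0x63, 0x64];
    let decimal_notation: [u8; 4] = [97, 98, 99, 100];
    assert_eq!(hex_notation, decimal_notation);

    // ASCII text is valid UTF-8, so it converts back to a String.
    println!("{:?}", decode_utf8(&hex_notation));

    // The raw UUID bytes from the question are not valid UTF-8.
    let uuid_bytes = [17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71];
    println!("{:?}", decode_utf8(&uuid_bytes));
}
```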

Fetching my parsed UUID (created with Buffer.from()) in Rust gives me Vec<u8> [17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71], which throws said UTF-8 error. Vec does not seem to be allowed by MySQL in Rust.

  • If your UUID is actually stored as binary data and not a string representation (e.g. hex), then you have to use `Vec` on Rust side, not `String`. `String` can only deal with valid UTF-8 data. – justinas Oct 01 '20 at 08:17
  • I did, this is what the PolicyRaw Struct is for. But at some point, I have to convert it back somehow. – CodingVampyre Oct 01 '20 at 09:40
  • I think that you either need to use `from_utf8_lossy` or `from_utf16_lossy` – edkeveked Oct 01 '20 at 09:43

2 Answers

2

The solution is simple: you need to convert the BINARY value to hex, either in your database query or in your code. So either use the hex crate (https://docs.rs/hex/0.4.2/hex/) or rewrite your query:

Rewriting The Query

let policies: Vec<RawPolicy> = connection.query_map("SELECT hex(sub), contents FROM policy", |(sub, contents)| {
 RawPolicy { sub, contents }
},)?;

This converts sub to a string of hex digits in the database. The resulting Vec now holds plain ASCII characters and can be converted using

let sub = policy.sub.unwrap();
let sub_string = String::from_utf8(sub).unwrap();
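For the "in your code" variant, a minimal sketch using only the standard library might look like the following. The function name `to_hex` is made up for this example, and the hex crate linked above offers an equivalent ready-made encoder. Note that this sketch produces lowercase digits, whereas MySQL's HEX() returns uppercase.

```rust
use std::fmt::Write;

/// Encode raw bytes as a lowercase hex string (two hex digits per byte).
fn to_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for byte in bytes {
        // Writing into a String cannot fail, so unwrapping is safe here.
        write!(out, "{:02x}", byte).unwrap();
    }
    out
}

fn main() {
    // The raw BINARY(16) value from the question, fetched as Vec<u8>.
    let sub: Vec<u8> = vec![17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71];
    println!("{}", to_hex(&sub));
}
```

With this approach the original query (without hex()) stays unchanged and RawPolicy keeps its Vec<u8> field.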
Tyrar Fox
0

from_utf8_lossy can be used

let input = [17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71];
let output = String::from_utf8_lossy(&input); // "\u{11}�O=c�\n��g��\u{6}HG"

Invalid byte sequences will be replaced by U+FFFD (�), the Unicode replacement character.

The output "\u{11}�O=c�\n��g��\u{6}HG" is the same as the Node.js output "\u0011�O=c�\n��g��\u0006HG". Unless this string is to be sent to a JavaScript runtime, it should be kept that way.

But if this string is to be sent to a JavaScript runtime (browser or Node.js), the Rust code point notation \u{x} should be replaced with the equivalent JavaScript notation (\uXXXX).
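One caveat is worth illustrating: the lossy conversion is one-way. Every invalid sequence collapses into the same replacement character, so the original bytes cannot be recovered from the resulting string. The sketch below checks this; the helper name `survives_lossy` is made up for the example.

```rust
/// Hypothetical helper: true if the bytes come back unchanged after
/// a trip through `String::from_utf8_lossy`.
fn survives_lossy(bytes: &[u8]) -> bool {
    let text = String::from_utf8_lossy(bytes);
    text.as_bytes() == bytes
}

fn main() {
    let uuid_bytes = [17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71];

    // Valid UTF-8 (here plain ASCII) is preserved byte for byte.
    println!("ascii preserved: {}", survives_lossy(b"abcd"));

    // The UUID bytes contain invalid sequences, which are replaced by U+FFFD.
    println!("uuid preserved: {}", survives_lossy(&uuid_bytes));
}
```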

from_utf16_lossy can be used as well

If the values are UTF-16 code units rather than UTF-8 bytes, they will be decoded as such; any invalid code units are rendered with the same replacement character.

let input:&[u16] = &vec![17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71];
println!("{}", String::from_utf16_lossy(input))

edkeveked
  • Thanks so far! My Value looks like `"sub":"\u0011�O=c�\n��g��\u0006HG"`. So there is still some piece of the puzzle missing – CodingVampyre Oct 01 '20 at 09:45
  • you mean you rather want to have the hex encoding of invalid characters – edkeveked Oct 01 '20 at 09:48
  • The best thing would be to get back the plain string I converted in Node.js before. For example, if I had a string 'abcd', Node gives me `<Buffer 61 62 63 64>`, which I can store in my `BINARY` column. I just want to have my `abcd` in Rust again. – CodingVampyre Oct 01 '20 at 10:35
  • this one does just that: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a9449bccdfb10508e3b34da9adfa3ff2. Btw, "a" encoding is not 61 but 97 – edkeveked Oct 01 '20 at 10:40
  • But your UUID is not `abcd`. You can convert a `Vec` consisting of valid ASCII / UTF-8 bytes to string. You *can not* directly convert arbitrary bytes (such as a UUID) to a Rust string. – justinas Oct 01 '20 at 10:45
  • So what should I do then? @justinas – CodingVampyre Oct 01 '20 at 11:03
  • @CodingVampyre why do you want to convert to a `String`, what value do you think it would provide? Using `Vec` is the intended way to handle binary data. – justinas Oct 01 '20 at 11:21
  • it's a parsed UUID which is stored similar to https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/. I want to change it back into a string to send it to the client. – CodingVampyre Oct 01 '20 at 11:36
  • `uuid` crate has functions for formatting UUID as a string. You probably want the [hyphenated](https://docs.rs/uuid/0.8.1/uuid/struct.Uuid.html#method.to_hyphenated_ref) formatter which will format the string as hex with hyphens inbetween. – justinas Oct 01 '20 at 12:03
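
As a rough illustration of what the last comment suggests, the sketch below formats 16 raw bytes as a hyphenated UUID using only the standard library, in place of the `uuid` crate's own formatter. Both function names are made up for this example. `restore_field_order` additionally assumes the bytes were stored with the time fields rearranged as in the Percona article linked above (time-high, time-mid, time-low, then the rest); if the Node side stored the UUID bytes in their normal order, that step should be skipped.

```rust
/// Format 16 raw bytes as a lowercase, hyphenated UUID string (8-4-4-4-12 hex digits).
/// Returns None if the slice is not exactly 16 bytes long.
fn format_hyphenated(bytes: &[u8]) -> Option<String> {
    if bytes.len() != 16 {
        return None;
    }
    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    Some(format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8], &hex[8..12], &hex[12..16], &hex[16..20], &hex[20..32]
    ))
}

/// Hypothetical helper, ASSUMING the stored layout is
/// time_hi (2 bytes), time_mid (2), time_low (4), remaining 8 bytes.
/// Moves the time fields back into standard UUID order.
fn restore_field_order(stored: &[u8]) -> Option<Vec<u8>> {
    if stored.len() != 16 {
        return None;
    }
    let mut out = Vec::with_capacity(16);
    out.extend_from_slice(&stored[4..8]); // time_low
    out.extend_from_slice(&stored[2..4]); // time_mid
    out.extend_from_slice(&stored[0..2]); // time_hi_and_version
    out.extend_from_slice(&stored[8..16]); // clock sequence and node, unchanged
    Some(out)
}

fn main() {
    let sub: Vec<u8> = vec![17, 234, 79, 61, 99, 181, 10, 240, 164, 224, 103, 175, 134, 6, 72, 71];

    // Bytes formatted exactly as stored.
    println!("{:?}", format_hyphenated(&sub));

    // Bytes formatted after undoing the assumed reordering.
    let restored = restore_field_order(&sub).and_then(|bytes| format_hyphenated(&bytes));
    println!("{:?}", restored);
}
```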