
I'm very new to Rust. How can I get the UTF-8 index of a char in Rust?

Here is a reference to the UTF table.

let bracket = '[';

fn get_utf(c: &[u8]) -> &str {
  // don't know how to obtain utf index
}

let result = get_utf(bracket); // 005B

I tried this function, but it does not work the way I expect.

This crate might be useful, but I don't know how to use it.

Sorry that I can't show much effort of my own.

kometen
    UTF-8 is a variable-length, multi-byte encoding for all symbols ("code points") of all scripts in Unicode, which is a numbering of code points. For 7-bit values, UTF-8 == ASCII (a single byte), and '[' cast to an int is 0x5B == 91. – Joop Eggen Jan 24 '21 at 18:03

1 Answer


A `char` in Rust represents a Unicode scalar value. You can use `as` to cast it to a `u32`:

let bracket = '[';
let result = bracket as u32;

println!("{:04X}", result); // prints "005B"

See also: How to get a char's unicode value?
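
The comment under the question distinguishes a code point from its UTF-8 encoding, so here is a minimal sketch of both, assuming a four-digit hex string is the desired result. The helper names `code_point_hex` and `utf8_bytes` are hypothetical and exist only for this illustration; `char::encode_utf8` is from the standard library.

```rust
/// Hypothetical helper: formats a char's Unicode code point as
/// uppercase hex, zero-padded to at least four digits.
fn code_point_hex(c: char) -> String {
    format!("{:04X}", c as u32)
}

/// Hypothetical helper: returns the bytes of the char's UTF-8 encoding.
fn utf8_bytes(c: char) -> Vec<u8> {
    // A single char never needs more than 4 bytes in UTF-8.
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

fn main() {
    // ASCII: the code point and the single UTF-8 byte have the same value.
    assert_eq!(code_point_hex('['), "005B");
    assert_eq!(utf8_bytes('['), vec![0x5B]);

    // Outside ASCII they differ: U+00E9 is encoded as two bytes.
    assert_eq!(code_point_hex('\u{E9}'), "00E9");
    assert_eq!(utf8_bytes('\u{E9}'), vec![0xC3, 0xA9]);

    println!("{}", code_point_hex('[')); // prints "005B"
}
```

Taking a `char` instead of the `&[u8]` from the question avoids decoding bytes by hand, and returning an owned `String` avoids the lifetime problem that a `&str` return type would run into.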

kmdreko
    Since the OP is new to Rust and wants to assign the value to a variable, this might supplement the answer: `let result = format!("{:04X}", bracket as u32);` – Ömer Erden Jan 24 '21 at 18:22
    I edited it to be a little more explicit. It's not clear to me what the OP wants to use it for, so I don't know whether it should be an integer or a string. But I hope the example helps regardless. – kmdreko Jan 24 '21 at 18:27