Here is the situation: I want to remove some invalid bytes (the varint length of the string) from the front of a string. At first I tried to use the drain method, but as the docs say:

Panics if the starting point or end point do not lie on a char boundary, or if they're out of bounds.

So I tried using a for loop to remove the prefix:


    let mut input = String::from_utf8_lossy(&[128,2,49]).into_owned();
    let len = 2;
    for _ in 0..len {
        input.remove(0);
    }

Is there a more efficient way to do this?

McGrady

1 Answer

It's unsound to ever create a String that contains invalid (i.e., non-UTF-8) bytes, so you can't use String methods to deal with them. If you have a slice of u8s, some section of which is valid UTF-8, the best way to deal with it is to convert only the portion that is valid:

    use std::str;

    let raw_input: &'static [u8] = &[128, 2, 49];
    let len = 2;
    // Skip the `len` prefix bytes, then validate only the remainder as UTF-8.
    let input = str::from_utf8(&raw_input[len..]).unwrap().to_owned();

If raw_input[len..] is not valid UTF-8, str::from_utf8() will return an Err value (which .unwrap() will turn into a panic), so this is only appropriate when you know that the UTF-8 data starts at len. This differs from from_utf8_lossy, which replaces invalid UTF-8 sequences with �; but if the string is meant to be well-formed UTF-8 except for the leading "garbage", from_utf8_lossy is not called for.
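If the offset can't be fully trusted, a non-panicking variant might look like the sketch below. The helper name `strip_prefix_bytes` is hypothetical (it is not a standard library function); it assumes you would rather get an error back than panic. It uses `slice::get` so that an out-of-range `len` yields `None`, and it hands back the `Utf8Error` from `str::from_utf8` instead of unwrapping it.

```rust
use std::str::{self, Utf8Error};

/// Hypothetical helper: skip `len` leading bytes and validate the rest as UTF-8.
/// Returns `None` if `len` is past the end of `raw`, and `Some(Err(_))` if the
/// remaining bytes are not valid UTF-8. No allocation happens here; the
/// returned `&str` borrows from `raw`.
fn strip_prefix_bytes(raw: &[u8], len: usize) -> Option<Result<&str, Utf8Error>> {
    raw.get(len..).map(str::from_utf8)
}

fn main() {
    let raw_input: &[u8] = &[128, 2, 49];
    match strip_prefix_bytes(raw_input, 2) {
        Some(Ok(s)) => println!("decoded: {s}"),
        Some(Err(e)) => println!("invalid UTF-8 after prefix: {e}"),
        None => println!("prefix length exceeds input length"),
    }
}
```

Call `.to_owned()` on the resulting `&str` only if you actually need a `String`; otherwise the borrowed slice avoids copying the data at all.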

There is no need to use unsafe for this.

Also see: How do I convert a Vector of bytes (u8) to a string

trent
  • Thanks for your explanation. It seems that I shouldn't use String as the data structure to deal with this data. – McGrady Oct 05 '19 at 14:52