
If String is actually defined as

pub struct String {
    vec: Vec<u8>,
}

Then why is there a special syntax (&str) for a slice of a Vec<u8>? In Chapter 3 of "Programming Rust" by Jim Blandy & Jason Orendorff it says,

&str is very much like &[T]: a fat pointer to some data. String is analogous to Vec<T>

Following that statement there is a chart which shows all the ways they're similar, but there isn't any mention of a single method that they're different. Is a &str just a &[T]?

Likewise, the answer to "What are the differences between Rust's String and str?" says:

This is identical to the relationship between a vector Vec<T> and a slice &[T], and is similar to the relationship between by-value T and by-reference &T for general types.

That question focuses on the difference between String and &str. Knowing that a String really is a vector of u8, I'm more interested in &str, which I can't even find the source to. Why does this primitive even exist when we have a primitive (implemented as a fat pointer) for regular vector slices?

Evan Carroll
  • I get the feeling you don't quite believe that `str` is to `String` as `[T]` is to `Vec<T>`. `[T]` is built-in, just like `str` -- you wouldn't be able to find the source for `[T]` in the standard library. But you seem to be confused about `String`, and not about `Vec<T>` -- can you articulate why? – trent Aug 14 '19 at 15:50
  • Your question could just as well be "why not use `*mut u8`?" This is a basic object concept: `String` is a growable UTF-8 string and `str` is a fixed-size UTF-8 string. They are two types because they represent two different things. If we used `Vec<u8>`, what guarantee would we have that it holds a UTF-8 string? None. The user would have to know that. That's the C way. – Stargateur Aug 14 '19 at 16:02
  • @Stargateur to which the answer would be: because *pointer* dereferencing is impossible in safe Rust. –  Aug 16 '19 at 19:50

2 Answers


It exists for the same reason that String exists, and we don't just pass around Vec<u8> for every string.

  • A String is an owned, growable container of data that is guaranteed to be UTF-8.
  • A &str is a borrowed, fixed-length view of data that is guaranteed to be UTF-8.
  • A Vec<u8> is an owned, growable container of u8.
  • A &[u8] is a borrowed, fixed-length view of u8.
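
The four cases above can be sketched with the standard library alone. This is a minimal illustration, not part of the original answer: `bytes_as_text` is a hypothetical helper name, while `std::str::from_utf8` and `String::from_utf8` are the checked conversions that enforce the UTF-8 guarantee.

```rust
/// Hypothetical helper: try to view raw bytes as text.
fn bytes_as_text(bytes: &[u8]) -> Option<&str> {
    // The conversion validates the bytes and fails if they are not UTF-8.
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // Any byte sequence is a valid Vec<u8> / &[u8].
    let raw: Vec<u8> = vec![0xff, 0xfe];
    let view: &[u8] = &raw;
    assert!(bytes_as_text(view).is_none());

    // Only valid UTF-8 can become a String / &str.
    let owned: String = String::from_utf8(vec![b'h', b'i']).expect("valid UTF-8");
    let borrowed: &str = &owned;
    assert_eq!(borrowed, "hi");
    assert_eq!(bytes_as_text(owned.as_bytes()), Some("hi"));

    // The owned conversion rejects invalid bytes as well.
    assert!(String::from_utf8(raw).is_err());
}
```

Going from text to bytes (`as_bytes`) always succeeds, while going from bytes to text has to be checked; that asymmetry is the guarantee the separate types encode.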

This is effectively the reason that types exist, period — to provide abstraction and guarantees (a.k.a. restrictions) on a looser blob of bits.

If we had access to the string as &mut [u8], then we could trivially ruin the UTF-8 guarantee, which is why all such methods are marked as unsafe. Even with an immutable &[u8], we wouldn't be able to make assumptions (a.k.a. optimizations) about the data and would have to write much more defensive code everywhere.
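
As a rough sketch of what "marked as unsafe" means in practice, the example below uses `str::as_bytes_mut`, which the standard library exposes only as an `unsafe` method. The helper `uppercase_first_ascii` is a hypothetical name introduced for illustration; it takes care to keep the buffer valid UTF-8, which is exactly the obligation the `unsafe` block places on the caller.

```rust
/// Hypothetical helper: uppercase the first byte through the raw byte view.
fn uppercase_first_ascii(s: &mut str) {
    // SAFETY: a byte is only rewritten when it is ASCII, and the replacement
    // is also ASCII, so the buffer remains valid UTF-8.
    let bytes = unsafe { s.as_bytes_mut() };
    if let Some(first) = bytes.first_mut() {
        if first.is_ascii() {
            *first = first.to_ascii_uppercase();
        }
    }
}

fn main() {
    let mut s = String::from("héllo");
    uppercase_first_ascii(&mut s);
    assert_eq!(s, "Héllo");

    // Writing an arbitrary byte such as 0xff through the same view would
    // leave `s` holding invalid UTF-8, which later string operations are
    // allowed to assume never happens. That is why the method is `unsafe`.

    // Read-only byte access cannot break the invariant, so it is safe.
    let bytes: &[u8] = s.as_bytes();
    assert_eq!(bytes.len(), 6); // 'é' takes two bytes in UTF-8
}
```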

"but there isn't any mention of a single method that they're different"

Looking at the documentation for str and slice quickly shows a number of methods that exist on one that don't exist on the other, so I don't understand your statement. split_last is the first one that caught my eye, for example.
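
A small sketch of that difference, assuming nothing beyond the standard library: `split_last` is available on slices, while `chars` and `to_uppercase` are available on `str`. The helper `last_byte` is a hypothetical name used only to show the explicit step from text down to bytes.

```rust
/// Hypothetical helper: the final byte of a string, via the slice API.
fn last_byte(text: &str) -> Option<u8> {
    // `str` has no `split_last`; step down to `&[u8]` explicitly.
    text.as_bytes().split_last().map(|(last, _rest)| *last)
}

fn main() {
    // `split_last` is defined on slices.
    let numbers: &[i32] = &[1, 2, 3];
    let (last, rest) = numbers.split_last().expect("non-empty slice");
    assert_eq!(*last, 3);
    assert_eq!(rest.len(), 2);

    // `chars` and `to_uppercase` are defined on `str`; they rely on the
    // contents being valid UTF-8.
    let text: &str = "héllo";
    assert_eq!(text.chars().count(), 5); // five characters
    assert_eq!(text.len(), 6); // stored in six bytes
    assert_eq!(text.to_uppercase(), "HÉLLO");

    assert_eq!(last_byte(text), Some(b'o'));
}
```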

Shepmaster
  • Could you add to your 4 points above another pair of points for `Vec<char>` vs `[char]` for clarity? Perhaps that would help me refine my question. I hadn't realized that `&str` is a strongly typed pointer to UTF-8, whereas `String` is just a type with the constraint in its constructor, and that we would need the ability to differentiate `&mut [u8]` from a subtype that has to be constructed as UTF-8. – Evan Carroll Aug 14 '19 at 15:14
  • @EvanCarroll `Vec<char>` / `&[char]` don't have any real relation to `String` / `str`. At best, they would be a UTF-32 string, but generally there's no use for such a type. *`String` [...] constraint in its constructor* and in every mutating method. – Shepmaster Aug 14 '19 at 15:44

&str is not necessarily a view into a String; it can be a view into anything that is a valid UTF-8 string.

For example, the crate arraystring allows creating a string on the stack that can be viewed as a &str.
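
The same idea can be sketched without any external crate. The example below does not use the arraystring API; it relies only on the standard library and shows a `&str` borrowed from three different kinds of storage. The function `shout` is a hypothetical helper that accepts any borrowed UTF-8 text.

```rust
/// Hypothetical helper that accepts any borrowed UTF-8 text.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    // A string literal: the bytes live in the program's static data.
    let literal: &'static str = "static";

    // A heap-allocated String, borrowed as &str.
    let owned = String::from("heap");

    // A fixed-size byte array on the stack, validated and viewed as &str.
    let buffer: [u8; 5] = *b"stack";
    let on_stack: &str = std::str::from_utf8(&buffer).expect("valid UTF-8");

    assert_eq!(shout(literal), "STATIC");
    assert_eq!(shout(&owned), "HEAP");
    assert_eq!(shout(on_stack), "STACK");
}
```

Because `shout` takes `&str`, it does not need to know or care where the bytes are stored, only that they are valid UTF-8.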

Shepmaster
Boiethios