I will try to give a different perspective. In Rust there is a general convention: if you have a variable of some type T, it means that you own the data associated with T. If you have a variable of type &T, then you don't own the data.
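As a minimal sketch of that convention (the `Buffer` type and both function names are made up for illustration), a function taking T consumes the value, while a function taking &T only borrows it:

```rust
// Hypothetical type used only to illustrate the convention.
struct Buffer {
    bytes: Vec<u8>,
}

// Takes `Buffer` by value: the caller gives up ownership,
// and the buffer is dropped when this function returns.
fn consume(b: Buffer) -> usize {
    b.bytes.len()
}

// Takes `&Buffer`: the caller keeps ownership and can use the buffer afterwards.
fn inspect(b: &Buffer) -> usize {
    b.bytes.len()
}

fn main() {
    let owned = Buffer { bytes: vec![1, 2, 3] };
    let n = inspect(&owned); // borrow: `owned` is still usable
    let m = consume(owned);  // move: `owned` can no longer be used
    assert_eq!(n, m);
    // inspect(&owned); // would not compile: value was moved
}
```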
Now let's consider a heap-allocated string. According to this convention, there should be a non-reference type that represents ownership of the allocation. And indeed such a type exists: String.
There is also a different kind of string: &'static str. These strings are not owned by anyone: exactly one instance of the string is placed inside the compiled binary, and only pointers are passed around. There is no allocation and no deallocation, hence no ownership. In a sense, static strings are owned by the compiler, not by the programmer. This is why String cannot be used to represent a static string.
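A small sketch of the difference (the `greeting` function is a made-up name): copying a static string copies only the reference, whereas turning it into a String allocates and copies the bytes.

```rust
// A string literal has type &'static str: its bytes live in the compiled binary.
fn greeting() -> &'static str {
    "hello, world!"
}

fn main() {
    let s: &'static str = greeting();

    // Copying the reference copies a pointer and a length, not the bytes.
    let t = s;
    assert_eq!(s.as_ptr(), t.as_ptr());

    // An owned String has to allocate and copy the bytes to the heap,
    // so its data lives at a different address.
    let owned: String = s.to_string();
    assert_ne!(owned.as_ptr(), s.as_ptr());
    assert_eq!(owned, s);
}
```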
Alright, so why not use &String to represent a static string? Imagine a world where the following code is valid Rust:
let s: &'static String = "hello, world!";
This might look fine, but implementation-wise it is suboptimal: String itself has a pointer to the actual data, so &String has to be basically a pointer to a pointer. This violates the zero-cost abstraction principle: why introduce an extra level of indirection when the compiler statically knows the address of "hello, world!"?
Even if the compiler were somehow smart enough to decide that the extra pointer is not needed here (which would lead to a bunch of other problems), String itself still contains three pointer-sized fields (8 bytes each on a 64-bit target):
- Data pointer;
- Data length;
- Allocation capacity - lets us know how much free space there is after the data.
However, when we are talking about static strings, capacity makes zero sense: static strings are read-only.
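The layout described above can be checked with a short sketch. It relies on how the standard library currently implements String (three machine words), which is an implementation detail rather than a language guarantee; the helper names are made up.

```rust
use std::mem::size_of;

// Size of a type measured in machine words (usize-sized units).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

// Hypothetical helper: returns (length, capacity) of a fresh String
// that reserves room for at least `cap` bytes.
fn len_and_capacity(cap: usize) -> (usize, usize) {
    let s = String::with_capacity(cap);
    (s.len(), s.capacity())
}

fn main() {
    // Pointer + length + capacity (current std implementation).
    assert_eq!(words::<String>(), 3);
    // A reference to a String is one thin pointer, pointing at those three words.
    assert_eq!(words::<&String>(), 1);

    // Capacity only matters for growable buffers: space reserved but not yet used.
    let (len, capacity) = len_and_capacity(16);
    assert_eq!(len, 0);
    assert!(capacity >= 16);
}
```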
So, in the end, when the compiler sees &'static String, we actually want it to store only a data pointer and a length; otherwise we are paying for something we will never use, which goes against the zero-cost abstraction principle. This would be arcane wizardry to ask of the compiler: the variable's type is &String, but the variable itself is anything but a reference to a String.
To make this work, we actually need a different type, not &String, that holds only a data pointer and a length. And here it is: &str! It is better than &String in a number of ways:
- Does not have an excessive level of indirection - only one pointer;
- Does not store capacity, which would be meaningless in many contexts;
- No black magic: we define str as a dynamically sized type (the data itself), so &str is just a reference to the data.
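The points above can be sketched as follows (the `shout` function is a made-up example): &str is two machine words, and one function taking &str serves both static and heap-allocated strings.

```rust
use std::mem::size_of;

// Accepts any string data, static or heap-allocated, without caring about capacity.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    // &str is a data pointer plus a length: two machine words, no capacity.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    let literal: &'static str = "hello";
    let owned: String = String::from("world");

    assert_eq!(shout(literal), "HELLO");
    // &String coerces to &str through Deref.
    assert_eq!(shout(&owned), "WORLD");
    // A sub-slice is also a &str: it borrows part of the same data.
    assert_eq!(shout(&owned[1..3]), "OR");
}
```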
Now you might wonder: why not introduce str instead of &str? Remember the convention: having str would imply that you own the data, which you don't. Hence &str.