
Let's take this example:

fn hello() -> String {
    String::from("Hello")
}

The equivalent in C would be to call malloc, write the chars into that memory, and return the pointer. Why does the Rust code work? Why don't I have to write it like this:

fn hello() -> Box<String> {
    Box::from(String::from("Hello"))
}

To create a value inside a function and let that function return it, the value must always be created on the heap and not the stack, that's for sure. Why does Rust use a syntax that suggests, at first glance, that you can return stack variables?

Does Rust automatically wrap all data returned from a function in a Box? Or is there some other Rust magic?

My guess is that it's syntactic sugar so that you are not forced to wrap return values in a Box.

My question is not limited to strings; it's about returning structures from functions in general. I know that the underlying vector stores its data on the heap. This question is only about the metadata of the struct (i.e. pointers to the data on the heap) and how they are returned (in compiler output) when the function returns.

Shepmaster
phip1611
  • just reading the doc, https://doc.rust-lang.org/std/string/struct.String.html#representation – Stargateur Feb 09 '20 at 12:39
  • My question was not just about strings. It was about returning structures from functions in general. I edited the question. – phip1611 Feb 09 '20 at 12:43
  • just read the doc, https://doc.rust-lang.org/stable/book/ch04-01-what-is-ownership.html?highlight=move#return-values-and-scope – Stargateur Feb 09 '20 at 12:45
  • My problem was understanding how a Rust function, once it's compiled to assembly, can return a value or a struct. As far as I know, the x86 calling convention says return values are stored in the EAX register. Therefore return values from functions can only be 64 bits wide (on a 64-bit processor). So I was confused about what is actually returned when a struct is returned, because it has to be on the heap and therefore only a pointer to the heap can be returned... That's why I thought of Box<>. Hope this clarifies a bit better why I was confused. :) – phip1611 Feb 09 '20 at 12:50
  • You can return a structure in C [too](https://stackoverflow.com/questions/9653072/return-a-struct-from-a-function-in-c). How to do it is the compiler's job. There is not really an "x86 calling convention": x86 is an assembly language, you can do various things in it, and if you want, you don't have to use the EAX register as the classic return value. You can reserve a place on the stack, use more than one register, use a bigger register, or do whatever is allowed on the system. – Stargateur Feb 09 '20 at 12:53

3 Answers

5

No.

Rust doesn't hide anything from you at all.

If you write that you return a String, you return a String, nothing more and nothing less. The String value itself is a plain stack value (a pointer, a capacity, and a length) that is simply copied to wherever it needs to be.

You're right that allocated data is needed, though.

This is an implementation detail. To understand where allocation happens, we must look at what a String actually is.

It is probably unsurprising that a String allocates internally. For exactly that reason it is built on Vec, which abstracts the allocation of raw data.

This is the literal definition of String:

pub struct String {
    vec: Vec<u8>,
}

As you can see, there's no Box involved when you create a new String.

A Vec is created, and the Vec is responsible for allocating and deallocating the data.

String, on the other hand, is responsible for abstracting that raw data into what we understand as a typical string.
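
To make the "no Box" point concrete, here is a minimal sketch that compares the size of the String value itself with the size of a Box<String>. The helper names `string_handle_size` and `boxed_string_size` are made up for this illustration, and the three-word size reflects the current standard library layout (pointer, capacity, length), as described in the String documentation.

```rust
use std::mem::size_of;

/// Hypothetical helper: size in bytes of the `String` value itself,
/// not of the text it owns on the heap.
fn string_handle_size() -> usize {
    size_of::<String>()
}

/// Hypothetical helper: size in bytes of a `Box<String>`,
/// which is a single thin pointer to a heap-allocated `String`.
fn boxed_string_size() -> usize {
    size_of::<Box<String>>()
}

fn main() {
    // With the current implementation a String is three machine words:
    // a pointer to the buffer, a capacity, and a length.
    assert_eq!(string_handle_size(), 3 * size_of::<usize>());

    // String is a thin wrapper around Vec<u8>, so both have the same size.
    assert_eq!(size_of::<String>(), size_of::<Vec<u8>>());

    // A Box<String> is just one pointer; the three words would then live on the heap.
    assert_eq!(boxed_string_size(), size_of::<usize>());

    println!(
        "String: {} bytes, Box<String>: {} bytes",
        string_handle_size(),
        boxed_string_size()
    );
}
```

Returning a plain String therefore hands those three words to the caller directly, while the Box<String> version would add a second heap allocation just to hold them.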

  • https://doc.rust-lang.org/std/string/struct.String.html#representation is clear about how strings work. The string type in Rust has a lot of guarantees; it's not an implementation detail in this case. – Stargateur Feb 09 '20 at 12:40
4

String manages its own char buffer. It's more like C++'s std::string than char*. And you definitely can return local variables, which get copied or moved as appropriate. Why wouldn't you be able to?
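
As a small sketch of returning a local variable by value, consider a struct with no heap data at all. The type `Triple` and the function `make_triple` are hypothetical names chosen for this example; the struct is deliberately the same size as a String on a 64-bit target.

```rust
/// Hypothetical plain-data struct: three 8-byte fields, no heap allocation.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Triple {
    a: u64,
    b: u64,
    c: u64,
}

/// Builds the value in a local variable and returns it by value.
fn make_triple() -> Triple {
    let local = Triple { a: 1, b: 2, c: 3 };
    // No reference to `local` escapes; the value itself is moved out to the caller.
    local
}

fn main() {
    let t = make_triple();
    assert_eq!(t, Triple { a: 1, b: 2, c: 3 });

    // Three u64 fields occupy 24 bytes, more than a single 64-bit register.
    assert_eq!(std::mem::size_of::<Triple>(), 24);

    println!("{:?}", t);
}
```

What would be invalid is returning a reference to `local`, and the borrow checker rejects that at compile time.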

yuri kilochek
  • But how can you return local variables? The address of a local variable/stack variable is invalid when the function returns, because the stack pointer is reset to its previous value (from the function that called it). Return should only work for small values (register width)? – phip1611 Feb 09 '20 at 10:14
  • @phip1611 When you return a `String`, you aren't returning a reference to a stack value; it's a pointer to the heap (and a length and capacity in this case). – SCappella Feb 09 '20 at 10:17
  • Ah okay, got it. Therefore only a 64 bit value/address (word-width / register-width on x86_64 for example) is returned in this case? – phip1611 Feb 09 '20 at 10:18
  • @phip1611 If you want to be very precise about it, the current implementation needs at the very least: a pointer to the data; the capacity of the data that is pointed to; and the amount of data already in use. It is that way because that's what Rust's `Vec` consists of in the current implementation. These values exist on the stack unless you wrap them in a `Box` yourself. –  Feb 09 '20 at 10:29
4

Return values are usually passed back to the caller through a register or two. On x86-64 (which has 64-bit registers), rax is used, and specifically for the System V AMD64 ABI, rdx can complement rax for return values of up to 128 bits (16 bytes).

However, a String is larger than that: on x86-64, it takes 24 bytes. When the return type of a function is too large to pass through registers, the function will receive an additional parameter which will contain the address where the return value should be stored. Often, this pointer will refer to a local variable in the caller's stack frame.
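
The hidden return pointer cannot be observed from safe Rust, but the effect on the heap buffer can. The following is a minimal sketch, assuming a hypothetical helper `hello_with_addr` that returns the String together with the buffer address it saw before returning. It only shows that the heap data stays where it is while the pointer, capacity, and length are handed to the caller.

```rust
/// Hypothetical helper: creates a String and also reports the address of
/// its heap buffer as observed inside the function.
fn hello_with_addr() -> (String, usize) {
    let s = String::from("Hello");
    // Address of the heap-allocated bytes, not of the local variable `s`.
    let addr = s.as_ptr() as usize;
    (s, addr)
}

fn main() {
    let (s, addr_inside) = hello_with_addr();

    // The caller's String points at the very same heap buffer:
    // returning moved only the pointer/capacity/length triple.
    assert_eq!(s.as_ptr() as usize, addr_inside);
    assert_eq!(s, "Hello");

    println!("buffer stayed at {:#x}", addr_inside);
}
```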

Francis Gagné
  • Thank you very much! That was what I actually wanted to know. I forgot about the fact that functions can return values larger than one register. Now it's clear to me! I mistakenly thought that the metadata of a String (metadata about the underlying vector) also has to lie on the heap somehow when it's returned from a function. – phip1611 Feb 11 '20 at 09:09