6

While reading about Rust, I ran into an example function that takes a number and returns a function that adds that number to another number.

fn higher_order_fn_return<'a>(step_value: &'a i32) -> Box<Fn(i32) -> i32 + 'a> {
    Box::new(move |x: i32| x + step_value)
}

There are so many Rust-specific mechanisms here that I can't make sense of it. I'm sure some of it has to do with lifetime management, but the reasons why it must be written this way elude me. A few questions:

  • Why is step_value passed in as a reference?
  • Why is the function being returned boxed?
  • How to interpret the unconventional way to write a function type (as Fn(i32) -> i32 + 'a) ?
  • Why is 'a written as a generic (<'a>) but "added" in the return type (+ 'a) ?
  • What is the meaning of move and what is being moved here?
Shepmaster

2 Answers

8

There's a prohibition against asking more than one question, but since these all fall under "what does this piece of code mean", I won't complain. Also, it does happen to compress quite a bit of weirdness into one relatively small, not terribly unusual snippet.

Why is step_value passed in as a reference?

No idea. It just is. It could be passed by-value without significantly altering the semantics of the code. But it is being passed by reference, and that's the cause of all the other lifetime-related issues.
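As a rough sketch of what the by-value version could look like (the name add_step_by_value is made up for illustration, and the dyn keyword is how current Rust spells trait objects, which the snippet in the question predates):

```rust
// Hypothetical by-value variant of the function from the question.
// No lifetime parameter is needed: the closure owns its own copy of
// `step_value`, so it borrows nothing from the caller.
fn add_step_by_value(step_value: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x: i32| x + step_value)
}
```

Calling add_step_by_value(5) gives back a boxed closure, and calling that with 10 yields 15.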

Why is the function being returned boxed?

It's not returning a function. Functions are defined by fn. It's returning a closure. The problem there is that every closure is actually an instance of an anonymous type (sometimes called a "Voldemort Type") for performance reasons. Anonymous types are a problem because you can't name them, but you have to name your return type.

The way around this is to return a trait object instead. In this case, it's returning a Fn. There are also FnMut and FnOnce. It's returning it boxed because bare trait objects can't be passed around by-value, so trait objects always have to be behind some kind of pointer (be that Box, &, Rc, etc.).

They can't be passed around by-value because the compiler can't work out how big they're going to be, which makes moving them around almost impossible. After that, the train of logic diverts straight into "how the compiler is implemented" territory, which is somewhat out of scope here.
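To make the size problem a bit more concrete, here is a small sketch (closure_sizes and boxed_pair are names invented for this example). Two closures with the same call signature capture different amounts of data, so their anonymous types differ in size, yet both fit behind the same boxed trait object type:

```rust
use std::mem::size_of_val;

// Returns the sizes of two closures that share a signature but capture
// different data. The exact numbers are up to the compiler; the point is
// only that they differ.
fn closure_sizes() -> (usize, usize) {
    let a: i32 = 1;
    let b: [i32; 4] = [1, 2, 3, 4];
    let small = move |x: i32| x + a; // holds a single i32
    let large = move |x: i32| x + b[0] + b[3]; // holds the whole array
    (size_of_val(&small), size_of_val(&large))
}

// Both closures can be stored as the same type once they are boxed,
// because a Box is pointer-sized no matter what it points to.
fn boxed_pair() -> Vec<Box<dyn Fn(i32) -> i32>> {
    let a = 1;
    let b = [1, 2, 3, 4];
    let mut fns: Vec<Box<dyn Fn(i32) -> i32>> = Vec::new();
    fns.push(Box::new(move |x: i32| x + a));
    fns.push(Box::new(move |x: i32| x + b[0] + b[3]));
    fns
}
```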

How to interpret the unconventional way to write a function type (as Fn(i32) -> i32 + 'a) ?

There's nothing unconventional about it. Not for Rust, anyway, and since this is Rust, how other languages do it isn't relevant.

Let's ignore the + 'a for a moment, since that's actually something else. The Fn(i32) -> i32 is the important part. Every "callable" thing in Rust implements one or more of the Fn, FnMut, and FnOnce traits, which is how Rust expresses the idea of being able to call something. The types inside the parentheses are the argument types, and the type after -> is the return type, just like in a function signature.

You can learn more about these traits in the question "When does a closure implement Fn, FnMut and FnOnce?".
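For a quick feel of the three traits, here is a minimal sketch. All the helper names (call_fn, call_fn_mut, call_fn_once, double, demo_traits) are invented for this example:

```rust
// Fn: may be called any number of times through a shared reference.
fn call_fn(f: impl Fn(i32) -> i32) -> i32 {
    f(1) + f(2)
}

// FnMut: may mutate its captured state, so it needs `mut` to be called.
fn call_fn_mut(mut f: impl FnMut(i32)) {
    f(1);
    f(2);
}

// FnOnce: may consume its captured state, so it can be called only once.
fn call_fn_once(f: impl FnOnce() -> String) -> String {
    f()
}

// An ordinary function also implements the Fn traits.
fn double(x: i32) -> i32 {
    x * 2
}

fn demo_traits() -> (i32, i32, i32, String) {
    let offset = 10;
    let from_closure = call_fn(|x| x + offset); // only reads `offset`
    let from_fn = call_fn(double);

    let mut total = 0;
    call_fn_mut(|x| total += x); // mutates `total`

    let greeting = String::from("hi");
    let moved_out = call_fn_once(move || greeting); // gives `greeting` away

    (from_closure, from_fn, total, moved_out)
}
```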

Why is 'a written as a generic (<'a>) but "added" in the return type (+ 'a) ?

Firstly, because lifetimes are part of the type system. Hence, they go in the generic parameter list (the thing inside <...>).

Secondly, because the compiler has to understand how long the trait object inside the Box is going to be valid for. If you have Box<SomeTrait>, how long is the compiler allowed to let that value exist? Normally, that information would be part of the type, but if you're using a trait, then the compiler doesn't know which type is being used. Remember, you can make a Box<SomeTrait> out of any Box<T> where T implements SomeTrait.

In this case, the closure is going to hold on to the step_value borrow, meaning it must not outlive the lifetime of that borrow (which is 'a). But if the type was just Box<Fn(i32) -> i32>, the compiler wouldn't have that information. So, there is syntax for specifying that whatever the type hiding behind a trait object is, it cannot outlive a given lifetime.

That's what the + 'a is saying: "this is a boxed value that implements the Fn(i32) -> i32 trait, and it cannot outlive the lifetime 'a".
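Here is a small sketch of how that plays out for a caller. The function is the one from the question, respelled with the dyn keyword that current Rust expects for trait objects; demo_lifetime is a made-up name:

```rust
// The function from the question, written with `dyn`.
fn higher_order_fn_return<'a>(step_value: &'a i32) -> Box<dyn Fn(i32) -> i32 + 'a> {
    Box::new(move |x: i32| x + step_value)
}

fn demo_lifetime() -> i32 {
    let step = 3;
    // `add_step` holds a borrow of `step`, so its type carries that
    // lifetime and it is only usable while `step` is alive.
    let add_step = higher_order_fn_return(&step);
    let result = add_step(4);
    // Trying to return `add_step` itself from this function should be
    // rejected, because `step` is dropped when the function ends.
    result
}
```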

What is the meaning of move and what is being moved here?

Normally, the compiler tries to guess what it has to do to make a closure work, but it can't always get it right. Where possible, it tries to borrow things captured by the closure. So when you use step_value inside the closure, the compiler would normally just borrow it.

That wouldn't be an issue, except that you're returning the closure out of the function. This automatic borrow would only last for the lifetime of the function, which isn't long enough. To fix this, instead of borrowing step_value, you can move it into the closure.
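The same effect is easier to see with an owned local variable. This is only an illustrative sketch, and make_summer is a made-up name:

```rust
// `steps` is a local variable, so a closure that merely borrowed it could
// not be returned: the Vec is dropped when this function ends.
fn make_summer() -> Box<dyn Fn(i32) -> i32> {
    let steps = vec![1, 2, 3];
    // `move` transfers ownership of `steps` into the closure, so the data
    // lives exactly as long as the closure does. Without `move`, the
    // compiler rejects this function.
    Box::new(move |x: i32| x + steps.iter().sum::<i32>())
}
```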

Bonus thing you might be wondering.

If you don't write the + 'a in Box<Trait + 'a>, what would normally happen?

Actually, the compiler has a heuristic here. By default, every trait object has an attached lifetime. It's inherited from the pointer that wraps it. So, &'a Trait is really &'a (Trait + 'a). Box doesn't have a lifetime parameter of its own, so it gets 'static (i.e. Box<Trait> is Box<Trait + 'static>), which means that by default, boxed trait objects cannot contain any non-'static borrows.
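A short sketch of that default, using current dyn syntax and invented names (takes_static, makes_default). A box whose type omits the lifetime bound is accepted where an explicit + 'static is required, and a closure holding a 'static borrow satisfies it:

```rust
// The bound is spelled out here...
fn takes_static(f: Box<dyn Fn(i32) -> i32 + 'static>) -> i32 {
    f(1)
}

// ...and left to the default here. The two spellings are the same type.
fn makes_default() -> Box<dyn Fn(i32) -> i32> {
    static STEP: i32 = 2;
    // A borrow of a static lives for the whole program, so capturing it
    // is fine even under the implied 'static bound.
    let step_ref: &'static i32 = &STEP;
    Box::new(move |x: i32| x + step_ref)
}
```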

DK.
2

Why is step_value passed in as a reference?

There is no good reason for this. Passing it in by value makes everything much easier. However, the example in question might have done it because you can't do that for every type, just those that are Copy.

Why is the function being returned boxed?

The type of a lambda cannot be named, and thus it can't be returned from a function. So you must return a trait object (Fn is a trait) and to do that you need a box. (With impl Trait you won't need the box anymore.)
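For comparison, a sketch of the impl Trait version (add_step is a made-up name). The return type still can't be named, but the compiler knows the concrete type, so there is no heap allocation:

```rust
// Returns the closure itself rather than a boxed trait object.
// The `+ 'a` is still needed because the closure holds a borrow.
fn add_step<'a>(step_value: &'a i32) -> impl Fn(i32) -> i32 + 'a {
    move |x: i32| x + step_value
}
```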

How to interpret the unconventional way to write a function type (as Fn(i32) -> i32 + 'a) ?

Fn has a bit of syntactic sugar where the syntax Fn(arg1, arg2) -> ret is shorthand for (I think) Fn<(arg1, arg2), Output=ret>. The + above has a lower precedence than the arrow and isn't part of the Fn constraint; instead it is a constraint combination, meaning that the type in the Box must be both a Fn(i32) -> i32 and have lifetime 'a.

Why is 'a written as a generic (<'a>) but "added" in the return type (+ 'a) ?

Lifetime parameters must be declared in the generic parameter section of the function (or type), thus the <'a>. Then it occurs in the reference type of the argument (&'a i32), and finally as an additional constraint in the Box.

What is the meaning of move and what is being moved here?

It makes the closure a move closure, which means the things it captures are moved into the closure instead of being captured by reference. In this example, however, note that what is being moved is step_value, which is itself a reference!
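A minimal sketch of that last point (demo_move_reference is a made-up name). Since shared references are Copy, moving one into the closure just copies the pointer, and both the original reference and the value it points to stay usable:

```rust
fn demo_move_reference() -> (i32, i32) {
    let step = 3;
    let step_ref: &i32 = &step;
    // `move` moves `step_ref` (the reference), not `step` (the integer).
    let add_step = move |x: i32| x + step_ref;
    // Both `step_ref` and `step` can still be used afterwards.
    (add_step(4), *step_ref + step)
}
```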

Sebastian Redl