What does it mean to be generic over a closure type in Rust?

Question

What does it mean when a type parameter is bounded by an Fn type?

fn call<F: Fn() -> u64>(f: F) -> u64 {
    f()
}

The Rust Book says that functions with type parameters are monomorphized: https://doc.rust-lang.org/stable/book/ch10-01-syntax.html?highlight=generic%20function#performance-of-code-using-generics.

If that's the case, then I'd expect a new version of call to be generated for each new closure type it is instantiated with.

According to the Rust Reference, each closure type is distinct, even if the function body is the same: https://doc.rust-lang.org/reference/types/closure.html.

So when I look at the compiled output for this program, I'd expect the compilation artifact to be larger when I call call with 1799999999999999999 distinct instantiations of F as compared to just four instantiations, but that's not what happens.

(1)

// `call` is defined above

fn make_closure(n: u64) -> impl Fn() -> u64 {
   move || n 
}

fn main() {
    let results = (0..17999999999999999999).map(make_closure).map(call);
    for r in results {
        println!("{}", r)
    }
}

So what's the right mental model for what fn call<F: Fn() -> u64> means? Should I think of the lack of code bloat as merely an optimization?

It gets harder for me to sustain this mental model when the closure is built from user input, though:

(2)

fn main() {
    let mut buf = String::new();
    std::io::stdin().read_line(&mut buf).unwrap();
    let n = buf.trim().parse::<u64>().unwrap();
    println!("{}", call(make_closure(n)));
}

So what's the right way to think about what the signature of call means?

Update

Adding more information so we can reference it from comments:

(3)

The Rust Reference says:

A closure expression produces a closure value with a unique, anonymous type that cannot be written out. A closure type is approximately equivalent to a struct which contains the captured variables.

(4) The first line below is accepted by rustc, the second is not:

    vec![make_closure(1), make_closure(2)]; // OK
    vec![(|| 1), (|| 1)];                   // Error: mismatched types

rodrigo · Answer 1 · 2021-12-28T02:53:55.087

8

I think you are mixing concepts here. Indeed, every function in Rust has its own type, and so does every closure, even if they have the same body. So in this example:

fn id0(x: u64) -> u64 { x }
fn id1(x: u64) -> u64 { x }
fn main() {
    let id2 = || 1;
    let id3 = || 1;
}

every id0, id1, id2 and id3 have distinct types, all of them implement the Fn() -> u64 trait. If you write call(id0) or call(id1) you will get monomorphized versions of that function that call id0 or id1 statically and directly, without doing any indirect function pointer call.

And since they have an empty capture set, they can also be coerced into the non-generic function type: fn() -> u64. But if you force that, then you lose that monomorphization: call(id0 as fn() -> u64) and call(id1 as fn() -> u64) both call to the same instantation of the call() function, that calls the inner closure indirectly using the given pointer.

Then, in your example:

fn make_closure(n: u64) -> impl Fn() -> u64 {
   move || n 
}

there is only one closure. You can call this function with any number, and that number is captured by the closure, but that does not alter the type of the returned value. That type is determined at compilation time, always the same.

A different thing would be if you use a constant generic argument:

fn make_closure_gen<const N: u64>() -> impl Fn() -> u64 {
   || N
}

Now for every N you have an instantiation of that generic function, and each one with a different return type. But you cannot create a runtime loop that instantiates 1000000 of these functions, because N must be constant. (You can try a compile-time loop, however).

Anyway, even if you manage to create millions of instantiations of the same function, if the compiler detects that the inner code of several functions is the same it can merge them into a single one, (sometimes call code deduplication), so that the final executable does not blow up. The details vary from version to version of the compiler, and it can even depend on the linker used. It can even apply to non-generic functions!

edited Dec 28 '21 at 02:53

answered Dec 27 '21 at 14:49

rodrigo

94,151
12
143
190

"`id0`, `id1` .. have different types" - do you have a reference for this? I would think that while `id2` and `id3` have distinct types (as they're closures), `id0` and `id1` both have the type `fn(u64) -> u64`. – apilat Dec 27 '21 at 16:56
2

@apilat: Yes, I would think that aswell, but think of it: if they have different types, and each implement the trait `Fn() -> u64` by calling itself, they are actually zero sized (ZST) and their call can be monomorphized. Win-win. About the reference, take a look at [the reference](https://doc.rust-lang.org/reference/types/function-item.html) or this [demonstration code](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1320cbd7be1c8a45b21b76ea736c7b71). – rodrigo Dec 27 '21 at 17:46
You wrote "Indeed, every function in Rust has its own type, and so does every instance of a closure, even if they have the same body". But @Aplet123 wrote " closure types can ... be inhabited by multiple values". Are these two views compatible? – Max Heiber Dec 27 '21 at 21:46
Thanks for your answer, could you summarize? Is the answer that "semantically, this is monomorphization, but as an optimization the compiler may do something else"? – Max Heiber Dec 27 '21 at 21:49
1

@MaxHeiber: My choice of words in "every instance of a closure" is wrong (changed now). What I meant is "every closure as it is found in code or instantiated from a generic". If you use the same closure several times with different captured values, it is still the same _type_ even if they are different _instances_, so Aplet123 is correct. – rodrigo Dec 28 '21 at 02:56
1

@MaxHeiber: Well, that is a good summary of my answer. But if you ask why your 17999999999999999999 loop does not blow up the size of the program, that is simply because your `make_closure()` always returns the same type, with the `n` as a captured value, that is a runtime nameless field inside the closure, but there is no huge number of monomorphizations here, just one. – rodrigo Dec 28 '21 at 03:02
Your answer is helpful, but I think parts may be misleading: "every function in Rust has its own type" in your example, `id0` and `id1` have the same type: `fn(u64) -> u64` You gave an example involving pointer types, but I was asking about closure types. How are pointer types relevant? "all of them implement the `Fn() -> u64 trait`": I couldn't find such an impl in std. "And since they have an empty capture set, ..." - but I was asking specifically about closures that *do* move things–irrelevant? "in your example ... there is only one closure" - did you mean "closure type"? – Max Heiber Dec 28 '21 at 12:48
@MaxHeiber "I couldn't find such an impl in std" I believe that the `Fn` trait is a bit of a "magic" trait that's specially treated by the compiler. From [the docs](https://doc.rust-lang.org/std/ops/trait.Fn.html): "`Fn` is implemented automatically by closures which only take immutable references to captured variables or don’t capture anything at all, as well as (safe) function pointers (with some caveats, see their documentation for more details)." – Aplet123 Dec 28 '21 at 13:09
I found `trait Fn` in stdlib but couldn't find the mechanism that allows an `fn` to be used as an `Fn`. Could be implemented via some kind of coercion or by `impl Fn ... for fn` or something else. Interesting question, imo, not related to my original question, so posted here: https://stackoverflow.com/questions/70507880/what-enables-a-function-pointer-type-to-be-used-where-an-fn-type-is-expeced – Max Heiber Dec 28 '21 at 13:39
@MaxHeiber: `id0` and `id1` do not have the same type, as my [demonstration (updated)](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=65e67e2c1667824bd198391c6d9816b5): they have type `fn(u64)->u64 {id0}` and `fn(u64)->u64 {id1}` (made up names for diagnostic). But both implement `Fn(u64)->u64` and both can be coerced to `fn(u64)->u64`. And the same is true for closures, except that they can be coerced to `fn()` if and only if their capture set is empty. – rodrigo Dec 28 '21 at 14:06

score 6 · Answer 2 · answered Dec 27 '21 at 14:49

6

Just like normal types, closure types can also be inhabited by multiple values. move || n has the same type, regardless of the value of n (the closure type is determined by the body of the closure, not what the captured values are). So, your 17999999999999999999 closures are all of the same type despite having a different captured value of n, thus only needing one instance of call. If you actually have 2 different closures like in the following code:

fn make_closure(n: u64) -> impl Fn() -> u64 {
   move || n 
}

fn make_closure2(n: u64) -> impl Fn() -> u64 {
   move || n + 1
}

fn call<F: Fn() -> u64>(f: F) -> u64 {
    f()
}

pub fn main() {
    call(make_closure(5));
    call(make_closure2(5));
}

You can see two different versions of example::call in the decompilation.

answered Dec 27 '21 at 14:49

Aplet123

33,825
1
29
55

This is helpful, but I'm still a bit confused about identity. When you wrote: "`move || n` has the same type, regadless of the value of `n`" were you referring to the span of source code? Is the right way of thinking about closure types that they are tied to spans of source code? That would explain why `let _ = vec![(|| 1), (|| 1)];` is rejected but `let _ = vec![make_closure(1), make_closure(2)]` is accepted. – Max Heiber Dec 27 '21 at 22:33
1

The span of the source code is a good way of thinking about it. – Aplet123 Dec 28 '21 at 00:05
"Just like normal types, closure types can also be inhabited by multiple values": I now understand what you were saying, but as written this is probably not correct: not all types have multiple inhabitant. For example,`!` has zero inhabitants and `()` has one inhabitant. Would you be OK with me editing the wording before marking as accepted? – Max Heiber Dec 28 '21 at 12:35
1

@MaxHeiber I'd argue that "can be inhabited by multiple values" doesn't mean "is inhabited by multiple values", but feel free to edit it if it makes it less confusing for you. – Aplet123 Dec 28 '21 at 13:07
``` fn call u64 + 'static>(f: F) -> u64 { println!("{:?}", TypeId::of::()); f() } ``` Any idea why the type id for each closure generated by `make_closure` is distinct? https://doc.rust-lang.org/std/any/struct.TypeId.html: "A TypeId represents a globally unique identifier for a type." – Max Heiber Dec 30 '21 at 10:14

What does it mean to be generic over a closure type in Rust?

Update

2 Answers2