
I am a newbie in Rust, and I am having trouble understanding Itertools::GroupBy.

Here is an example I came up with to practice GroupBy. Given an array of integers:

  • Group it based on the value of the numbers.
  • Keep only groups with more than 3 elements.
  • Return a vector with the keys of the surviving groups.

As an example, given this array

[1,1,1,1,-2,-2,-3,-3,-3,-3,4,4]

I want to return the vector [1, -3], because 1 and -3 each appear more than three times in the array.

Here is my attempt

let groups = [1, 1, 1, 1, -2, -2, -3, -3, -3, -3, 4, 4]
  .into_iter()
  .group_by(|element| *element)
  .into_iter()
  .filter(|(_, value)| value.count() > 3)
  .map(|(key, _)| key)
  .collect::<Vec<i32>>();

You can try this out in the Rust playground.

The line with the filter results in the error: "cannot move out of *value which is behind a shared reference", where *value is moved due to the method call count().

After experimenting, I noticed that I can fix the code by adding a map:

let groups = [1, 1, 1, 1, -2, -2, -3, -3, -3, -3, 4, 4]
    .into_iter()
    .group_by(|element| *element)
    .into_iter()
    .map(|(key, value)| (key, value.collect::<Vec<i32>>()))
    .filter(|(_, value)| value.len() > 3)
    .map(|(key, _)| key)
    .collect::<Vec<i32>>();

However, I would like to understand why my original attempt is not working. I understand that it has to do with borrowing, but my knowledge is still too basic.

Lukas

1 Answer


.filter() doesn't give you ownership of the item, and it doesn't give you mutable access either; its closure only receives a shared (immutable) reference to each item. You are expected to decide from that read-only view whether the item should be kept or dropped. The items themselves are passed through unchanged: .filter() only selects them, it never modifies them.

To find out how many values are in a group, though, you have to iterate through it, because the value variable is an iterator. Iterating advances the iterator and therefore modifies it, and count() goes further and takes the iterator by value (it consumes self). Neither is possible through the shared reference that .filter() hands you, which is why the compiler reports that it cannot move out of *value.
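
As a rough illustration of what the closure receives, here is a sketch that uses only the standard library. The Run type alias and the helpers example_runs and keys_via_filter are made up for this example, and the hand-built (key, iterator) pairs only stand in for what group_by yields. Counting a clone works here because std::vec::IntoIter implements Clone; treat that as an assumption that may not carry over to the itertools group type.

```rust
// Stand-in for the (key, group) pairs that `group_by` yields. Each group is
// an owned standard-library iterator, so no external crate is needed.
type Run = (i32, std::vec::IntoIter<i32>);

// Hypothetical helper that builds the runs of the example array by hand.
fn example_runs() -> Vec<Run> {
    vec![
        (1, vec![1, 1, 1, 1].into_iter()),
        (-2, vec![-2, -2].into_iter()),
        (-3, vec![-3, -3, -3, -3].into_iter()),
        (4, vec![4, 4].into_iter()),
    ]
}

// `filter` passes `&Run` to the closure, so `group` is a `&IntoIter<i32>`.
// `Iterator::count` takes `self` by value, so `group.count()` would have to
// move the iterator out of the shared reference and is rejected.
// Counting a clone leaves the original item untouched.
fn keys_via_filter(runs: Vec<Run>) -> Vec<i32> {
    runs.into_iter()
        .filter(|(_, group)| group.clone().count() > 3)
        .map(|(key, _)| key)
        .collect()
}

fn main() {
    println!("{:?}", keys_via_filter(example_runs()));
}
```

Cloning every group just to count it is wasteful, so this is meant to show the borrowing rule rather than to recommend the pattern.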

There is a solution, though: the .filter_map() method combines .filter() and .map() in a single operation and passes each element to the closure by value, so you own it and can consume it:

use itertools::Itertools;

fn main() {
    let groups = [1, 1, 1, 1, -2, -2, -3, -3, -3, -3, 4, 4]
        .into_iter()
        .group_by(|element| *element)
        .into_iter()
        .filter_map(|(key, value)| (value.count() > 3).then_some(key))
        .collect::<Vec<i32>>();

    println!("{:?}", groups);
}

Output:

[1, -3]
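
For comparison, plain .filter() is enough whenever the check only needs to read the item. The sketch below is one possible standard-library-only variant of the same task, not part of itertools: the helper names run_lengths and keys_of_long_runs are hypothetical, and the idea is to store each run's length as a plain usize so that nothing has to be consumed inside the predicate.

```rust
// Hypothetical helper: collapse consecutive equal values into (key, length)
// pairs, so the length is a plain `usize` rather than an iterator.
fn run_lengths(values: &[i32]) -> Vec<(i32, usize)> {
    let mut runs: Vec<(i32, usize)> = Vec::new();
    for &value in values {
        // Extend the current run if the value repeats.
        if let Some((key, len)) = runs.last_mut() {
            if *key == value {
                *len += 1;
                continue;
            }
        }
        // Otherwise start a new run.
        runs.push((value, 1));
    }
    runs
}

// Hypothetical helper: keys of all runs longer than `min_len`.
fn keys_of_long_runs(values: &[i32], min_len: usize) -> Vec<i32> {
    run_lengths(values)
        .into_iter()
        // Reading `len` through the shared reference is all the predicate
        // needs, so plain `filter` compiles.
        .filter(|(_, len)| *len > min_len)
        .map(|(key, _)| key)
        .collect()
}

fn main() {
    let input = [1, 1, 1, 1, -2, -2, -3, -3, -3, -3, 4, 4];
    println!("{:?}", keys_of_long_runs(&input, 3));
}
```

Like group_by, this only merges consecutive equal values; equal values separated by other elements end up in separate runs.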
Finomnis
  • Just to add some justification for these choices: `filter`'s predicate function *can't* take ownership of its argument, as it has to leave it intact (unmoved) so that it can continue to exist in the post-`filter` iterator. `filter_map`'s function *has* to take ownership of its argument, otherwise the owned `T` in the returned `Option` would need to be produced from a reference that exists only for the length of the function call, which is not possible in general without cloning. – BallpointBen Jul 30 '22 at 20:04
  • @BallpointBen yah, but there is nothing intrinsic that would prevent `filter` from taking a mutable reference. But I still think it's a good design choice that it doesn't. – Finomnis Jul 30 '22 at 22:11