
Assume that I have a list (GS Collections) of Customers with two properties, id and age. I want to apply a distinct filter based on the id property.

 FastList<Customer> customers = ...;
 customers.distinct(); // There's no overload here that accepts a custom comparator!
Craig P. Motlin
Nestor Hernandez Loli

1 Answer


The difference between distinct() and toSet() is that distinct() will preserve order from the original list, but both rely on the default object equality using equals() and hashCode().
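To make the reliance on equals() and hashCode() concrete, here is a minimal sketch that uses only the JDK rather than GS Collections. LinkedHashSet stands in for an order-preserving, equality-based distinct(); the Customer class and the distinctByEquals helper are hypothetical names invented for this illustration. Because this Customer does not override equals(), two customers with the same id are both kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class DefaultEqualityDemo {
    // Hypothetical Customer that does NOT override equals()/hashCode(),
    // so equality falls back to object identity.
    static final class Customer {
        final int id;
        final int age;

        Customer(int id, int age) {
            this.id = id;
            this.age = age;
        }
    }

    // JDK stand-in for an order-preserving distinct based on equals()/hashCode().
    static <T> List<T> distinctByEquals(List<T> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    public static void main(String[] args) {
        Customer a = new Customer(1, 30);
        Customer b = new Customer(1, 45); // same id as a, but a different object
        List<Customer> result = distinctByEquals(Arrays.asList(a, b, a));
        // Only the repeated reference to a is dropped; a and b both survive.
        System.out.println(result.size()); // prints 2
    }
}
```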

The method toSortedSet() takes a Comparator, and toSortedSetBy() allows you to just pass in a Function. Both should work for you. Here's how toSortedSetBy() looks using Java 8.

FastList<Customer> customers = ...;
MutableSortedSet<Customer> sortedSet = customers.toSortedSetBy(Customer::getId);

There are two drawbacks of using a SortedSetIterable. The first is that the algorithm is O(n log n) instead of O(n). The second is that the SortedSetIterable will behave strangely if the elements' equals() method is inconsistent with the comparator (for example, if Customer.equals() doesn't consider two Customers equal even when they have the same id).
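The second drawback can be seen with a plain java.util.TreeSet, which I am assuming behaves like the GS Collections sorted set in this respect. This is a sketch only: the Customer class and the newSetById helper are hypothetical. The set treats two customers with the same id as the same element, even though equals() says they differ.

```java
import java.util.Comparator;
import java.util.TreeSet;

final class SortedSetCaveatDemo {
    // Hypothetical Customer; equals() is not overridden, so it is identity-based.
    static final class Customer {
        final int id;
        final int age;

        Customer(int id, int age) {
            this.id = id;
            this.age = age;
        }

        int getId() {
            return id;
        }
    }

    // A sorted set whose notion of "same element" is "same id".
    static TreeSet<Customer> newSetById() {
        Comparator<Customer> byId = Comparator.comparingInt(Customer::getId);
        return new TreeSet<>(byId);
    }

    public static void main(String[] args) {
        Customer first = new Customer(1, 30);
        Customer second = new Customer(1, 45);

        TreeSet<Customer> set = newSetById();
        set.add(first);
        System.out.println(set.add(second));      // false: comparator says "duplicate"
        System.out.println(set.contains(second)); // true, although second was never stored
        System.out.println(first.equals(second)); // false: equals() disagrees with the comparator
    }
}
```

The set keeps whichever element arrived first for a given id, so the result depends on insertion order as well as on the comparator.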

The second approach is to use a UnifiedSetWithHashingStrategy.

FastList<Customer> customers = FastList.newListWith(c1, c2, c3);
UnifiedSetWithHashingStrategy<Customer> set =
  new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromIntFunction(Customer::getId));
set.addAll(customers);

This runs in O(n) time, but you lose ordering. GS Collections doesn't have a form of distinct() that takes a HashingStrategy, but you could write it on your own.

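// Returns the elements of fastList in their original order, keeping only the
// first element of each group that hashingStrategy considers equal.
public static <T> FastList<T> distinct(
        FastList<T> fastList,
        HashingStrategy<T> hashingStrategy)
{
  MutableSet<T> seenSoFar =
          UnifiedSetWithHashingStrategy.newSet(hashingStrategy);
  FastList<T> targetCollection = FastList.newList();
  for (int i = 0; i < fastList.size(); i++)
  {
    T each = fastList.get(i);
    // add() returns false when an equivalent element has already been seen.
    if (seenSoFar.add(each))
    {
      targetCollection.add(each);
    }
  }
  return targetCollection;
}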

And you use it like this.

distinct(customers, HashingStrategies.fromIntFunction(Customer::getId));
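For readers who want to try the same idea without GS Collections on the classpath, here is a rough JDK-only sketch. It tracks the keys seen so far in a HashSet instead of using a HashingStrategy. The names distinctBy and Customer are hypothetical and not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class DistinctByKeyDemo {
    // Hypothetical Customer with the two properties from the question.
    static final class Customer {
        final int id;
        final int age;

        Customer(int id, int age) {
            this.id = id;
            this.age = age;
        }

        int getId() {
            return id;
        }
    }

    // Keeps the first element for each key, in encounter order, in O(n) expected time.
    static <T, K> List<T> distinctBy(List<T> items, Function<? super T, ? extends K> keyOf) {
        Set<K> seenKeys = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T item : items) {
            // add() returns false when the key has already been seen.
            if (seenKeys.add(keyOf.apply(item))) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer(2, 30), new Customer(1, 45), new Customer(2, 61));
        List<Customer> unique = distinctBy(customers, Customer::getId);
        System.out.println(unique.size()); // prints 2
    }
}
```

Unlike the HashingStrategy version, this boxes each int id into an Integer key, which is one reason a primitive-aware strategy can be preferable for large lists.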
Craig P. Motlin
  • 1
    Good answer! But I still believe that an overloaded distinct method is missing, one that accepts a Comparator or a HashingStrategy – Nestor Hernandez Loli May 15 '14 at 15:10
  • 1
    You're correct. The example that I wrote that takes a HashingStrategy as a parameter could be generalized and put on ListIterable as an overload, and I think we should consider doing so in a future version of GS Collections. A version that takes a Comparator could be confusing, so I'd want to be more careful. – Craig P. Motlin May 15 '14 at 15:56