I have a list of User objects, defined as follows:

public class User {
    private String userId; // Unique identifier
    private String name;
    private String surname;
    private String otherPersonalInfo;
    private int versionNumber;

    public User(String userId, String name, String surname, String otherPersonalInfo, int version) {
      this.userId = userId;
      this.name = name;
      this.surname = surname;
      this.otherPersonalInfo = otherPersonalInfo;
      this.versionNumber = version;
    }
}

Example list:

List<User> users = Arrays.asList(
  new User("JOHNSMITH", "John", "Smith", "Some info",     1),
  new User("JOHNSMITH", "John", "Smith", "Updated info",  2),
  new User("JOHNSMITH", "John", "Smith", "Latest info",   3),
  new User("BOBDOE",    "Bob",  "Doe",   "Personal info", 1),
  new User("BOBDOE",    "Bob",  "Doe",   "Latest info",   2)
);

I need a way to filter this list so that I get only the latest version of each user, i.e.:

{"JOHNSMITH", "John", "Smith", "Latest info", 3},
{"BOBDOE", "Bob", "Doe", "Latest info", 2}

What's the best way to achieve this using the Java 8 Stream API?

Nick Melis

6 Answers

With a little assistance from this answer:

    Collection<User> latestVersions = users.stream()
            .collect(Collectors.groupingBy(User::getUserId,
                    Collectors.collectingAndThen(
                            Collectors.maxBy(Comparator.comparing(User::getVersionNumber)),
                            Optional::get)))
            .values();

I am assuming the usual getters. Result:

[John Smith Latest info 3, Bob Doe Latest info 2]
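For anyone who wants to run this as is, here is a minimal self-contained sketch with the getters spelled out. The `User` class below is a stand-in for the one in the question, and the names `LatestVersionDemo` and `latestByUserId` as well as the `toString` format are my own assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Stand-in for the question's User class, with the getters the answer assumes.
class User {
    private final String userId; // unique identifier
    private final String name;
    private final String surname;
    private final String otherPersonalInfo;
    private final int versionNumber;

    User(String userId, String name, String surname, String otherPersonalInfo, int versionNumber) {
        this.userId = userId;
        this.name = name;
        this.surname = surname;
        this.otherPersonalInfo = otherPersonalInfo;
        this.versionNumber = versionNumber;
    }

    String getUserId() { return userId; }
    int getVersionNumber() { return versionNumber; }

    @Override
    public String toString() {
        return name + " " + surname + " " + otherPersonalInfo + " " + versionNumber;
    }
}

// Hypothetical demo class, not part of the question.
class LatestVersionDemo {
    // Groups by userId and keeps the user with the highest versionNumber in each group.
    static Collection<User> latestByUserId(List<User> users) {
        return users.stream()
                .collect(Collectors.groupingBy(User::getUserId,
                        Collectors.collectingAndThen(
                                Collectors.maxBy(Comparator.comparing(User::getVersionNumber)),
                                Optional::get))) // every group holds at least one element
                .values();
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("JOHNSMITH", "John", "Smith", "Some info", 1),
                new User("JOHNSMITH", "John", "Smith", "Updated info", 2),
                new User("JOHNSMITH", "John", "Smith", "Latest info", 3),
                new User("BOBDOE", "Bob", "Doe", "Personal info", 1),
                new User("BOBDOE", "Bob", "Doe", "Latest info", 2));
        // The result is backed by a HashMap, so its iteration order is not guaranteed.
        latestByUserId(users).forEach(System.out::println);
    }
}
```

`Optional::get` is acceptable here only because `groupingBy` never creates an empty group, so `maxBy` always finds a value.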
Ole V.V.

I sorted by version first, in descending order, so that the newest entry for each user comes first in the stream. Afterwards I filtered on a distinct key so that only the first object matching each key ends up in the result. For the filtering I needed a stateful predicate that remembers the keys it has already seen.

The predicate looks like this:

    private static <T> Predicate<T> distinctByKey( Function<? super T, ?> key ) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent( key.apply( t ), Boolean.TRUE ) == null;
    }

And then I can use the following Stream:

users.stream().sorted( Comparator.comparing( User::getVersionNumber ).reversed() )
              .filter( distinctByKey( u -> u.getName() + u.getSurname() ) )
              .collect( Collectors.toList() );

There are some other nice solutions for doing a distinct based on a key, which can be found at Java 8 Distinct by property.

mszalbach

HashMap<String, User> map = users.stream().collect(Collectors.toMap(User::getUserId,
            e -> e,
            (left, right) -> left.getVersionNumber() > right.getVersionNumber() ? left : right,
            HashMap::new));
System.out.println(map.values());

Above code prints:

[User [userId=BOBDOE, name=Bob, surname=Doe, otherPersonalInfo=Latest info, version=2], User [userId=JOHNSMITH, name=John, surname=Smith, otherPersonalInfo=Latest info, version=3]]

Explanation: the toMap method takes four arguments:

  1. keyMapper: a mapping function to produce keys
  2. valueMapper: a mapping function to produce values
  3. mergeFunction: a merge function, used to resolve collisions between values associated with the same key, as supplied to Map.merge(Object, Object, BiFunction)
  4. mapSupplier: a function which returns a new, empty Map into which the results will be inserted

In the code above:

  1. The first argument is User::getUserId, which extracts the key.
  2. The second argument is a function that returns the User object unchanged.
  3. The third argument resolves collisions by comparing the two Users and keeping the one with the higher version.
  4. The fourth argument is a reference to the HashMap constructor (HashMap::new).
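The fourth argument also explains why the output lists BOBDOE before JOHNSMITH: a HashMap makes no promise about iteration order. As a rough sketch, assuming you want the users in the order their IDs first appear in the list, you could pass LinkedHashMap::new instead. The `User` class here is a cut-down stand-in, and `ToMapOrderDemo` and `latestInEncounterOrder` are names I made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Cut-down stand-in for the question's User class (assumption: usual getters).
class User {
    private final String userId;
    private final String name;
    private final String surname;
    private final String otherPersonalInfo;
    private final int versionNumber;

    User(String userId, String name, String surname, String otherPersonalInfo, int versionNumber) {
        this.userId = userId;
        this.name = name;
        this.surname = surname;
        this.otherPersonalInfo = otherPersonalInfo;
        this.versionNumber = versionNumber;
    }

    String getUserId() { return userId; }
    int getVersionNumber() { return versionNumber; }

    @Override
    public String toString() { return userId + " v" + versionNumber; }
}

// Hypothetical demo class, not part of the original answer.
class ToMapOrderDemo {
    static List<User> latestInEncounterOrder(List<User> users) {
        Map<String, User> latest = users.stream().collect(Collectors.toMap(
                User::getUserId,                  // keyMapper
                u -> u,                           // valueMapper
                (left, right) ->                  // mergeFunction: keep the higher version
                        left.getVersionNumber() > right.getVersionNumber() ? left : right,
                LinkedHashMap::new));             // mapSupplier: remembers insertion order
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("JOHNSMITH", "John", "Smith", "Some info", 1),
                new User("JOHNSMITH", "John", "Smith", "Latest info", 3),
                new User("BOBDOE", "Bob", "Doe", "Personal info", 1),
                new User("BOBDOE", "Bob", "Doe", "Latest info", 2));
        System.out.println(latestInEncounterOrder(users)); // prints [JOHNSMITH v3, BOBDOE v2]
    }
}
```

Merging into an existing key does not move that key in a LinkedHashMap, so each user keeps the position of its first occurrence.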
rohit

This will be painful, but it can be done with some aggregation, in the Java 8 Streams framework:

// create a Map from user name to users, sorted by version
Map<String, NavigableSet<User>> grouped =
        users.stream()
             .collect(
                     Collectors.groupingBy(
                             u -> u.getName() + "," + u.getSurname(),
                             HashMap::new,
                             Collectors.toCollection(
                                     () -> new TreeSet<>(
                                             Comparator.comparing(
                                                   User::getVersionNumber)))));

// retrieve the latest versions from the Map
List<User> latestVersions = grouped.entrySet()
                                   .stream()
                                   .map(e -> e.getValue().last())
                                   .collect(Collectors.toList());

Given how verbose this is, I'd probably settle for an imperative solution though.

  • Keep a Map<String, User>
  • For every User, check whether the Map already contains the User's string representation
  • If it doesn't, or the User mapped to it has a lower version number, store the User in the Map.
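The imperative steps above could look roughly like this. It is only a sketch: I key the Map on `userId` (the question's unique identifier) rather than on a string representation, the `User` class is a cut-down stand-in, and `ImperativeLatestDemo` and `latestByUserId` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Cut-down stand-in for the question's User class (assumption: usual getters).
class User {
    private final String userId;
    private final String name;
    private final String surname;
    private final String otherPersonalInfo;
    private final int versionNumber;

    User(String userId, String name, String surname, String otherPersonalInfo, int versionNumber) {
        this.userId = userId;
        this.name = name;
        this.surname = surname;
        this.otherPersonalInfo = otherPersonalInfo;
        this.versionNumber = versionNumber;
    }

    String getUserId() { return userId; }
    int getVersionNumber() { return versionNumber; }

    @Override
    public String toString() { return userId + " v" + versionNumber; }
}

// Hypothetical demo class, not part of the original answer.
class ImperativeLatestDemo {
    static Collection<User> latestByUserId(List<User> users) {
        Map<String, User> latest = new HashMap<>();
        for (User user : users) {
            User current = latest.get(user.getUserId());
            // Store the user if the key is new or the stored user has a lower version.
            if (current == null || current.getVersionNumber() < user.getVersionNumber()) {
                latest.put(user.getUserId(), user);
            }
        }
        return latest.values();
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("JOHNSMITH", "John", "Smith", "Some info", 1),
                new User("JOHNSMITH", "John", "Smith", "Latest info", 3),
                new User("BOBDOE", "Bob", "Doe", "Latest info", 2));
        // Iteration order follows HashMap and is not guaranteed.
        latestByUserId(users).forEach(System.out::println);
    }
}
```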
Sean Patrick Floyd
  • Would it be useful if the User class had an ID field (rather than having to group by name + surname) ? – Nick Melis Mar 08 '17 at 12:42
  • @NickMelis yes, definitely. That would eliminate false positives (the John Smith case) – Sean Patrick Floyd Mar 08 '17 at 12:43
  • You can just use `Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(User::getVersionNumber)))` instead of `Collector.of(() -> new TreeSet<>(Comparator.comparing(User::getVersionNumber)), Set::add, (left, right) -> { left.addAll(right); return left; })` – Holger Mar 08 '17 at 12:58

In Java 8 you can create a comparator in the form of a lambda expression.

Then call users.stream().sorted(...), passing in the comparator.

Example:

    Comparator<User> byVersionNumber = (u1, u2) -> Integer.compare(
            u1.getVersionNumber(), u2.getVersionNumber());

    users.stream().sorted(byVersionNumber)
            .forEach(u -> System.out.println(u));

Please double-check the syntax; this is a rough sketch.

VedantK

    List<User> users = Arrays.asList(
            new User("JOHNSMITH", "John", "Smith", "Some info", 1),
            new User("JOHNSMITH", "John", "Smith", "Updated info", 2),
            new User("JOHNSMITH", "John", "Smith", "Latest info", 3),
            new User("BOBDOE", "Bob", "Doe", "Personal info", 1),
            new User("BOBDOE", "Bob", "Doe", "Latest info", 2)
    ).stream()
            .collect(Collectors.collectingAndThen(
                    Collectors.toMap(
                            User::getUserId,     // The user's unique property
                            Function.identity(), // Function<User, User>
                            BinaryOperator.maxBy(Comparator.comparing(User::getVersionNumber))
                    ),
                    // values() returns a Collection that is not a List, so copy it instead of casting
                    map -> new ArrayList<>(map.values())
            ));