
Consider a list of Person objects like this:

class Person {
  String fullName;
  String occupation;
  String hobby;
  int salary;
}

Using Java 8 streams, how can I get a list of the objects that are duplicated by the fullName and occupation properties only?

kkot
  • add equals and hashCode in person class? – SMA Nov 15 '18 at 07:19
  • Have you checked out [Java 8 Distinct by property](https://stackoverflow.com/q/23699371)? – Jorn Vernee Nov 15 '18 at 07:22
  • Well, yes. I don't need to remove duplicates; rather, I need to find out whether there were any duplicates in the fullName - occupation pair, which has to be unique. I found plenty of topics where duplicates were *removed*. I need to add them. – kkot Nov 15 '18 at 07:25

4 Answers


By using Java 8 streams and Collectors.groupingBy() on fullName and occupation:

List<Person> duplicates = list.stream()
    // group by the combined fullName/occupation key (fields as declared in the question)
    .collect(Collectors.groupingBy(p -> p.fullName + "-" + p.occupation))
    .values()
    .stream()
    // keep only the groups that contain more than one person
    .filter(group -> group.size() > 1)
    .flatMap(List::stream)
    .collect(Collectors.toList());
diogo
Ryuzaki L

I need to find if there were any duplicates in fullName - occupation pair, which has to be unique

Based on this comment it seems that you don't really care about which Person objects were duplicated, just that there were any.

In that case you can use a stateful anyMatch:

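Collection<Person> input = new ArrayList<>();

// Keys seen so far; each key is the (fullName, occupation) pair.
Set<List<String>> seen = new HashSet<>();
// Set.add returns false when the key is already present, i.e. on a duplicate.
// Arrays.asList keeps this Java 8 compatible (List.of requires Java 9+).
boolean hasDupes = input.stream()
                        .anyMatch(p -> !seen.add(Arrays.asList(p.fullName, p.occupation)));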

You can use a List as a 'key' for a set which contains the fullName + occupation combinations that you've already seen. If this combination is seen again you immediately return true, otherwise you finish iterating the elements and return false.
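A self-contained sketch of this check follows. The class name DuplicateCheck, the method hasDuplicates, and the Person constructor are assumptions made for illustration and are not part of the question. The key is built with Arrays.asList so that the code compiles on Java 8, and the stateful predicate assumes a sequential stream.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Person as in the question, plus a hypothetical convenience constructor.
class Person {
    String fullName;
    String occupation;
    String hobby;
    int salary;

    Person(String fullName, String occupation, String hobby, int salary) {
        this.fullName = fullName;
        this.occupation = occupation;
        this.hobby = hobby;
        this.salary = salary;
    }
}

class DuplicateCheck {
    // Returns true as soon as a (fullName, occupation) pair is seen a second time.
    static boolean hasDuplicates(Collection<Person> input) {
        Set<List<String>> seen = new HashSet<>();
        return input.stream()
                    .anyMatch(p -> !seen.add(Arrays.asList(p.fullName, p.occupation)));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Ann Lee", "Engineer", "chess", 100),
                new Person("Bob Ray", "Chef", "golf", 90),
                new Person("Ann Lee", "Engineer", "yoga", 120));
        System.out.println(hasDuplicates(people)); // prints "true"
    }
}
```

Because anyMatch short-circuits, the stream stops at the first repeated pair instead of visiting every element.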

Jorn Vernee

Here is a solution with O(n) complexity: use a Map to group the given list by key (fullName + occupation), then retrieve the groups that contain more than one element.

public static List<Person> getDuplicates(List<Person> persons, Function<Person, String> classifier) {
    // groupingBy(classifier) already collects each group into a List
    Map<String, List<Person>> map = persons.stream()
                                           .collect(Collectors.groupingBy(classifier));

    return map.values().stream()
              .filter(personList -> personList.size() > 1)
              .flatMap(List::stream)
              .collect(Collectors.toList());
}

Client code:

List<Person> persons = Collections.emptyList();
List<Person> duplicates = getDuplicates(persons, person -> person.fullName + ':' + person.occupation);
Oleg Cherednik
  • Alternatively, you may combine the `filter` and `flatMap` step: `.flatMap(list -> list.size() > 1? list.stream(): null)`. But you should not force the key type to be a `String`. Creating the key for two properties via string concatenation is inefficient and error prone (might create clashes on the resulting strings). – Holger Nov 15 '18 at 10:29
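
Both suggestions in this comment can be sketched roughly as follows: a generic key type K instead of String, with a List of the two properties as the composite key, and a single flatMap step (the Stream.flatMap documentation states that a null mapped stream is treated as an empty stream). The class name Duplicates and the Person constructor are assumptions added for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Person as in the question, plus a hypothetical convenience constructor.
class Person {
    String fullName;
    String occupation;
    String hobby;
    int salary;

    Person(String fullName, String occupation, String hobby, int salary) {
        this.fullName = fullName;
        this.occupation = occupation;
        this.hobby = hobby;
        this.salary = salary;
    }
}

class Duplicates {
    // K is any key type with proper equals/hashCode, e.g. List<String>.
    static <K> List<Person> getDuplicates(List<Person> persons, Function<Person, K> classifier) {
        Map<K, List<Person>> groups = persons.stream()
                .collect(Collectors.groupingBy(classifier));
        return groups.values().stream()
                // a null result is treated by flatMap as an empty stream
                .flatMap(group -> group.size() > 1 ? group.stream() : null)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Contrived values chosen so that the concatenated keys clash.
        List<Person> persons = Arrays.asList(
                new Person("a:b", "c", "chess", 100),
                new Person("a", "b:c", "golf", 90));

        // Concatenated key: both persons map to "a:b:c" and are reported as duplicates.
        System.out.println(getDuplicates(persons, p -> p.fullName + ':' + p.occupation).size()); // prints 2

        // Composite List key: the two pairs stay distinct.
        System.out.println(getDuplicates(persons, p -> Arrays.asList(p.fullName, p.occupation)).size()); // prints 0
    }
}
```

The List key compares the two properties element by element, so a separator character inside a value cannot produce a false match.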

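First implement equals and hashCode in your Person class, based on fullName and occupation, and then use:

List<Person> personList = new ArrayList<>();

// frequency(...) > 1 matches every element that occurs at least twice
Set<Person> duplicates = personList.stream()
                .filter(p -> Collections.frequency(personList, p) > 1)
                .collect(Collectors.toSet());

The predicate > 1 covers any number of repetitions, whereas == 2 would match only elements that occur exactly twice. Note that Collections.frequency scans the whole list for every element, so this approach is O(n²), and that the resulting Set holds one representative per duplicated fullName - occupation pair, because equal persons collapse into a single entry.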
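
A minimal sketch of what the equals and hashCode implementation might look like for this approach, assuming equality should depend only on fullName and occupation. Hobby and salary are deliberately ignored, which affects every other use of equals on Person. The constructor and the FrequencyDemo class are added for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class Person {
    String fullName;
    String occupation;
    String hobby;
    int salary;

    // Hypothetical convenience constructor, not part of the question.
    Person(String fullName, String occupation, String hobby, int salary) {
        this.fullName = fullName;
        this.occupation = occupation;
        this.hobby = hobby;
        this.salary = salary;
    }

    // Assumption: two persons are equal when fullName and occupation match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return Objects.equals(fullName, other.fullName)
                && Objects.equals(occupation, other.occupation);
    }

    // Must use the same fields as equals.
    @Override
    public int hashCode() {
        return Objects.hash(fullName, occupation);
    }
}

class FrequencyDemo {
    static Set<Person> findDuplicates(List<Person> personList) {
        return personList.stream()
                .filter(p -> Collections.frequency(personList, p) > 1)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Person> personList = Arrays.asList(
                new Person("Ann Lee", "Engineer", "chess", 100),
                new Person("Bob Ray", "Chef", "golf", 90),
                new Person("Ann Lee", "Engineer", "yoga", 120));
        // Both "Ann Lee" entries are equal, so the Set keeps a single representative.
        System.out.println(findDuplicates(personList).size()); // prints 1
    }
}
```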

Khalid Shah