-1

Problem: I have a utility function that takes a generic list and removes duplicates. When I use it for a List<String>, the match should be case insensitive. The code uses streams (Java 8+) and I want to keep it that way.

Note: the code is in Java 8+.

Code:

public static <T> List<T> removeDuplicates(List<T> inputList) {
    List<T> result = null;
    if (inputList != null && inputList.size() > 0) {
        result = inputList.parallelStream().distinct().collect(Collectors.toList());
    }
    return result;
}

Example:

List<String> inputList = new ArrayList<String>();
inputList.add("a");
inputList.add("A");
inputList.add("abc");
inputList.add("ABC");
inputList.add("c");

When we call removeDuplicates(inputList) and print the result, the expected values are:

 a
 abc
 c

I don't really care whether it chooses ABC over abc or A over a, but each value should appear only once.

Is there an elegant way of solving this without doing an instanceof check?

StackFlowed
  • 6
    Why don't you just convert the `List` to a `Set`? – Matthew Diana Sep 13 '16 at 20:30
  • 3
    Why make method generic if you only handle `String`s? – Jezor Sep 13 '16 at 20:30
  • 2
    @MatthewDiana it still won't solve my problem. A Set removes duplicates, but it doesn't handle the case-insensitive part ... – StackFlowed Sep 13 '16 at 20:31
  • 1
    What if `T` is `Integer` - what does "case insensitive" mean then? – Bohemian Sep 13 '16 at 20:32
  • @Jezor this is used by multiple other objects and not just String. – StackFlowed Sep 13 '16 at 20:32
  • @Bohemian that is the problem ... is there a way to ignore the case-insensitive part if it is not a String? – StackFlowed Sep 13 '16 at 20:33
  • It seems like the `String` class gets treated specially compared to other objects when using your method, so using `instanceof` might be the way to go. – Matthew Diana Sep 13 '16 at 20:35
  • Try converting it to Set and convert it back to List again. Because Sets can only contain unique elements. – Young Emil Sep 13 '16 at 20:36
  • If you know the type of the list at compile time, just make a separate method. If you don't, there's no way to find out, thanks to type erasure. – shmosel Sep 13 '16 at 20:36
  • 3
    You can use a `new TreeSet(String.CASE_INSENSITIVE_ORDER)` to filter out case-insensitive duplicates. – shmosel Sep 13 '16 at 20:39
  • @shmosel so as per your suggestion, the only way to do this is to have a separate function for String and then use the TreeSet in it. – StackFlowed Sep 13 '16 at 20:46
  • The `TreeSet` is not a requirement, just a suggestion. – shmosel Sep 13 '16 at 20:49
  • I saw this post that might answer your question. They create a custom function that returns a predicate that is passed in to the distinct() method. http://stackoverflow.com/questions/23699371/java-8-distinct-by-property – shong Sep 13 '16 at 20:37

3 Answers

3

You can extend your method to also accept a function that is applied via map on the stream.
This function is generic in the same T, which removes the need for instanceof. In the case-insensitive String example, the function would be String::toLowerCase. Note that the returned list then contains the mapped values (for example "abc" even if the input only had "ABC").

public static <T> List<T> removeDuplicates(List<T> inputList, Function<T,T> function) {
    List<T> result = null;
    if (inputList != null && inputList.size() > 0) {
        result = inputList.parallelStream()
          .map(function)
          .distinct()
          .collect(Collectors.toList());
    }
    return result;
}

And if you want to keep the same API for the types that don't need it, just add this overload:

public static <T> List<T> removeDuplicates(List<T> inputList) {
  return removeDuplicates(inputList, Function.identity());
}
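
If the original elements should survive unchanged, one possible variation is to compare on a derived key instead of mapping the elements themselves, similar in spirit to the distinct-by-property post linked in the comments. The following is only a sketch under assumptions: the class name `DistinctByKey` and the `keyExtractor` parameter are hypothetical names, and it uses a sequential stream because the filter keeps state in a plain HashSet.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DistinctByKey {

    // Hypothetical variant: deduplicate by a derived key, keep the first original element per key.
    public static <T, K> List<T> removeDuplicates(List<T> inputList,
                                                  Function<? super T, ? extends K> keyExtractor) {
        if (inputList == null) {
            return null; // mirrors the null handling of the method in the question
        }
        // Keys seen so far; a plain HashSet is only safe with a sequential stream.
        Set<K> seen = new HashSet<>();
        return inputList.stream()
                .filter(element -> seen.add(keyExtractor.apply(element)))
                .collect(Collectors.toList());
    }

    // Overload that keeps the old API: elements act as their own keys.
    public static <T> List<T> removeDuplicates(List<T> inputList) {
        return removeDuplicates(inputList, Function.identity());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("A", "a", "ABC", "abc", "c");
        System.out.println(removeDuplicates(input, String::toLowerCase));
        // prints: [A, ABC, c]
    }
}
```

Here the first spelling encountered wins, so "A" and "ABC" come back as they were in the input rather than lowercased.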
shmosel
Nir Levy
3

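If the caller knows the type of T at compile time, you can have it pass an optional Comparator<T> to the method, and filter out duplicates using a TreeSet. Note that the result comes back in the comparator's sorted order rather than in input order, and that with a null comparator the elements must be Comparable: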

public static <T> List<T> removeDuplicates(List<T> inputList) {
    // null uses natural ordering
    return removeDuplicates(inputList, null);
}

public static <T> List<T> removeDuplicates(List<T> inputList, Comparator<? super T> comparator) {
    Set<T> set = new TreeSet<>(comparator);
    set.addAll(inputList);
    return new ArrayList<>(set);
}

public static void main(String[] args) {
    System.out.println(removeDuplicates(Arrays.asList(1, 2, 2, 3)));
    System.out.println(removeDuplicates(Arrays.asList("a", "b", "B", "c"), String.CASE_INSENSITIVE_ORDER));
}

Output:

[1, 2, 3]
[a, b, c]
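
If the input order matters, a possible variation keeps the stream and uses the TreeSet only to remember which elements were already seen. This is a sketch, not part of the answer above: the class name `OrderPreservingDedup` is hypothetical, and it assumes a non-null list and a sequential stream.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class OrderPreservingDedup {

    public static <T> List<T> removeDuplicates(List<T> inputList, Comparator<? super T> comparator) {
        // The TreeSet only tracks seen elements; a null comparator means natural ordering.
        Set<T> seen = new TreeSet<>(comparator);
        return inputList.stream()
                .filter(seen::add) // add() returns false for an element the comparator treats as equal
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicates(Arrays.asList("c", "B", "b", "a"), String.CASE_INSENSITIVE_ORDER));
        // prints: [c, B, a]
    }
}
```

With the TreeSet-only version the same input would come back sorted as [a, B, c].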
shmosel
0

If you want behavior that differs from the default equals behavior, you can roll your own String wrapper:

import org.apache.commons.lang3.StringUtils;

import java.util.Arrays;
import java.util.stream.Collectors;

public class MyString {
  private final String value;

  public MyString(final String value) {
    this.value = value;
  }

  @Override
  public String toString() {
    return value;
  }

  public String getValue() {
    return value;
  }

  @Override
  public boolean equals(final Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    final MyString myString = (MyString) o;
    return StringUtils.equalsIgnoreCase(myString.value, value);
  }

  @Override
  public int hashCode() {
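    // null-safe, to match the null-safe StringUtils.equalsIgnoreCase used in equals
    return value == null ? 0 : value.toUpperCase().hashCode();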
  }

  public static void main(String... args) {
    // args = {aa AA aA bb Bb cc bb CC}
    System.out.println(Arrays.stream(args).map(MyString::new).collect(Collectors.toSet()));
    // prints: [aa, bb, cc]
  }
}
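
To get back to a List<String> with streams, the wrapper idea could be combined with distinct() by wrapping, deduplicating, and unwrapping again. The sketch below is an assumption-laden illustration: `CaseInsensitiveKey` and `removeDuplicatesIgnoreCase` are hypothetical names, the wrapper is a JDK-only stand-in for MyString, and it assumes the list and its strings are non-null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class WrapperDedupDemo {

    // JDK-only stand-in for the MyString idea (hypothetical name, non-null values assumed).
    static final class CaseInsensitiveKey {
        private final String value;

        CaseInsensitiveKey(String value) {
            this.value = value;
        }

        String getValue() {
            return value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof CaseInsensitiveKey
                    && value.equalsIgnoreCase(((CaseInsensitiveKey) o).value);
        }

        @Override
        public int hashCode() {
            return value.toUpperCase(Locale.ROOT).hashCode();
        }
    }

    public static List<String> removeDuplicatesIgnoreCase(List<String> input) {
        return input.stream()
                .map(CaseInsensitiveKey::new)      // wrap so equals/hashCode ignore case
                .distinct()                        // keeps the first occurrence on an ordered stream
                .map(CaseInsensitiveKey::getValue) // unwrap back to the original strings
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicatesIgnoreCase(
                Arrays.asList("aa", "AA", "aA", "bb", "Bb", "cc", "bb", "CC")));
        // prints: [aa, bb, cc]
    }
}
```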
Andreas