-1

How can I filter a collection so that values which are equal ignoring case are treated as duplicates?

Example: I have

["Value1", "vALue1", "vALue2", "valUE2"]

I need to have

["Value1", "vALue2"]

Any solutions will be good.

For example, I could refuse to add a new string if the collection already contains one that is equal to it ignoring case,

or I could take an existing collection and filter out the strings that are equal ignoring case.

Ivan

4 Answers

2

This does not seem to be (easily) possible with Streams alone [1], but you can keep track of the elements already seen in a Set (O(1) lookup with a HashSet) and filter each element by whether its lowercased form is already in that set (Set.add returns false in that case).

List<String> values = List.of("Value1", "vALue1", "vALue2", "valUE2");
Set<String> seen = new HashSet<>();
List<String> res = values.stream().filter(s -> seen.add(s.toLowerCase()))
                                  .collect(Collectors.toList());
System.out.println(res);  // [Value1, vALue2]

1) E.g., distinct does not accept a mapping function, and Collectors.groupingBy might not preserve order.
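Since distinct takes no mapping function, one common workaround is to wrap the "seen" set in a reusable stateful predicate. The following is only a sketch: distinctByKey and DistinctByKeyDemo are hypothetical names, not JDK API, and the helper assumes non-null keys and relies on an ordered, sequential stream to keep the first occurrence.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyDemo {
    // Hypothetical helper (not part of the JDK): returns a stateful predicate
    // that accepts an element only the first time its key is seen.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    static List<String> dedupeIgnoreCase(List<String> values) {
        return values.stream()
                // Locale.ROOT keeps the lowercasing independent of the default locale.
                .filter(distinctByKey(s -> s.toLowerCase(Locale.ROOT)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = List.of("Value1", "vALue1", "vALue2", "valUE2");
        System.out.println(dedupeIgnoreCase(values)); // [Value1, vALue2]
    }
}
```

Each call to distinctByKey creates a fresh set, so the predicate should not be shared between unrelated streams.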

tobias_k
  • 1
    why not use `seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);` and remove the expensive `toLowerCase()` call… – Holger Feb 10 '21 at 09:59
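The suggestion in this comment could look roughly like the sketch below. The class and method names are made up for illustration. A TreeSet built with String.CASE_INSENSITIVE_ORDER treats strings as equal whenever the comparator returns 0, so no lowercased copies are needed. Lookups are O(log n) rather than O(1), and null elements are not supported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class CaseInsensitiveSeenDemo {
    static List<String> dedupeIgnoreCase(List<String> values) {
        // The comparator decides equality here, so "Value1" and "vALue1" collide.
        Set<String> seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        List<String> result = new ArrayList<>();
        for (String s : values) {
            // add returns false when an equal-ignoring-case string is already present.
            if (seen.add(s)) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> values = List.of("Value1", "vALue1", "vALue2", "valUE2");
        System.out.println(dedupeIgnoreCase(values)); // [Value1, vALue2]
    }
}
```

The same set also covers the "refuse to add" variant from the question: keep the TreeSet itself as the collection and check the boolean returned by add.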
2

Some Java libraries which provide distinctBy functionality may be used to resolve this task.

For example, the StreamEx library (GitHub, Maven Repo), an extension of the Stream API, can be used like this:

import java.util.*;
import one.util.streamex.*;

public class MyClass {
    public static void main(String args[]) {
        String[] data = {
            "Value1", "vALue1", "vALue2", "valUE2"
        };
        List<String> noDups = StreamEx.of(data)
                .distinct(String::toLowerCase)
                .toList();
        System.out.println(noDups);
    }
}

Output:

[Value1, vALue2]
Nowhere Man
0

Here is a minimal code example of how this could work:

String[] array = new String[] {"Value1", "vALue1", "vALue2", "valUE2"};
ArrayList<String> finalArray = new ArrayList<>();
for(String entry : array) {
    boolean alreadyContained = finalArray.stream().anyMatch(entry::equalsIgnoreCase);
    if(!alreadyContained) {
        finalArray.add(entry);
    }
}

Basically, you build an ArrayList of all non-duplicate entries. For each entry, check whether it is already contained in the ArrayList (ignoring case), and add it only if it is not. Note that every check scans the whole list, so this is O(n²) overall.

MonsterDruide1
0

Edit

Here is a genericized example of tobias_k's response:

import java.util.*;
import java.util.stream.Collectors;

public class ArrayUtils {
    public static void main(String[] args) {
        List<String> values = List.of("Value1", "vALue1", "vALue2", "valUE2");
        List<String> deduped = dedupeCaseInsensitive(values);

        System.out.println(deduped);  // [Value1, vALue2]
    }

    /* Convenience wrapper around the generic dedupeWith below */
    public static List<String> dedupeCaseInsensitive(List<String> collection) {
        return dedupeWith(collection, String.CASE_INSENSITIVE_ORDER);
    }

    public static <E> List<E> dedupeWith(List<E> list, Comparator<E> comparator) {
        Set<E> seen = new TreeSet<>(comparator);
        return list.stream().filter(s -> seen.add(s)).collect(Collectors.toList());
    }
}

Original edit

Here is a stream version:

import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ArrayUtils {
    public static void main(String[] args) {
        String[] items = {"Value1", "vALue1", "vALue2", "valUE2"};
        String[] result = dedupeCaseInsensitive(items);

        // Print the resulting array.
        System.out.println(Arrays.toString(result));
    }

    public static String[] dedupeCaseInsensitive(String[] items) {
        return Arrays.stream(items)
            .collect(Collectors.toMap(
                String::toLowerCase,
                Function.identity(),
                (o1, o2) -> o1,
                LinkedHashMap::new))
            .values()
            .stream()
            .toArray(String[]::new);
    }
}

Original response

You can dedupe with case-insensitive logic by populating a Map keyed by the lowercased string, grabbing its values, and sorting them back into their original order.

import java.util.*;

public class ArrayUtils {
    public static void main(String[] args) {
        String[] items = {"Value1", "vALue1", "vALue2", "valUE2"};
        String[] result = dedupeCaseInsensitive(items);

        // Print the resulting array.
        System.out.println(Arrays.toString(result));
    }

    public static String[] dedupeCaseInsensitive(String[] items) {
        Map<String, String> map = new HashMap<String, String>();

        // Filter the values using a map of key being the transformation,
        // and the value being the original value.
        for (String item : items) {
            map.putIfAbsent(item.toLowerCase(), item);
        }

        List<String> filtered = new ArrayList<>(map.values());

        // Sort the filtered values by the original positions.
        Collections.sort(filtered,
                Comparator.comparingInt(str -> findIndex(items, str)));

        return collectionToArray(filtered);
    }

    /* Convenience methods */

    public static String[] collectionToArray(Collection<String> collection) {
        return collection.toArray(new String[collection.size()]);
    }

    public static int findIndex(String[] arr, String t) {
        // Linear search: the input is not sorted, so Arrays.binarySearch would give undefined results.
        return Arrays.asList(arr).indexOf(t);
    }
}

If you use a LinkedHashMap, you do not need to sort, because the items retain their insertion order.

import java.util.*;

public class ArrayUtils {
    public static void main(String[] args) {
        String[] items = {"Value1", "vALue1", "vALue2", "valUE2"};
        String[] result = dedupeCaseInsensitive(items);

        // Print the resulting array.
        System.out.println(Arrays.toString(result));
    }

    public static String[] dedupeCaseInsensitive(String[] items) {
        Map<String, String> map = new LinkedHashMap<String, String>();

        // Filter the values using a map of key being the transformation,
        // and the value being the original value.
        for (String item : items) {
            map.putIfAbsent(item.toLowerCase(), item);
        }

        return collectionToArray(map.values());
    }

    /* Convenience methods */

    public static String[] collectionToArray(Collection<String> collection) {
        return collection.toArray(new String[collection.size()]);
    }
}
Mr. Polywhirl
  • 1
    You could still accept `Collection` as input, making the method usable for more cases. By the way, `.filter(s -> seen.add(s))` can also be written as `.filter(seen::add)`. In fact, since the part to the left of the `::` is only evaluated once and the result captured, you can even write `.filter(new TreeSet<>(comparator)::add)` and it will do the intended thing. – Holger Feb 11 '21 at 07:20
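Taken together, the points in this comment might be sketched as follows. DedupeWithDemo is a hypothetical class name, and the explicit type argument in new TreeSet<E>(...) is a cautious choice here rather than something the comment prescribes. The method reference captures a single TreeSet instance, so the sketch is meant for sequential streams only.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class DedupeWithDemo {
    // Accepts any Collection; the result order follows the collection's iteration order.
    public static <E> List<E> dedupeWith(Collection<E> items, Comparator<? super E> comparator) {
        return items.stream()
                // The TreeSet is created once, when the method reference is evaluated.
                .filter(new TreeSet<E>(comparator)::add)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = List.of("Value1", "vALue1", "vALue2", "valUE2");
        System.out.println(dedupeWith(values, String.CASE_INSENSITIVE_ORDER)); // [Value1, vALue2]
    }
}
```

For an unordered input such as a HashSet, which of the case variants survives depends on that set's iteration order.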