
A simplified example of what I am trying to do:

Suppose I have a list of strings that need to be grouped into 4 groups depending on whether they contain a specific substring. If a string contains Foo, it should fall into the group FOO; if it contains Bar, it should fall into the group BAR; if it contains both, it should appear in both groups.

List<String> strings = List.of("Foo", "FooBar", "FooBarBaz", "XXX");

A naive approach doesn't work as expected for the above input, since each string is put only into the first matching group:

Map<String,List<String>> result1 =
strings.stream()
        .collect(Collectors.groupingBy(
                        str -> str.contains("Foo") ? "FOO" :
                                    str.contains("Bar") ? "BAR" :
                                            str.contains("Baz") ? "BAZ" : "DEFAULT"));

result1 is

{FOO=[Foo, FooBar, FooBarBaz], DEFAULT=[XXX]}

whereas the desired result is

{FOO=[Foo, FooBar, FooBarBaz], BAR=[FooBar, FooBarBaz], BAZ=[FooBarBaz], DEFAULT=[XXX]}

After searching for a while I found another approach, which comes close to my desired result, but not quite:

Map<String,List<String>> result2 =
List.of("Foo", "Bar", "Baz", "Default").stream()
        .flatMap(str -> strings.stream().filter(s -> s.contains(str)).map(s -> new String[]{str.toUpperCase(), s}))
        .collect(Collectors.groupingBy(arr -> arr[0], Collectors.mapping(arr -> arr[1], Collectors.toList())));

System.out.println(result2);

result2 is

{BAR=[FooBar, FooBarBaz], FOO=[Foo, FooBar, FooBarBaz], BAZ=[FooBarBaz]}

While this correctly places strings containing the substrings into the right groups, strings that contain none of the substrings, and should therefore fall into the default group, are ignored. The desired result, as mentioned above, is (order doesn't matter):

{BAR=[FooBar, FooBarBaz], FOO=[Foo, FooBar, FooBarBaz], BAZ=[FooBarBaz], DEFAULT=[XXX]}

For now I'm using both result maps and doing an extra step:

result2.put("DEFAULT", result1.get("DEFAULT"));

Can the above be done in one step? Is there a better approach than what I have above?

Alexander Ivanchenko
bbKing

2 Answers

5

Instead of operating with strings "Foo", "Bar", etc. and their corresponding uppercase versions, it would be more convenient and cleaner to define an enum.

Let's call it Keys:

public enum Keys {
    FOO("Foo"), BAR("Bar"), BAZ("Baz"), DEFAULT("");
    
    // Non-default constants (DEFAULT excluded), cached so that getKeys() doesn't have to
    // create a new EnumSet or an array of constants via `values()` on every invocation
    private static final Set<Keys> nonDefaultKeys = EnumSet.range(FOO, BAZ);
    private final String keyName;
    
    Keys(String keyName) {
        this.keyName = keyName;
    }
    
    public static List<String> getKeys(String str) {
        List<String> keys = nonDefaultKeys.stream()
            .filter(key -> str.contains(key.keyName))
            .map(Enum::name)
            .toList();

        // if non-default keys not found, i.e. keys.isEmpty() - return the DEFAULT
        return keys.isEmpty() ? List.of(DEFAULT.name()) : keys;
    }
}

It has a method getKeys(String) which expects a string and returns a list of keys to which the given string should be mapped.

By using the functionality encapsulated in the Keys enum we can create a map of strings split into groups which correspond to the names of Keys-constants by using collect(supplier,accumulator,combiner).

main()

public static void main(String[] args) {
    List<String> strings = List.of("Foo", "FooBar", "FooBarBaz", "XXX");

    Map<String, List<String>> stringsByGroup = strings.stream()
        .collect(
            HashMap::new, // mutable container - which will contain results of mutable reduction
            (Map<String, List<String>> map, String next) -> Keys.getKeys(next)
                .forEach(key -> map.computeIfAbsent(key, k -> new ArrayList<>()).add(next)), // accumulator function - defines how to store stream elements into the container
            (left, right) -> right.forEach((k, v) ->
                left.merge(k, v, (oldV, newV) -> { oldV.addAll(newV); return oldV; }) // combiner function - defines how to merge container while executing the stream in parallel
        ));
    
    stringsByGroup.forEach((k, v) -> System.out.println(k + " -> " + v));
}

Output:

BAR -> [FooBar, FooBarBaz]
FOO -> [Foo, FooBar, FooBarBaz]
BAZ -> [FooBarBaz]
DEFAULT -> [XXX]
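
The combiner in the collect call above only comes into play when the stream runs in parallel. A minimal sketch to exercise it, assuming a hypothetical helper `keysOf` that stands in for `Keys.getKeys` so the example is self-contained (the class and method names here are illustrative, not part of the answer):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelGroupingSketch {

    // Hypothetical stand-in for Keys.getKeys(String): returns every matching
    // group name, or DEFAULT when nothing matches
    static List<String> keysOf(String str) {
        List<String> keys = Stream.of("Foo", "Bar", "Baz")
            .filter(str::contains)
            .map(String::toUpperCase)
            .collect(Collectors.toList());
        return keys.isEmpty() ? List.of("DEFAULT") : keys;
    }

    // Same three-argument collect as above, applied to any stream of strings
    static Map<String, List<String>> group(Stream<String> stream) {
        return stream.collect(
            HashMap::new,
            (Map<String, List<String>> map, String next) -> keysOf(next)
                .forEach(key -> map.computeIfAbsent(key, k -> new ArrayList<>()).add(next)),
            (left, right) -> right.forEach((k, v) ->
                left.merge(k, v, (oldV, newV) -> { oldV.addAll(newV); return oldV; })));
    }

    public static void main(String[] args) {
        List<String> strings = List.of("Foo", "FooBar", "FooBarBaz", "XXX");
        Map<String, List<String>> sequential = group(strings.stream());
        Map<String, List<String>> parallel = group(strings.parallelStream());
        // Both maps should hold the same groups with the same contents
        System.out.println(sequential.equals(parallel));
    }
}
```

Because the source list is ordered, the partial maps are merged in encounter order, so the lists inside each group should come out the same in both runs.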


Alexander Ivanchenko
5

This is an ideal use case for mapMulti (added in Java 16). mapMulti takes a BiConsumer that receives the streamed value and a Consumer. The Consumer is used to place zero or more elements back onto the stream. It was added because flatMap can incur undesirable overhead, since it has to create a new stream for every element.
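
As a small illustration of the mechanics, separate from the grouping task, here is a sketch that emits every even number twice and drops the odd ones, first with mapMulti and then with flatMap for comparison. The class and method names are made up for the example, and it assumes Java 16 or later:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapMultiSketch {

    // mapMulti: call consumer.accept(...) zero or more times per element
    static List<Integer> duplicateEvens(List<Integer> numbers) {
        return numbers.stream()
            .<Integer>mapMulti((n, consumer) -> {
                if (n % 2 == 0) {
                    consumer.accept(n);
                    consumer.accept(n);
                }
            })
            .collect(Collectors.toList());
    }

    // flatMap: build a small stream per element instead
    static List<Integer> duplicateEvensFlatMap(List<Integer> numbers) {
        return numbers.stream()
            .flatMap(n -> n % 2 == 0 ? Stream.of(n, n) : Stream.<Integer>empty())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(duplicateEvens(List.of(1, 2, 3, 4)));        // [2, 2, 4, 4]
        System.out.println(duplicateEvensFlatMap(List.of(1, 2, 3, 4))); // [2, 2, 4, 4]
    }
}
```

The explicit `<Integer>` type witness on mapMulti is needed here because the compiler cannot infer the output type from the lambda body alone.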

This works by building a String array of the token and the containing string, as you did before, and collecting it (also as you did before). If a token is found in the string, accept a String array with that token and the containing string. If no token is found at all, accept a String array with the default key and the string.

List<String> strings =
        List.of("Foo", "FooBar", "FooBarBaz", "XXX", "YYY");
Map<String, List<String>> result = strings.stream()
        .<String[]>mapMulti((str, consumer) -> {

            boolean found = false;
            String temp = str.toUpperCase();
            for (String token : List.of("FOO", "BAR", "BAZ")) {
                if (temp.contains(token)) {
                    consumer.accept(new String[] { token, str });
                    found = true;
                }
            }
            if (!found) {
                consumer.accept(new String[] { "DEFAULT", str });
            }
        })
        .collect(Collectors.groupingBy(arr -> arr[0],
                Collectors.mapping(arr -> arr[1],
                        Collectors.toList())));

result.entrySet().forEach(System.out::println);

prints

BAR=[FooBar, FooBarBaz]
FOO=[Foo, FooBar, FooBarBaz]
BAZ=[FooBarBaz]
DEFAULT=[XXX, YYY]

Keep in mind that streams are meant to make your coding world easier. But sometimes, a regular loop using some Java 8 constructs is all that is needed. Outside of an academic exercise, I would probably do the task like so.

Map<String, List<String>> result2 = new HashMap<>();

for (String str : strings) {
    boolean added = false;
    String temp = str.toUpperCase();
    for (String token : List.of("FOO", "BAR", "BAZ")) {
        if (temp.contains(token)) {
            result2.computeIfAbsent(token, k -> new ArrayList<>()).add(str);
            added = true;
        }
    }
    if (!added) {
        result2.computeIfAbsent("DEFAULT", k -> new ArrayList<>()).add(str);
    }
}
WJS
    Thank you very much. I like this answer, even if I have never used `mapMulti` and have to update from Java 13 to 16 I think I get a feeling how it is used. I need to try it on my original task and come back to this post soon. – bbKing Jul 27 '22 at 15:45
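
For Java versions before 16 (the comment above mentions Java 13), where mapMulti is not available, the same one-step grouping can probably be sketched with flatMap and Map.entry (Java 9+). This is only an illustration under stated assumptions: the class name, the `TOKENS` constant, and the `group` method are hypothetical, and matching is case-sensitive as in the question's `contains` checks.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapGroupingSketch {

    // Hypothetical constant: substrings to look for; the group name is the upper-cased token
    private static final List<String> TOKENS = List.of("Foo", "Bar", "Baz");

    static Map<String, List<String>> group(List<String> strings) {
        return strings.stream()
            // Explicit type witness keeps type inference simple for the block lambda
            .<Map.Entry<String, String>>flatMap(str -> {
                // One (group, string) pair per matching token
                List<Map.Entry<String, String>> pairs = TOKENS.stream()
                    .filter(str::contains)
                    .map(token -> Map.entry(token.toUpperCase(), str))
                    .collect(Collectors.toList());
                // No token matched: emit a single pair for the default group
                if (pairs.isEmpty()) {
                    return Stream.of(Map.entry("DEFAULT", str));
                }
                return pairs.stream();
            })
            .collect(Collectors.groupingBy(Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        group(List.of("Foo", "FooBar", "FooBarBaz", "XXX"))
            .forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Compared with mapMulti, this creates a small intermediate list and stream per element, which is the overhead mapMulti was designed to avoid; for short inputs that cost is unlikely to matter.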