3

I have a list of "Articles" that I need to group and get the most recent group. Assume my object to be

private interface Article {
    Date getCreated();
    String getAuthor();
}

I need to find unique names from the most recent group. Here's a workaround I've come up with but I'm unsure about the time-complexity or the caveats.

Set<String> previous = Optional.ofNullable(articles
        .stream()
        .collect(Collectors.groupingBy(Article::getCreated, () -> new TreeMap<>(Comparator.reverseOrder()), Collectors.toList()))
        .firstEntry())
        .map(entries -> entries.getValue()
                .stream()
                .map(event -> event.getAuthor())
                .collect(Collectors.toSet()))
        .orElse(Collections.emptySet());

Is there a less convoluted way to do this?

Naman
  • 27,789
  • 26
  • 218
  • 353
Mridang Agarwalla
  • 43,201
  • 71
  • 221
  • 382
  • A`for ` loop I guess – daniu Jul 02 '20 at 19:57
  • 2
    Why do you use `reverseOrder()` and `firstEntry()` instead of natural order and the `lastEntry()` you’re interested in? That would save you 26 characters. But generally, you have to decide whether you strive for brevity in source code or efficiency at runtime. These are not always the same. See [Q: How to force max to return ALL maximum values in a Java Stream?](https://stackoverflow.com/q/29334404/2711488) – Holger Jul 03 '20 at 09:30
  • @Holger you're right. I think I just got carried away trying to solve the issue. – Mridang Agarwalla Jul 06 '20 at 07:26

3 Answers3

2

Since the only case you're trying to handle with a wrapped Optional is for the empty list of articles in your code currently. You could simplify that with a check such as:

private static Set<String> authorsOfMostRecentArticles(List<Article> articles) {
    if (articles.isEmpty()) return Collections.emptySet();
    return articles.stream()
            .collect(Collectors.groupingBy(Article::getCreated,
                    TreeMap::new, Collectors.toList()))
            .lastEntry().getValue() // slight change
            .stream()
            .map(Article::getAuthor)
            .collect(Collectors.toSet());
}

On the other hand, if you want to ensure additional filters in the logic as well, then, finding the lastEntry and proceeding further would be simpler.

private static Set<String> authorsOfMostRecentArticles(List<Article> articles) {
    Map.Entry<Date, List<Article>> mostRecentEntryOfArticles = articles.stream()
            .filter(java.util.Objects::nonNull) // if articles could be null
            .filter(article -> article.getCreated() != null) // if getCreated could be null as well
            .collect(Collectors.groupingBy(Article::getCreated,
                    TreeMap::new, Collectors.toList()))
            .lastEntry();

    return mostRecentEntryOfArticles == null ? Collections.emptySet() :
            mostRecentEntryOfArticles.getValue().stream()
                    .map(Article::getAuthor)
                    .collect(Collectors.toSet());
}

In short, there is no need of wrapping the code within Optional.of.., that's what the intent of entries being returned as null might have been as well.

Naman
  • 27,789
  • 26
  • 218
  • 353
1

Here's a bit simplified version with Stream API:

    Set<String> previous = Objects.requireNonNullElse(
            articles.stream()
                    .collect(Collectors.groupingBy(
                            Article::getCreated,
                            () -> new TreeMap<>(Comparator.reverseOrder()),
                            Collectors.mapping(Article::getAuthor, Collectors.toSet())))
                    .firstEntry()
                    .getValue(),
            Collections.emptySet());

A for-each loop is probably more readable, even if it uses twice more lines:

    Set<String> previous = new HashSet<>();

    if (!articles.isEmpty()) {
        Date previousDate = articles.get(0).getCreated();

        for (Article article : articles) {
            int cmp = previousDate.compareTo(article.getCreated());

            if (cmp > 0) {
                continue;
            }

            if (cmp < 0) {
                previousDate = article.getCreated();
                previous.clear();
            }

            previous.add(article.getAuthor());
        }
    } 

As for time complexity, (if my math is not terribly wrong) it is: O(n) for the for-each version and O(n*log(n)) for the Stream version. The stream version is expected to be a bit heavier on memory use too.

In practice it is hard to say which one would be faster, but if I have to bet (without testing) I would put my money on the for-each loop.

MartinBG
  • 1,500
  • 13
  • 22
0

I would split it into two operation and do it something like this:

    Optional<Article> max = articles.stream()
            .max(Comparator.comparing(Article::getCreated));

    Set<Article> articles1 = max.map(date -> articles
            .stream()
            .filter(art -> art.getCreated().equals(date))
            .collect(Collectors.toSet()))
            .orElse(Collections.emptySet());

Whad do you think?

Pulkownik
  • 530
  • 1
  • 6
  • 21