1

I have a complicated requirement where a list records has comments in it. We have a functionality of reporting where each and every change should be logged and reported. Hence as per our design, we create a whole new record even if a single field has been updated.

Now we wanted to get history of comments(reversed sorted by timestamp) stored in our db. After running query I got the list of comments but it contains duplicate entries because some other field was changed. It also contains null entries.

I wrote the following code to remove duplicate and null entries.

List<Comment> toRet = new ArrayList<>();
dbCommentHistory.forEach(ele -> {

        //Directly copy if toRet is empty.
        if (!toRet.isEmpty()) {
            int lastIndex = toRet.size() - 1;
            Comment lastAppended = toRet.get(lastIndex);

            // If comment is null don't proceed
            if (ele.getComment() == null) {
                return;
            }

            // remove if we have same comment as last time
            if (StringUtils.compare(ele.getComment(), lastAppended.getComment()) == 0) {
                toRet.remove(lastIndex);
            }
        }

        //add element to new list
        toRet.add(ele);
    });

This logic works fine and have been tested now, But I want to convert this code to use lambda, streams and other java 8's feature.

Youcef LAIDANI
  • 55,661
  • 15
  • 90
  • 140
diwakarb
  • 543
  • 2
  • 9
  • 23
  • 1
    You have used lambdas in your code! I don't know what is its benefits to use stream here?! IMO just check `if (ele.getComment() == null){...}` at first. – Hadi J Mar 26 '19 at 07:21
  • Can you please show us what is the type of `dbCommentHistory` – Youcef LAIDANI Mar 26 '19 at 07:37
  • As @HadiJ said, check `if(ele.getComment() == null) return;` right at the beginning. Then, since you have ruled out `null` comments, there is no reason to use a 3rd party library method, whose only “advantage” is the questionable `null` handling. And don’t use `compare(…) == 0`, that’s what an ordinary `equals` is for: `if(ele.getComment().equals(lastAppended.getComment())) { … }`… – Holger Mar 26 '19 at 08:22
  • Can you clarify whether you want to remove all repeats or whether you want to remove consecutive repeats? . If the input sequence of comments is (A, A, A, B, C, C, A, D , A, C, C) what is the wanted output sequence: (A, B, C, D) or (A, B, C, A, D, A, C)? Based on the code and all the trouble go thrugh to compare with only the last comment it seems to be the latter, right? – Valentin Ruano Mar 26 '19 at 21:50
  • @YCF_L: It's the same i.e. List – diwakarb Mar 26 '19 at 21:54
  • @ValentinRuano: I just want to remove consecutive repeats. – diwakarb Mar 26 '19 at 21:55
  • @dwakarb Thanks for the clarification, I posted a solution that has the desired effect. – Valentin Ruano Mar 27 '19 at 06:36

3 Answers3

2

You can use the following snippet:

Collection<Comment> result = dbCommentHistory.stream()
        .filter(c -> c.getComment() != null)
        .collect(Collectors.toMap(Comment::getComment, Function.identity(), (first, second) -> second, LinkedHashMap::new))
        .values();

If you need a List instead of a Collection you can use new ArrayList<>(result).

If you have implemented the equals() method in your Comment class like the following

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    return Objects.equals(comment, ((Comment) o).comment);
}

you can just use this snippet:

List<Comment> result = dbCommentHistory.stream()
        .filter(c -> c.getComment() != null)
        .distinct()
        .collect(Collectors.toList());

But this would keep the first comment, not the last.

Samuel Philipp
  • 10,631
  • 12
  • 36
  • 56
  • The problem with this last solution is that it would remove any repeats when the original code would keep repeated comments that have some different comment between them. He is only removing consecutive repeats. – Valentin Ruano Mar 26 '19 at 21:29
  • @ValentinRuano yes, that's right, that's why I wrote "if you have implemented the `equals()` method". It should just be another option to be considered. – Samuel Philipp Mar 26 '19 at 21:35
  • not sure if we are on the same page here. What I wanted to say is that if the sequence of comments is A, A, B, B, A, A, B, B... the question code would result in A, B, A,B whereas the distinct would results in a A, B iff the equals is properly implemented. Iff equals is simply Object.equals it would return the original sequence. In neither case that is what the question code would return. – Valentin Ruano Mar 26 '19 at 21:41
  • I think that what he wants could be achieve using Stream.reduce although the docs it says that is not guaranted taht the execution is sequencial... more concretelly when two list are combined by the reduce... are these list composed of consecutive comments , i.e. is respecting the original Stream order, if not reduce won't work. Perhaps is just something that needs to be try out. Certainly it won't work on a parallel stream but perhaps it does in a sequencial stream, in practice. – Valentin Ruano Mar 26 '19 at 21:44
  • In the worse case scenario, the original question code can be improved ... it is far to complex for what it achieves, making difficult to interpret it. – Valentin Ruano Mar 26 '19 at 21:46
  • Yes, your explanation for the equals method is correct. – Samuel Philipp Mar 26 '19 at 21:47
  • Using `reduce()` to combine two lists is never a good idea. Use `collect()` instead. (See [Java 8 Streams - collect vs reduce](https://stackoverflow.com/q/22577197/9662601)) – Samuel Philipp Mar 26 '19 at 21:49
  • Aside performance issues, distict + collect simply don't return the desired result if the intention is to remove consecutive repeats only. – Valentin Ruano Mar 26 '19 at 21:53
  • Thanks for the tip, In fact one can achieve the desired effect using collect with a custom collider that remove the consecutive duplicates. I posted an answer explaining how. – Valentin Ruano Mar 27 '19 at 06:34
1

If I'm understanding the logic in the question code you want to remove consecutive repeated comments but keep duplicates if there is some different comment in between in the input list.

In this case a simply using .distinct() (and once equals and hashCode) has been properly defined, won't work as intended as non-consecutive duplicates will be eliminated as well.

The more "streamy" solution here is to use a custom Collector that when folding elements into the accumulator removes the consecutive duplicates only.

static final Collector<Comment, List<Comment>, List<Comment>> COMMENT_COLLECTOR = Collector.of(
   ArrayDeque::new, //// supplier.
   (list, comment) -> {  /// folder
        if (list.isEmpty() || !Objects.equals(list.getLast().getComment(), comment.getComment()) {
           list.addLast(comment);
        }
   }),
   (list1, list2) -> { /// the combiner. we discard list2 first element if identical to last on list1.
      if (list1.isEmpty()) {
         return list2;
      } else {
         if (!list2.isEmpty()) {
             if (!Objects.equals(list1.getLast().getComment(), 
                                 list2.getFirst().getComment()) {
                list1.addAll(list2);
             } else { 
                list1.addAll(list2.subList(1, list2.size());
             }
         }
         return list1;
      } 
   });  

Notice that Deque (in java.util.*) is an extended type of List that have convenient operations to access the first and last element of the list. ArrayDeque is the nacked array based implementation (equivalent to ArrayList to List).

By default the collector will always receive the elements in the input stream order so this must work. I know it is not much less code but it is as good as it gets. If you define a Comment comparator static method that can handle null elements or comment with grace you can make it a bit more compact:

static boolean sameComment(final Comment a, final Comment b) {
   if (a == b) {
      return true;
   } else if (a == null || b == null) {
      return false;
   } else {
      Objects.equals(a.getComment(), b.getComment());
   }
}

static final Collector<Comment, List<Comment>, List<Comment>> COMMENT_COLLECTOR = Collector.of(
   ArrayDeque::new, //// supplier.
   (list, comment) -> {  /// folder
        if (!sameComment(list.peekLast(), comment) {
           list.addLast(comment);
        }
   }),
   (list1, list2) -> { /// the combiner. we discard list2 first element if identical to last on list1.
      if (list1.isEmpty()) {
         return list2;
      } else {
         if (!sameComment(list1.peekLast(), list2.peekFirst()) {
            list1.addAll(list2);
         } else { 
            list1.addAll(list2.subList(1, list2.size());
         }
         return list1;
      } 
   });  


----------


Perhaps you would prefer to declare a proper (named) class that implements the Collector to make it more clear and avoid the definition of lambdas for each Collector action. or at least implement the lambdas passed to Collector.of by static methods to improve readability.

Now the code to do the actual work is rather trivial:

List<Comment> unique = dbCommentHistory.stream()
        .collect(COMMENT_COLLECTOR);

That is it. However if it may become a bit more involved if you want to handle null comments (element) instances. The code above already handles the comment's string being null by considering it equals to another null string:

List<Comment> unique = dbCommentHistory.stream()
        .filter(Objects::nonNull)
        .collect(COMMENT_COLLECTOR);
Valentin Ruano
  • 2,726
  • 19
  • 29
0

Your code can be simplified a bit. Notice that this solution does not use stream/lambdas but it seems to be the most succinct option:

List<Comment> toRet = new ArrayList<>(dbCommentHistory.size());
Comment last = null;
for (final Comment ele : dbCommentHistory) {
   if (ele != null && (last == null || !Objects.equals(last.getComment(), ele.getComment()))) {
      toRet.add(last = ele);
   }
}

The outcome is not exactly the same as the question code as in the latter null elements might be added to the toRet but it seems to me that you actually may want to remove the completely instead. Is easy to modify the code (make it a bit longer) to get the same output though.

If you insist in using a .forEach that would not be that difficult, in that case last whould need to be calculated at the beggining of the lambda. In this case you may want to use a ArrayDeque so that you can coveniently use peekLast:

Deque<Comment> toRet = new ArrayDeque<>(dbCommentHistory.size());
dbCommentHistory.forEach( ele -> {
   if (ele != null) {
     final Comment last = toRet.peekLast();
     if (last == null || !Objects.equals(last.getComment(), ele.getComment())) {
       toRet.addLast(ele);
     } 
   }
});
Valentin Ruano
  • 2,726
  • 19
  • 29