2

I am trying to make a very concise way (using streams, of course) to determine if two members of a collection (a list) are equal. But not equal in the sense of the method equals() but using a different method for comparing.

Basically, I have a list of timeslots, and these define an order. You can say if a timeslot precedes, passes or is equal to another using the Timeslot#compareTo(Timeslot) method. I need these timeslots to be in order, that is, every one of them precedes the following in the list, but I also need each one of them to be strictly greater than the next one. They cannot be equal (you cannot have timeslot1=10am and then timeslot2=10am).

I got the ordering covered, as easy as this:

Collections.sort(timeslots, Timeslot::compareTo);

However, I would like to be very concise when determining if every timeslot strictly precedes the following or not. Of course, I could do this the traditional way as follows:

for (int i = 0; i < timeslots.size() - 1; i++)
    if (timeslots.get(i).compareTo(timeslots.get(i + 1)) == 0)
        throw new IllegalArgumentException("Every timeslot must strictly precede the following");

Maybe I am asking too much of streams, and there is really no better and equally efficient way of achieving this, but I would be glad to hear out your ideas.

Alexis C.
  • 91,686
  • 21
  • 171
  • 177
dabadaba
  • 9,064
  • 21
  • 85
  • 155
  • 1
    Side note: assuming `Timeslot` implements `Comparable`, you should be able to do `Collections.sort(timeslots);`. – shmosel Jun 14 '16 at 20:48
  • If equals can be hashed. You want to do groupby hashing function and then see if there is at least one reduction happening. – Alexander Oh Jun 14 '16 at 20:50
  • I would prefer if I didn't have to override `equals`. – dabadaba Jun 14 '16 at 20:50
  • 4
    Put them in a TreeSet and check whether the TreeSet’s size matches the original List’s size. – VGR Jun 14 '16 at 20:53
  • Why are you only checking if they're equal? What if they're in the wrong order? – shmosel Jun 14 '16 at 20:54
  • @shmosel that's why I am reordering them first... it's the first line of code I typed. – dabadaba Jun 14 '16 at 21:18
  • 1
    Shouldn't `timeslots = new ArrayList<>(new TreeSet<>(timeslots));` do the trick of sorting **and** removing duplicates? (If not, then your `compareTo` method is likely broken and not consistent with `equals`...) – Marco13 Jun 14 '16 at 22:16
  • I see some people suggesting using `TreeSet`, why? Could it just be a `HashSet`? – dabadaba Jun 14 '16 at 22:18
  • @Marco13 I am using `compareTo` to define an order on the timeslots domain, which would be different than comparing the timeslots in terms of equality (method `equals` which I haven't implemented nor I plan to) – dabadaba Jun 14 '16 at 22:19
  • Then see the [documentation of `Comparable`](https://docs.oracle.com/javase/8/docs/api/java/lang/Comparable.html): *"It is strongly recommended (though not required) that natural orderings be consistent with equals."* - This means that it is **strongly** recommended to make sure that `a.compareTo(b) == 0` is true if and only if `a.equals(b)`. (And if you did this, the line that I posted would solve your problem, by the way). But of course, you *can* solve it differently, if there are compelling reasons to do so... – Marco13 Jun 15 '16 at 10:46
  • @Marco13: the `TreeSet` will do the desired thing, even if the order is not consistent with `equals`. Otherwise, things like `new TreeSet(String.CASE_INSENSTIVE_ORDER)` wouldn’t work. Of course, such a set wouldn’t play together with, e.g. a `HashSet`, when it comes to set equality, but there is no reason to incorporate a `HashSet` here… I made [an answer](http://stackoverflow.com/a/37834260/2711488) containing that solution (before reading your comment, by the way). – Holger Jun 15 '16 at 11:36
  • @Holger It may be worth mentioning that the fact that it would work with a `TreeSet` in this particular case, although the set would no longer obey the `Set` contract. – Marco13 Jun 15 '16 at 12:52
  • My answer below eliminates the need to override equals, removes the need to pre-sort, and returns a stream for future necessary operations. Since it operates from a Stream, it also means that the original data can also be provided ... as a Stream. And it is simple and concise thanks to the Stream::sorted method. Give it a try! – The Coordinator Jun 16 '16 at 01:32

4 Answers4

4

Maybe you are thinking too complicated by trying to find a Stream API solution. The task can be easily solved using basic Collection API operations:

List<Timeslot> timeslots;// the source

TreeSet<Timeslot> sorted=new TreeSet<>(timeslots);
if(sorted.size() != timeslots.size())
  throw new IllegalArgumentException("Every timeslot must strictly precede the following");
// get a list where every timeslot strictly precedes the following:
timeslots=new ArrayList<>(sorted);

The equality of the TreeSet’s elements is only determined by the order. So if there are duplicates according to the order, and this is what your problem is all about, they are removed when converting the list to the set. So if the size differs, there were such duplicates. If not, you can convert the sorted set back to a list, if you need a list. Otherwise, you can just continue using the set instead of the list.


It should be mentioned, since you stated to have a ordering which is not consistent with equals, that the TreeSet will also not be fully compliant with the Set contract as stated in its documentation:

Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. […] This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.

Since your operation does not rely on the general contract of the Set interface, as there was no Set before, this is not a problem for this specific use case. Converting to a List as shown above works and so would iterating over the elements, in case you use it without conversion, i.e. don’t need specific list operations.

Holger
  • 285,553
  • 42
  • 434
  • 765
  • I actually used the size comparison to check for duplicated elements in a list, wrapping it in a `HashSet` of course. But I didn't know `TreeSet` used `compareTo` so I never came up with this solution. Although this is want I am going to use in my project, I am not sure I can accept this answer because it is not related to streams, which is what my question was about. – dabadaba Jun 16 '16 at 08:54
  • @dabadaba: this wouldn’t be the first Stream related question where the correct answer is “Streams do not improve this”. It’s up to you to decide, whether a practical solution or a Stream usage is more important. After all, the Stream API is a tool, not an end in itself… – Holger Jun 16 '16 at 09:01
2

I think sorting the list and comparing adjacent elements is a reasonable approach. Here's the streams-based code that does that:

    timeslots.sort(null);  // sort the list using natural order
    if (IntStream.range(1, timeslots.size())
                 .map(i -> timeslots.get(i-1).compareTo(timeslots.get(i)))
                 .anyMatch(i -> i == 0)) {
        throw new IllegalArgumentException("...");
    }

It's no shorter than what you have, but the streams-based approach might offer more flexibility: it can be run in parallel, you can count the number of duplicate timestamps, etc.

You could also inline the compareTo() call into the anyMatch() call, like so:

    if (IntStream.range(1, timeslots.size())
                 .anyMatch(i -> timeslots.get(i-1).compareTo(timeslots.get(i)) == 0) {

Whether this is preferable is, I think, a matter of style.

Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
  • How does this work? I am specially confused by the `anyMatch` predicate. PS: what's with the `timeslots.sort(null)`? – dabadaba Jun 15 '16 at 18:49
  • @dabadaba The map() call runs compareTo() on each adjacent pair, resulting in a stream of ints which are the results of compareTo(). These values are then fed to anyMatch() which returns true if any value is zero. You could inline the compareTo() call into anyMatch(). Not sure if that's any clearer, though. – Stuart Marks Jun 15 '16 at 21:02
  • @dabadaba List.sort(null) sorts the elements of the list using their "natural order" which means that the elements must implement `Comparable` and thus implement the `compareTo()` method. You can also supply a `Comparator` as the argument instead of null. – Stuart Marks Jun 15 '16 at 21:03
  • I absolutely get it now, I wonder if I will eventually get confident with streams so I can understand them withouth asking :) I actually have no problem understanding using `compareTo` directly inside `anyMatch` instead of taking the feed from the `map` part, and I'd like to say I see this last option as better because it's more concise. In terms of efficiency are they any different? And are the streams solutions to my problem more, less or as efficient than my original, traditional solution? – dabadaba Jun 15 '16 at 21:16
  • @dabadaba I doubt that the different streams formulations have much of an efficiency difference between themselves or with the for-loop. Well, the for-loop is probably cheaper since there is some fixed overhead to creating a stream, but you'd have to work pretty hard to measure the difference. I would say that readability/maintainability are more important here. In that light I factored the first example the way I did because I think of doing the comparison and testing the result of the comparison as separate operations. Inlining everything into `anyMatch` saves a line but complicates the... – Stuart Marks Jun 16 '16 at 18:30
  • ...expression, which in my opinion makes it harder to understand. It's kind of like factoring methods: do you like little methods that perform fine-grained operations, or do you inline them into larger methods that perform coarser-grained operations? Again, mainly a matter of style. – Stuart Marks Jun 16 '16 at 18:32
2

Simple — Just use Stream 'sorted' and make custom comparator.

See https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#sorted-java.util.Comparator-

Something like this:

Timeslots timeslots = ... 

Stream<Timeslots> tstream  = timeslots
     .stream()
     .sorted( (a,b) -> {
          int compare = a.compareTo(b);
          if(compare==0){
               throw new IllegalArgumentException("Every timeslot must strictly precede the following");
          }
          return compare;
     });

And yes ... you need a terminal operation for the Stream to actually run. I assume that there is a next operation or something you feed the Timeslots to.

With this method you get a Stream result and there is no need to pre-sort the Timeslots beforehand.

Give it a try!

The Coordinator
  • 13,007
  • 11
  • 44
  • 73
  • Without a terminal operation, this Stream will do nothing. Besides that, it’s a bet on implementation details. There is no guaranty that the underlying sort algorithm will never compare an element to itself. – Holger Jun 16 '16 at 08:48
  • The OP used `compare` so that is why I did. It would be hard to sort based on 'equals' and I assume that timestamps are sortable. Of course, you need to provide this to something else or make it a terminal operation. But that is the beauty of streams ... you don't need to work on it ... until you have to work on it. – The Coordinator Jun 16 '16 at 22:11
  • I don’t understand your comment. I didn’t say anything about `equals`. – Holger Jun 17 '16 at 07:33
0

I'm not sure this is much more concise than your version, but here's how you could do it with streams:

timeslots.stream().reduce((a, b) -> {
    if (a.compareTo(b) == 0) {
        throw new IllegalArgumentException("Every timeslot must strictly precede the following");
    }
    return b;
});

Another solution, inspired by @VGR's comment:

timeslots.stream().collect(Collectors.toMap(Function.identity(), Function.identity(), (a, b) -> {
    throw new IllegalArgumentException("Every timeslot must strictly precede the following");
}, () -> new TreeMap<>(Timeslot::compareTo)));
shmosel
  • 49,289
  • 6
  • 73
  • 138
  • Can you guarantee that throws the right exception? I've heard reports of that not working so well before. – Makoto Jun 14 '16 at 21:33
  • @Makoto I'm not sure what reports you're referring to, or how my guarantees would help, but as far as I understand (and from my testing), these should work fine. – shmosel Jun 14 '16 at 21:36
  • I was likely thinking of the [checked exception debacle](http://stackoverflow.com/q/27644361/1079354), but you're likely right since that is an unchecked exception. – Makoto Jun 14 '16 at 21:39
  • This `reduce` based solution only works with a strict left-to-right evaluation, which isn’t guaranteed. E.g. trying `Stream.of(0, 1, 0, 5).parallel().reduce((a, b) -> /* your function */);` will not notice that the elements aren’t ordered, as soon as you have more than one CPU core… – Holger Jun 15 '16 at 11:55
  • @Holger I was waiting for someone to call me out on that. My assumption was that it wasn't illegal, just frowned upon. Perhaps I was wrong. An alternative might be `Collectors.reducing()`, since collectors definitely have a concept of encounter order (as long as the UNORDERED characteristic is not specified, which is the case here, although I'm not sure why that is, and it's not explicitly documented). – shmosel Jun 15 '16 at 17:07
  • There is no difference between `Stream.reduce` and `Collectors.reducing`. The point is that the encounter order *is* respected, but the reduction function is required to be *associative*. That implies, the reduction function must guaranty that `f( f(a,b), c)` has the same result as `f( a, f(b,c) )` (*which isn’t the case here*), but the Stream will never evaluate `f(b, a)` nor `f(c, a)` or such alike for ordered streams. To name an example, `String::concat` would be a reduction function which relies on the encounter order, but is associative, therefore correct and works with parallel streams. – Holger Jun 15 '16 at 17:21
  • @Holger Would a sequential stream (as in my solution) ever actually evaluate `f(a,f(b,c))`, or is your issue just in principal? – shmosel Jun 15 '16 at 17:32
  • There is no practical reason for a sequential stream to evaluate `f(a,f(b,c))` (yet) and it’s hard to imagine such a scenario. Still, there is no guaranty that it never will and using a non-associative function is an incorrect API usage. In other words, it will work with the current implementation and it’s tempting to assume that it always will (as long as no-one ever accidentally switches to parallel). But I remember, before the introduction of TimSort in Java 7’s implementation of `Collections.sort` and `Arrays.sort`, `Comparator`s violating the contract happened to work too… – Holger Jun 15 '16 at 17:51