70

I have two lists of arrays.

How do I easily compare equality of these with Java 8 and its features, without using external libraries? I am looking for a "better" (higher-level, shorter, more efficient) solution than brute-force code like this (untested code, may contain typos etc, not the point of the question):

boolean compare(List<String[]> list1, List<String[]> list2) 
{
    // tests for nulls etc omitted
    if(list1.size() != list2.size()) {
       return false;
    }
    for(i=0; i<list1.size(); ++i) {
        if(!Arrays.equals(list1.get(i), list2.get(i))) {
            return false;
        }
    }
    return true;
}

Or, if there isn't any nicer way, that's a valid answer too.

Bonus: If Java 9 offers an even better way what whaterver Java 8 can offer, feel free to mention it as well.

Edit: After looking at the comments, and seeing how this question has become moderately hot, I think the "better" should include first checking lengths of all arrays, before checking array contents, because that has potential to find inequality much quicker, if inner arrays are long.

hyde
  • 60,639
  • 21
  • 115
  • 176

5 Answers5

52

The for loop at least can be streamified, leading to:

return (list1.size()==list2.size() &&
        IntStream.range(0, list1.size())
                 .allMatch(i -> Arrays.equals(list1.get(i), list2.get(i)));
khelwood
  • 55,782
  • 14
  • 81
  • 108
  • 12
    Quite nice solution. Worth adding that it is inefficient for non `RandomAccess` lists. – Jaroslaw Pawlak Feb 15 '16 at 11:52
  • 10
    @cat, in the general case there's no way to compare two collections of n elements any faster than O(n). – ymbirtt Feb 15 '16 at 15:42
  • 17
    @ymbirtt: Actually, there's a way, but it'd require you to write your own collection or extend an existing one and modify it. Just keep a hash code on the collection. Every time an item is added, adjust the hash accordingly (e.g. multiply by a prime and add the item's hash code). Then all you have to do is compare hashes of 2 collections, which is O(1). Since there's a possibility of collision, you might optionally do a full compare only when the hashes match, so it'd be O(n) then but much faster the rest of the time. Also, deleting or editing items in the collection might be complicated. – Darrel Hoffman Feb 15 '16 at 17:43
  • 1
    `zip` is much more natural for such problems and it's a well-known idiom across many different languages - as presented in [hahn's answer](http://stackoverflow.com/a/35408718/2642204). Not to mention performance superiority in some cases (as mentioned by @JaroslawPawlak in the top comment). – BartoszKP Feb 16 '16 at 02:39
  • 8
    @DarrelHoffman For accurate comparison, you *must* compare every element if the hashes match. – user253751 Feb 16 '16 at 08:38
  • 1
    @immibis: I suppose the overall amortized performance depends on how often you're likely to compare matching vs. non-matching collections. Still, with a decent hash function, the odds of two non-matching collections having a matching hash are very small, so the overwhelming majority of non-matches can be determined very quickly. If you expect to get matches very frequently (or you intend to change/delete items regularly, which would probably require an O(n) rehashing), this won't work as well. – Darrel Hoffman Feb 16 '16 at 15:29
  • Beware the overhead of streams and lambdas. This can well be slower and consume more memory than the original solution. – Martin Schröder Feb 16 '16 at 21:52
  • @cat in case the size is different, it is `O(1)` in this solution – Khaled.K Feb 17 '16 at 18:22
  • @immibis no, you don't have to. It's enough to slighly modify both lists - e.g. add one empty element to both lists. If the function is really hash function and hashes are still the same, lists are equal :) One of three axioms of hash functions. – xenteros Aug 04 '17 at 13:13
31

using zip (which originates from lambda b93) function from https://stackoverflow.com/a/23529010/755183, code could look like:

boolean match = a.size() == b.size() && 
                zip(a.stream(), b.stream(), Arrays::deepEquals).
                allMatch(equal -> equal)

update

in order to check size of arrays first and then content this could be a solution to consider

final boolean match = a.size() == b.size() 
                   && zip(a.stream(), b.stream(), (as, bs) -> as.length == bs.length).
                      allMatch(equal -> equal)
                   && zip(a.stream(), b.stream(), Arrays::deepEquals).
                      allMatch(equal -> equal);
Community
  • 1
  • 1
hahn
  • 3,588
  • 20
  • 31
31

1) Solution based on Java 8 streams:

List<List<String>> first = list1.stream().map(Arrays::asList).collect(toList());
List<List<String>> second = list2.stream().map(Arrays::asList).collect(toList());
return first.equals(second);

2) Much simpler solution (works in Java 5+):

return Arrays.deepEquals(list1.toArray(), list2.toArray());

3) Regarding your new requirement (to check the contained String arrays length first), you could write a generic helper method that does equality check for transformed lists:

<T, U> boolean equal(List<T> list1, List<T> list2, Function<T, U> mapper) {
    List<U> first = list1.stream().map(mapper).collect(toList());
    List<U> second = list2.stream().map(mapper).collect(toList());
    return first.equals(second);
}

Then the solution could be:

return equal(list1, list2, s -> s.length)
    && equal(list1, list2, Arrays::asList);
Dragan Bozanovic
  • 23,102
  • 5
  • 43
  • 110
  • 5
    #2 is simpler but also creating a copy of each list… – Holger Feb 15 '16 at 13:49
  • 2
    @Holger True, actually the first solution involves copying the lists also. But the contained String arrays are not copied in any of the two solutions, only references to them are copied. – Dragan Bozanovic Feb 15 '16 at 14:01
  • I edited the question a bit, if you want to check that... I think deepEquals doesn't do that, does it? – hyde Feb 16 '16 at 09:36
  • 2
    @hyde `deepEquals` does check for length. From the javadoc: *Two array references are considered deeply equal if both are null, or if they refer to arrays that contain **the same number of elements** and all corresponding pairs of elements in the two arrays are deeply equal.* – SpaceTrucker Feb 16 '16 at 10:22
  • @SpaceTrucker Yeah, but I mean, it probably does not first check lengths of *all* arrays etc, and only after that start checking the contents of them. It would assume it checks contents of first encountered array before checking the lengths of second. – hyde Feb 16 '16 at 11:20
  • @hyde The source code is accessible so why not check yourself. Some old [`open-jdk` version](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/Arrays.java#Arrays.deepEquals%28java.lang.Object[]%2Cjava.lang.Object[]%29). IMO this have to be the accepted answer especially because of #2. – zloster Feb 21 '16 at 13:59
15

You could use a stream if the lists are random access lists (so that a call to get is fast - generally constant time) leading to:

//checks for null and size before
boolean same = IntStream.range(0, list1.size()).allMatch(i -> Arrays.equals(list1.get(i), list2.get(i)));

However, you might give as parameters some implementations that are not (such as LinkedLists). In this case, the best way is to use the iterator explicitly. Something like:

boolean compare(List<String[]> list1, List<String[]> list2) {

    //checks for null and size

    Iterator<String[]> iteList1 = list1.iterator();
    Iterator<String[]> iteList2 = list2.iterator();

    while(iteList1.hasNext()) {
        if(!Arrays.equals(iteList1.next(), iteList2.next())) {
            return false;
        }
    }
    return true;
}
Alexis C.
  • 91,686
  • 21
  • 171
  • 177
  • The first solution will return true if the second list has more elements. Second solution is inefficient for large lists of different sizes, as it will return false only after iterating over them. – Jaroslaw Pawlak Feb 15 '16 at 11:50
  • @JaroslawPawlak The first solution is assuming that the checks for null references and the size are already done (just replacing the for loop). I edited for the second solution. – Alexis C. Feb 15 '16 at 11:54
  • 1
    Fair enough. In such case we can simplify final return statement in second solution to `return true;` :) – Jaroslaw Pawlak Feb 15 '16 at 19:24
  • @JaroslawPawlak Yes, and we can even omit the `&& iteList2.hasNext()` in the condition – Alexis C. Feb 15 '16 at 19:40
6

You could stream over one list and compare to each element of the other by using an iterator:

Iterator<String[]> it = list1.iterator();
boolean match = list1.size() == list2.size() &&
                list2.stream().allMatch(a -> Arrays.equals(a, it.next()));

Using an iterator instead of the get(index) method on the first list is better because it doesn't matter whether the list is RandomAccess or not.

Note: this only works with a sequential stream. Using a parallel stream will lead to wrong results.


EDIT: As per the question last edit, which indicates it would be better to check the length of every pair of arrays in advance, I think it could be achieved with a slight modification to my previous code:

Iterator<String[]> itLength = list1.iterator();
Iterator<String[]> itContents = list1.iterator();

boolean match = 
        list1.size() == list2.size()
    && 
        list2.stream()
            .allMatch(a -> {
                String[] s = itLength.next();
                return s == null ? a == null :
                       a == null ? s == null :
                       a.length == s.length;
            })
    && 
        list2.stream()
            .allMatch(a -> Arrays.equals(a, itContents.next()));

Here I'm using two iterators and am streaming list2 twice, but I see no other way to check all lengths before checking the contents of the first pair of arrays. Check for lengths is null-safe, while check for contents is delegated to the Arrays.equals(array1, array2) method.

fps
  • 33,623
  • 8
  • 55
  • 110