1

Obviously, one could just use a for loop that checks every element of both arays, but I would like my program to be fast, so I would like to have a faster solution.

For example:
Array a: [1,3,2,10,12]
Array b: [1,3,2,10,12]
The arrays are identical

Whereas
Array c: [1,3,2,10,12]
Array d: [2,1,3,12,10]
The arrays are NOT identical

Does the '=' operator do the trick or is it harder?

François Andrieux
  • 28,148
  • 6
  • 56
  • 87
StLuke5
  • 130
  • 1
  • 10
  • Do i understand correctly that you would like to compare two completely unsorted arrays for equality of all elements without looking at every element? – Philipp Lenk Oct 04 '19 at 19:03
  • @Phillip Lenk Yes, that is what I mean. I have added examples to clarify everything – StLuke5 Oct 04 '19 at 19:04
  • 2
    In that case you ask for the impossible(independent of language). Regardless of the order you compare elements in, they could always differ in one you have not looked at – Philipp Lenk Oct 04 '19 at 19:05
  • 3
    @StLuke5 If you have no information on the structure of the data (for example, if it's sorted or otherwise organized) than mathematically it's not possible to have a worst case better than looking at every byte. If you don't know anything about the data, it's always possible the difference lies in the last byte to would be compared. – François Andrieux Oct 04 '19 at 19:05
  • @PhilippLenk Thanks. Just one more question: Why dosen't the '=' operator work? (Do you know) – StLuke5 Oct 04 '19 at 19:07
  • 2
    @StLuke5 That question is a duplicate of [Comparing arrays for equality in C++](https://stackoverflow.com/questions/12866413/comparing-arrays-for-equality-in-c). – François Andrieux Oct 04 '19 at 19:08
  • You have to visit every element of the array to do this. That means you can't do it in less than O(N). Just do the naive check and you'll be fine. – NathanOliver Oct 04 '19 at 19:12
  • 1
    @StLuke5 the `=` operator is for assignment, not equality. – sweenish Oct 04 '19 at 19:13
  • @StLuke5 the trivial loop based comparison has O(1) average complexity - unless these arrays tend to have long common prefix. It is difficult to improve. Creating hash values allows to compare O(1) on average without this extra assumption. – ALX23z Oct 04 '19 at 22:09

1 Answers1

3

It's not possible to ensure the equality of two arrays in less than O(N) time

Certainly, most algorithms are smart enough to "return early" upon realizing that two elements are not equal:

bool is_equal(std::array<int, 10> const& a, std::array<int, 10> const& b) {
    for(size_t index = 0; index < 10; index++)
        if(a[index] != b[index])
            return false; //Return Early
    return true;
}

But in order to verify that the arrays are equal, each individual element must be compared to their corresponding element in the other array. There's no way around this.

You could try to be sneaky, by precomputing hashes for the arrays at the time you create them.

auto[array, hash] = create_array();
auto[array2, hash2] = create_array(opt);

if(hash != hash2) {
    std::cout << "Arrays are not equal in O(1) time!" << std::endl;
} else {
    std::cout << "Arrays *might be* equal in O(1) time!" << std::endl;
}

But note how I've hedged with the equality check. Even if you have a fast way to compute the hash of the array, you can't verify the arrays are equal; you can only verify they are unequal. If the hash is big enough that it is 100% guaranteed to be unique for every possible set of values the array might contain, then comparing those two hashes together will still be proportional to the sizes of the array. This might be feasible if the domain of the array is extremely constrained (like we know the array can only contain a very small subset of known values for each index) but that's only applicable to a very small subset of problems, which probably does not represent your problem domain.

You could still get a speedup with the hash:

auto[array, hash] = create_array();
auto[array2, hash2] = create_array(opt);

if(hash != hash2) {
    std::cout << "Arrays are not equal in O(1) time!" << std::endl;
} else {
    if(array == array2)
        std::cout << "Arrays *are definitely* equal (... but in O(N) time...)" << std::endl;
    else
        std::cout << "Arrays are not equal in O(N) time." << std::endl;
}

But note you still haven't broken O(N) time, you've simply improved your best-case runtime.

So no, no matter how you try to cheat the code, it's not possible to have an accurate equality comparison check between two arrays without having a runtime of O(N).

Community
  • 1
  • 1
Xirema
  • 19,889
  • 4
  • 32
  • 68
  • 1
    Probably worth noting that any useful hash function will itself have O(N) time complexity, so it might help average time if the same arrays need to be compared repeatedly, but won't help time of comparing a brand new or modified array. – aschepler Oct 04 '19 at 19:39
  • Note that while complexity of the check is O(N), average complexity is O(1) for brute force method - at least for sufficiently random array. Example: quick sort has complexity of O(N^2) but typical runtime is O(N log N). – ALX23z Oct 04 '19 at 22:02