I've been wondering about worst-case time complexity of telling if two unordered arrays consist of same elements. Elements could be in any type. Numbers, strings, custom objects... etc, but let's assume that elements are both sortable and hashable.
I've thought of three methods, which are well-explained in this stackoverflow post. Which are 1) use hash 2) use sorting 3) just loop through.
The post said that it is possible to achieve worst-time O(n)
if data is hashable, however, I think that is not quite right since inserting and searching in Hash is not an worst-case O(1)
operation. It's O(1)
on average, if no collision occurs, but it's O(n)
in both inserting and searching (in theory). So If a lot of collision happens, using hash to tell two arrays are equal will cost O(n^2)
. (please correct me if I'm wrong.)
So it seems to me that telling two arrays are equal will cost as much as sorting the arrays, which, without any knowledge about the array, would cost O(nlogn)
. (with assumption that comparing two elements equal will always cost O(1)
)
Is it possible to tell two arrays are equal in worst-case O(n)? I'll appreciate any comments, duplicate flags, reference to a paper. Thanks!
Here's my code for comparing two arrays are equal. (It's in ruby and working, but please see it more like a pseudo code)
One. compare by hashing - on average, O(n)
, worst-case, O(n^2)
def compare_by_hashing(list1, list2)
hash1 = {}
list1.each do |item|
hash1[item] ||= 0
hash1[item] += 1
end
hash2 = {}
list2.each do |item|
hash2[item] ||= 0
hash2[item] += 1
end
hash1.each do |key, hash_1_value|
return false if hash_1_value != hash2[key]
end
return true
end
Two. compare by sorting. Worst-case O(nlogn)
# 2. compare by sorting. Worst-case `O(nlogn)`
def compare_by_sorting(list1, list2)
list1.sort
list2.sort
list1.each_with_index do |list_1_item, index|
return false if list_1_item != list2[index]
end
return true
end
Three. Compare by just looping through. Worst case O(n^2)
def compare_by_looping(list1, list2)
list1.each do |item|
if list2.include? item
list2.delete item
else
return false
end
end
return true
end
Edit
I appreciate and understand answers and comments that hash operations would normally show O(1) time complexity and worst-case scenarios are very unlikely to happen. However, since they can anyways happen, I do not want to ignore the possibilities. I apologize for not making my point clear. My first intention was to find theoretically proven O(n) algorithm, not practical algorithm. Thanks for your attention. I really appreciate it.