I am a bit confused, to be honest. I was working out the one of the classical algorithm problems. Given a collection of integers, find if are there 2 elements summing to a given number.
So I have implemented 2 solutions.
bool find1(std::vector<int>& V, int sum)
{
std::unordered_set<int> hashTable;
for (int i = 0; i < V.size(); ++i)
{
if (hashTable.find(V[i]) != hashTable.end())
{
return true;
}
hashTable.insert(sum - V[i]);
}
return false;
}
bool find2(std::vector<int>& V, int sum)
{
for (int i = 0; i < V.size() ; ++i)
{
if (std::binary_search(V.begin(), V.end(), sum - V[i]))
{
return true;
}
}
return false;
}
Find1 is expected to be a linear algorithm (depending on the load of buckets and efficiency of hashing function).
Find2 is expected to be NlogN, we loop and do a binary search for every iteration.
After implementing this function I tried to test the running times of these algos on a relatively big collection, and the result confused me..
int main()
{
std::vector<int> V(10000,0);
std::chrono::system_clock::time_point now1 = std::chrono::system_clock::now();
for (int i = 0; i < 100; ++i)
{
bool b = find1(V, 1000);
}
std::chrono::system_clock::time_point then1 = std::chrono::system_clock::now();
std::cout <<"Linear with hashing = "<< std::chrono::duration_cast<std::chrono::microseconds>(then1 - now1).count()<<std::endl;
std::chrono::system_clock::time_point now2 = std::chrono::system_clock::now();
std::sort(V.begin(), V.end());
for (int i = 0; i < 100; ++i)
{
bool b = find2(V, 1000);
}
std::chrono::system_clock::time_point then2 = std::chrono::system_clock::now();
std::cout <<"NlogN with binary_search = " <<std::chrono::duration_cast<std::chrono::microseconds>(then2 - now2).count() << std::endl;
system("pause");
}
Here I initialize the vector
with 0s to be sure that both algos run the worst case.
The output of the program is:
Linear with hashing = 6759245
NlogN with binary_search = 4508025
How this is possible? Can anyone please explain this to me?