The task: given two strings s1 and s2 of equal length, check whether s2 is a substring of s1 + s1 (i.e., whether s2 is a rotation of s1).
For example, [s1, s2] = ["abc", "bca"] should return true, while [s1, s2] = ["abc", "bac"] should return false.
The length of both strings is up to 10^5.
Using (s1+s1).find(s2) != string::npos
took about 0.1 seconds to finish.
I implemented it with a naive approach of complexity O(n*m) and it took 30 seconds!
s1 = s1 + s1;
bool f = false;
for (size_t i = 0; i + s2.size() <= s1.size(); i++) {
    f = true;
    for (size_t j = 0; j < s2.size(); j++) {
        if (s1[i + j] != s2[j]) {
            f = false;
            break;
        }
    }
    if (f)
        break;
}
// the answer is f
So I thought that C++ used an algorithm like KMP, but then I found out that it is implementation-defined, and GNU GCC (libstdc++) uses only the naive approach with some improvements.
But that's not even the biggest problem.
When I replaced the inner loop with s1.compare(i, s2.size(), s2), it took about the same time as the STL find method: 0.1 seconds.
bool f = false;
for (size_t i = 0; i + s2.size() <= s1.size(); i++) {
    if (s1.compare(i, s2.size(), s2) == 0) {
        f = true;
        break;
    }
}
So do C++ standard libraries implement compare in a different way?
And what are the improvements in the method they use that let it surpass the naive approach by a factor of 300 while having the same complexity?
NOTE: The test I used for the previous codes is
s1 = "ab"*50000 + "ca"
s2 = "ab"*50000 + "ac"