I need to compute the longest common substrings from a set of filenames in C++.
Precisely, I have an std::list of std::strings (or the QT equivalent, also fine)
char const *x[] = {"FirstFileWord.xls", "SecondFileBlue.xls", "ThirdFileWhite.xls", "ForthFileGreen.xls"};
std::list<std::string> files(x, x + sizeof(x) / sizeof(*x));
I need to compute the n distinct longest common substrings of all strings, in this case e.g. for n=2
"File" and ".xls"
If I could compute the longest common subsequence, I could cut it out it and run the algorithm again to get the second longest, so essentially this boils down to:
Is there a (reference?) implementation for computing the LCS of a std::list of std::strings?
This is not a good answer but a dirty solution that I have - brute force on a QList of QUrls from which only the part after the last "/" is taken. I'd love to replace this with "proper" code.
(I have discovered http://www.icir.org/christian/libstree/ - which would help greatly, but I can't get it to compile on my machine. Someone used this maybe?)
QString SubstringMatching::getMatchPattern(QList<QUrl> urls)
{
QString a;
int foundPosition = -1;
int foundLength = -1;
for (int i=urls.first().toString().lastIndexOf("/")+1; i<urls.first().toString().length(); i++)
{
bool hit=true;
int xj;
for (int j=0; j<urls.first().toString().length()-i+1; j++ ) // try to match from position i up to the end of the string :: test character at pos. (i+j)
{
if (!hit) break;
QString firstString = urls.first().toString().right( urls.first().toString().length()-i ).left( j ); // this needs to match all k strings
//qDebug() << "SEARCH " << firstString;
for (int k=1; k<urls.length(); k++) // test all other strings, k = test string number
{
if (!hit) break;
//qDebug() << " IN " << urls.at(k).toString().right(urls.at(k).toString().length() - urls.at(k).toString().lastIndexOf("/")+1);
//qDebug() << " RES " << urls.at(k).toString().indexOf(firstString, urls.at(k).toString().lastIndexOf("/")+1);
if (urls.at(k).toString().indexOf(firstString, urls.at(k).toString().lastIndexOf("/")+1)<0) {
xj = j;
//qDebug() << "HIT LENGTH " << xj-1 << " : " << firstString;
hit = false;
}
}
}
if (hit) xj = urls.first().toString().length()-i+1; // hit up to the end of the string
if ((xj-2)>foundLength) // have longer match than existing, j=1 is match length
{
foundPosition = i; // at the current position
foundLength = xj-1;
//qDebug() << "Found at " << i << " length " << foundLength;
}
}
a = urls.first().toString().right( urls.first().toString().length()-foundPosition ).left( foundLength );
//qDebug() << a;
return a;
}