I'm looking for a way to find the longest duplicate substring in a group of strings. The longest duplicate substring problem is usually stated for a single string rather than for a group of strings. What type of algorithm would be useful for finding the longest substring that appears in more than one string of a group?
The main use case I have in mind is finding the longest duplicate substring in a group of files (in order to remove duplicate code in large software libraries), but there would be many other use cases for this algorithm as well.
For example, I'd want to find the longest duplicate substring in this group of strings:
"Hello world, this is the first string."
"Hello to the world, this is the second string."
"Hello world. This is the third string."
"This is the third string."
In this case, "This is the third string." would be the longest repeated string (i.e., the longest string that appears in more than one of these strings).
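To make the expected behavior concrete, here is a minimal sketch of what I'm after. It assumes that a "duplicate" must occur in at least two different strings (repeats inside a single string don't count), and the function name `longest_shared_substring` is just something I made up for illustration. It binary-searches on the substring length and uses a dictionary of substrings for each candidate length, so it is probably too slow for large code bases; something like a generalized suffix tree or suffix array would presumably scale better.

```python
def longest_shared_substring(strings):
    """Return the longest substring that occurs in at least two
    different strings of the group ("" if there is none)."""

    def shared_of_length(k):
        # Map each length-k substring to the index of the first
        # string in which it was seen.
        seen = {}
        for idx, s in enumerate(strings):
            for i in range(len(s) - k + 1):
                sub = s[i:i + k]
                owner = seen.setdefault(sub, idx)
                if owner != idx:
                    # Same substring found in a different string.
                    return sub
        return None

    # If a substring of length k is shared, so is its prefix of
    # length k - 1, which is what makes binary search valid here.
    lo, hi = 0, max((len(s) for s in strings), default=0)
    best = ""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        found = shared_of_length(mid)
        if found is not None:
            best, lo = found, mid
        else:
            hi = mid - 1
    return best


group = [
    "Hello world, this is the first string.",
    "Hello to the world, this is the second string.",
    "Hello world. This is the third string.",
    "This is the third string.",
]
print(longest_shared_substring(group))  # This is the third string.
```

The comparison is case-sensitive and character-based, which is why "This is the third string." wins here over the lowercase "this is the" overlap between the first two strings.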