4

I'm using this piece of Java code to find similar strings:

if( str1.indexOf(str2) >= 0 || str2.indexOf(str1) >= 0 ) .......

But with str1 = "pizzabase" and str2 = "namedpizzaowl", it doesn't work.

How do I find the common substrings, i.e. "pizza"?

  • Do you want to find all of the common substrings, or only the [longest common substring](https://stackoverflow.com/questions/17150311/java-implementation-for-longest-common-substring-of-n-strings)? – Anderson Green Apr 17 '22 at 20:22

2 Answers

2

Iterate over each letter in str1, checking for its existence in str2. If it doesn't exist, move on to the next letter. If it does, extend the substring of str1 that you look for in str2 to two characters, and keep extending it until it no longer matches or you reach the end of str1. Then move on to the next letter and start again with a single character.

This finds the longest shared substring starting at each position of str1 (every other shared substring is a prefix of one of these). Like bubble sort, it is hardly optimal, but it is a very basic example of how to solve the problem.

Something like this pseudo-ish example:

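pos = 0
len = 1
matches = []

while (pos < str1.length()) {

    // grow the candidate while it still fits in str1 and occurs in str2
    // (indexOf returns -1 when there is no match, so compare against 0)
    while (pos + len <= str1.length() && str2.indexOf(str1.substring(pos, pos + len)) >= 0) {
        len++
    }

    // len has overshot by one; keep the last candidate that matched, if any
    if (len > 1) {
        matches.push(str1.substring(pos, pos + len - 1))
    }
    pos++
    len = 1
}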
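
For anyone who wants something that compiles, here is a minimal Java sketch of the pseudocode above. The class and method names (`CommonSubstrings`, `commonSubstrings`, `longest`) are made up for illustration, and it uses `String.contains` instead of `indexOf(...) >= 0`, which is equivalent here. It is a sketch of the same brute-force idea, not an optimized solution.

```java
import java.util.ArrayList;
import java.util.List;

class CommonSubstrings {

    // For each start position in str1, collect the longest substring
    // beginning there that also occurs somewhere in str2.
    static List<String> commonSubstrings(String str1, String str2) {
        List<String> matches = new ArrayList<>();
        for (int pos = 0; pos < str1.length(); pos++) {
            int end = pos; // exclusive end of the longest match found so far
            while (end < str1.length() && str2.contains(str1.substring(pos, end + 1))) {
                end++;
            }
            if (end > pos) { // at least one character matched
                matches.add(str1.substring(pos, end));
            }
        }
        return matches;
    }

    // Returns the first longest entry, or "" if the list is empty.
    static String longest(List<String> matches) {
        String best = "";
        for (String m : matches) {
            if (m.length() > best.length()) {
                best = m;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> matches = commonSubstrings("pizzabase", "namedpizzaowl");
        System.out.println(matches);          // [pizza, izza, zza, za, a, a, e]
        System.out.println(longest(matches)); // pizza
    }
}
```

Note that single characters such as "a" and "e" show up as matches too; if those are not useful, a minimum length check before `matches.add` would filter them out.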
nikc.org
  • @fateme: my provided code is not valid Java code, it is a pseudo code example that you need to understand and then implement. – nikc.org Oct 28 '10 at 09:24
  • @fateme: I had mistakenly left out resetting the `len` variable in my example. It's now in there. – nikc.org Oct 29 '10 at 06:45
0

If your algorithm says two strings are similar whenever they contain a common substring, then it will always return true: the empty string "" is trivially a substring of every string. It also makes more sense to determine the degree of similarity between strings and return a number rather than a boolean.
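
A tiny sketch of the empty-string point, in case it helps. The class and method names (`EmptySubstringDemo`, `containsEmpty`) are hypothetical; the only library calls are `String.contains` and `String.indexOf`.

```java
class EmptySubstringDemo {

    // The empty string is found in any string, including another empty string.
    static boolean containsEmpty(String s) {
        return s.contains("");
    }

    public static void main(String[] args) {
        System.out.println(containsEmpty("pizzabase"));     // true
        System.out.println(containsEmpty("namedpizzaowl")); // true
        System.out.println("pizzabase".indexOf(""));        // 0
    }
}
```

So a check of the form "do these strings share any substring?" only becomes meaningful once a minimum length is required.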

This is a good algorithm for determining string (or more generally, sequence) similarity: http://en.wikipedia.org/wiki/Levenshtein_distance.
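
As a rough illustration, a minimal Java sketch of Levenshtein distance using the standard two-row dynamic programming table could look like this. The class and method names (`Levenshtein`, `distance`) are chosen for the example; a lower result means the strings are more similar, and 0 means they are equal.

```java
class Levenshtein {

    // Minimum number of single-character insertions, deletions and
    // substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1]; // distances for the previous row of a
        int[] curr = new int[b.length() + 1]; // distances for the current row of a

        // Turning the empty prefix of a into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two arrays: the current row becomes the previous one.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3
        System.out.println(distance("pizzabase", "namedpizzaowl"));
    }
}
```

To compare strings of different lengths on a common scale, one option is to divide the distance by the length of the longer string.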

Tom Crockett