0

I am trying to find the number of substrings in a given string. Currently, My code doesn't take account of over lapping strings.

For example

substr = "cde" str = "cdcde"

My code:

public static int ssCount(String str, String substr) {
    int count = 0;
    int strlen = str.length();
    int substrlen = substr.length();
    int numsubstr = 0;
    int substrpointer = 0;

    for (int i = 0; i < strlen; i++) {
        if (str.charAt(i) == substr.charAt(substrpointer)) {
            substrpointer++;
            count++;
        }
        else {
            count = 0;
            substrpointer = 0;
        }
        if (count == substrlen) {
            numsubstr++;
            count = 0;
        }
    }
    return numsubstr;
    }

My attempt:

public static int ssCount(String str, String substr) {
        int count = 0;
        int strlen = str.length();
        int substrlen = substr.length();
        int numsubstr = 0;
        int substrpointer = 0;
        int firstchar = 0;

        for (int i = 0; i < strlen; i++) {
            if (str.charAt(i) == substr.charAt(substrpointer)) {
                substrpointer++;
                count++;
                if (str.charAt(i) == substr.charAt(0)) {
                    firstchar = i;
                }
            }
            else {
                count = 0;
                substrpointer = 0;
                i = firstchar;
            }
            if (count == substrlen) {
                numsubstr++;
                count = 0;
            }
        }
        return numsubstr;
    }

I tried to add a 2nd pointer that will point to the next occurrence of the first character of the substring in order to continue the comparisons from that spot. However I am having trouble because I can run into some infinite loops.

Liondancer
  • 15,721
  • 51
  • 149
  • 255

2 Answers2

2

This finds all overlapping sub-strings in a larger string. The non-regex way followed by the regex way. An interesting problem.

import  java.util.regex.Pattern;
import  java.util.regex.Matcher;

 /**
    <P>{@code java OverlappingSubstringsXmpl}</P>
  **/
 public class OverlappingSubstringsXmpl  {
    public static final void main(String[] igno_red)  {
      String sToFind = "cdc";
      String sToSearch = "cdcdcdedcdc";

      System.out.println("Non regex way:");

         int iMinIdx = 0;
         while(iMinIdx <= (sToSearch.length() - sToFind.length()))  {
            int iIdxFound = sToSearch.indexOf(sToFind, iMinIdx);

            if(iIdxFound == -1)  {
               break;
            }

             System.out.println(sToFind + " found at index " + iIdxFound);

            iMinIdx = iIdxFound + 1;
         }

      System.out.println("Regex way:");

         Matcher m = Pattern.compile(sToFind, Pattern.LITERAL).matcher(sToSearch);
         boolean bFound = m.find();
         while (bFound) {
            System.out.println(sToFind + " found at index " + m.start());
            bFound = m.find(m.start() + 1);
         }
   }
}

Output:

[C:\java_code\]java OverlappingSubstringsXmpl
Non regex way:
cdc found at index 0
cdc found at index 2
cdc found at index 8
Regex way:
cdc found at index 0
cdc found at index 2
cdc found at index 8
aliteralmind
  • 19,847
  • 17
  • 77
  • 108
1

Not sure what your question is, probably how to fix your code, but my recommendation is to look into standard approaches to solving this problem, such as the KMP algorithm. It efficiently takes into account overlaps too.

Martin Dinov
  • 8,757
  • 3
  • 29
  • 41