3

This is not same as "How to check if a string contains s specific substring?" . I found a no of questions like that here but not precisely what i am looking for. I am creating a program at a competitive coding site the problem of which states that we are given a string made of x,y,z and we have to count the number of substrings which contains atleast one of those chars but not all of them.I tried this...

    String text = sc.next();
      int l = text.length();
      int count=0;
     for(int j =1;j<=l;j++)
      {
     for(int i1 =0;i1<j;i1++){
        String g = text.substring(i1,j);
        if(g.contains("xyz")||g.contains("xzy")||g.contains("yzx")||g.contains("yxz")||g.contains("zxy")||g.contains("zyx"))
        ;
        else
        count++;




      }

      }
      System.out.println(count);

And this worked(atleast for 2 test cases). But for the larger test cases my program is violating the time limit. Now i think that is because of the number of matching conditions in the if clause. I would like to know if there is any way by which i can just check if the substring contains 'xyz' in any order instead of checking for every order.Thanks! Any help is appreciated.

P.S- If anything else is responsible for the time limit violation, do mention out !

Kirty Bhushan
  • 147
  • 2
  • 12
  • 1
    The timing problem comes from the fact that you're using a quadratic solutions, with two nested for loops. This type of solution has a O (n^2) time, while a better one is needed. – Iamsomeone Jan 27 '15 at 10:10
  • 1
    Please post some input strings with awaited results – Gren Jan 27 '15 at 10:23
  • The input strings come from input files in which all the cases are tested ! I can't post them right now as i dn hav access to my laptop right now. I will provide that later. As for now, can you provide an alternate to nested for loop ? @lamsomeone & – Kirty Bhushan Jan 27 '15 at 10:26
  • Actually, this solution is probably O(n^3) since the `contains` method is probably an O(n) linear search. – Emil Lundberg Jan 27 '15 at 10:31
  • You are splitting this input into very small pieces (text.substring(i1,j);). Why don't you try your contains method on the full text, that will save you a lot of iterations. – Guillaume Jan 27 '15 at 10:32
  • Sounds like a candidate for CodeGolf: http://codegolf.stackexchange.com/ – slartidan Jan 27 '15 at 10:42
  • I thought it was a simple regexp? pattern `[xyz]+` ? – ha9u63a7 Jan 27 '15 at 11:02

1 Answers1

0

Here is a simple solution. Warning untested code!

String target = ...

String text = sc.next();
if (target.length() == 0) {
    // matched!
} else {
    for (int i = 0; i <= text.length() - target.length(); i++) {
        if ((pos = target.indexOf(text.charAt(i))) >= 0) {
            boolean[] found = new boolean[target.length()];
            found[pos] = true;
            int matchCount = 1;
     outer: for (j = 1; j < target.length(); j++) {
                pos = 0;
                while (true) {
                    pos = target.indexOf(text.charAt(i + j), pos);
                    if (pos == -1) {
                       break outer;
                    } else if (!found[pos]) {
                       found[pos] = true;
                       matchCount++;
                       break;
                    }
                }
            }
            if (matchCount == target.length()) {
                // matched!
            }
        }
    }
}

If you wanted to make this faster, one possibility is to clear and recycle the found array that we are using "mark off" the characters as we match them.

There may be more significant optimizations. However, I don't think that the Boyer-Moore "trick" of skipping a number of characters is going to work here.

UPDATE

Your original solution is O(N factorial(M)) where N is the text length, and M is the target string length.

My solution is O(N M).

The optimal solution involves calculating a running hash by multiplying prime numbers, as described in one of the answers to How to find all permutations of a given word in a given text?. I think it is O(N) on average.

(Hint: the solution I'm referring to as written looks like it is O(M N). However, we should be able to use the inverse multiplication equality:

((ab mod n) . (b-1 mod n)) mod n = a mod n

where a and b are the prime factors and n is 232 or 264 depending on whether we use int or long. Reference: wikipedia.

This will allow is to "multiply in" and then "inverse multiply out" the characters and therefore update the running hash in O(1) operations.)

Community
  • 1
  • 1
Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
  • Sorry but Optimal solution is what i am looking for ! Your solution too contains a nested for. So that too would probably violate the time limit ! – Kirty Bhushan Jan 27 '15 at 10:45
  • @KirtyBhushan - You cannot get an *optimal* solution without 2 for loops here. One loop for going through the entire string and another to check for permutations of substring. It is as simple as that – TheLostMind Jan 27 '15 at 10:48
  • @KirtyBhushan - Feel free to optimize it then :-) – Stephen C Jan 27 '15 at 10:49
  • I get it ! I too tried to get an answer without the nested loop but didn't succeed ! But then, i too can just go thru the solutions posted by others ! I am trying to do whatever on my own ! Anyways , thanks for the help. I will try to find an optimal sol. – Kirty Bhushan Jan 27 '15 at 10:52
  • @KirtyBhushan - One option that can be tried is *backtracking / recursion*. Or reduce your input size using `indexOf` and `lastIndexOf()` . But worst case, well. nothing can be done – TheLostMind Jan 27 '15 at 10:54
  • @TheLostMind - I don't think recursion or backtracking will help ... – Stephen C Jan 27 '15 at 10:59
  • @StephenC - It could help in *generating* permutations which could then be compared or used with `indexOf()` – TheLostMind Jan 27 '15 at 11:00
  • Actually, this is a duplicate question, and the linked question contains a solution that is `O(N)` on average. Clever. Read it. – Stephen C Jan 27 '15 at 11:03
  • @TheLostMind - However generating the permutations is `O(factorial(M))` in time and space, and you still have `O(N M)` work to do to look for a match (assuming you use a `HashSet`) – Stephen C Jan 27 '15 at 11:19
  • @StephenC - agreed. It could be improved upon slightly by using caching and *backtracking* (although many stack frames could actually slow down the process. ).. If we have some idea about the *pattern* of input, then a better algorithm / tweak can be proposed. – TheLostMind Jan 27 '15 at 11:25