6

Ok, this is an interview question. And no it's not a duplicate of this question.

Given 3 strings - str1, str2, str3:

str1 = "spqrstrupvqw"
str2 = "sprt"
str3 = "q"

We've to find the minimum window in str1, which contains all characters from str2 in any order, but no character from str3. In this case the answer would be: "strup".

I've come up with this code:

static String minimumWindow(String str1, String str2, String str3) {

        class Window implements Comparable<Window> {
            int start;
            int end;

            public Window(int start, int end) {
                this.start = start;
                this.end = end;
            }

            public int getEnd() {
                return end;
            }

            public int getStart() {
                return start;
            }

            public int compareTo(Window o) {
                int thisDiff = end - start;
                int thatDiff = o.end - o.start;

                return Integer.compare(thisDiff, thatDiff);
            }

            @Override
            public String toString() {
                return "[" + start + " : " + end + "]";
            }
        }

        // Create Sets of characters for "contains()" check

        Set<Character> str2Chars = new HashSet<>();
        for (char ch: str2.toCharArray()) {
            str2Chars.add(ch);
        }

        Set<Character> str3Chars = new HashSet<>();
        for (char ch: str3.toCharArray()) {
            str3Chars.add(ch);
        }

        // This will store all valid window which doesn't contain characters
        // from str3.
        Set<Window> set = new TreeSet<>();

        int begin = -1;

        // This loops gets each pair of index, such that substring from 
        // [start, end) in each window doesn't contain any characters from str3
        for (int i = 0; i < str1.length(); i++) {
            if (str3Chars.contains(str1.charAt(i))) {
                 set.add(new Window(begin, i));
                 begin = i + 1;
            }
        }

        int minLength = Integer.MAX_VALUE;
        String minString = "";

        // Iterate over the windows to find minimum length string containing all
        // characters from str2
        for (Window window: set) {
            if ((window.getEnd() - 1 - window.getStart()) < str2.length()) {
                continue;
            }

            for (int i = window.getStart(); i < window.getEnd(); i++) {
                if (str2Chars.contains(str1.charAt(i))) {
                      // Got first character in this window that is in str2
                      // Start iterating from end to get last character
                      // [start, end) substring will be the minimum length
                      // string in this window
                     for (int j = window.getEnd() - 1; j > i; j--) {
                        if (str2Chars.contains(str1.charAt(j))) {
                            String s = str1.substring(i, j + 1);

                            Set<Character> sChars = new HashSet<>();
                            for (char ch: s.toCharArray()) {
                                sChars.add(ch);
                            }

                            // If this substring contains all characters from str2, 
                            // then only it is valid window.
                            if (sChars.containsAll(str2Chars)) {
                                int len = sChars.size();
                                if (len < minLength) {
                                    minLength = len;
                                    minString = s;
                                }
                            }
                        }
                    }
                }
            }
        }

    // There are cases when some trailing and leading characters are
    // repeated somewhere in the middle. We don't need to include them in the
    // minLength. 
    // In the given example, the actual string would come as - "rstrup", but we
    // remove the first "r" safely.
        StringBuilder strBuilder = new StringBuilder(minString);

        while (strBuilder.length() > 1 && strBuilder.substring(1).contains("" + strBuilder.charAt(0))) {
            strBuilder.deleteCharAt(0);
        }

        while (strBuilder.length() > 1 && strBuilder.substring(0, strBuilder.length() - 1).contains("" + strBuilder.charAt(strBuilder.length() - 1))) {
            strBuilder.deleteCharAt(strBuilder.length() - 1);
        }

        return strBuilder.toString();
    }

But it doesn't work for all the test cases. It does work for the example given in this question. But when I submitted the code, it failed for 2 test cases. No I don't know the test cases for which it failed.

Even after trying various sample inputs, I couldn't find a test case for which it fails. Can someone take a look as to what is wrong with the code? I would really appreciate if someone can give a better algorithm (Just in pseudo-code). I know this is really not the optimized solution though.

Community
  • 1
  • 1
Rohit Jain
  • 209,639
  • 45
  • 409
  • 525
  • Could there be duplicates? (ie str2=`rr` needs to match `roar` not `r` twice?) – Joachim Isaksson Apr 22 '14 at 20:09
  • You could use a regex to make the windows, and sort the windows so you could use another regex to check for using all of str2. – Scott Hunter Apr 22 '14 at 20:10
  • We can assume that `str2` and `str3` contains unique characters. But `str1` can contain duplicate of course. – Rohit Jain Apr 22 '14 at 20:11
  • Your test case fails when the third string has more than one character at least from the test cases I've tried so far, which are: "sprta", "sprt", "qa" (string index out of bounds exception) and "sqpqrqtq", "sprt", "a" – Mdev Apr 22 '14 at 20:11
  • @Human I don't get any such exception. In fact for some of my test cases, I do have more than 1 character in `str3`. – Rohit Jain Apr 22 '14 at 20:15
  • Perhaps I'm formatting it incorrectly, although I basically just copied and pasted your code into a new class and printed out the result from those three strings as inputs. I get the same result on the test case in the post, though. Almost everything else gives me a blank result. – Mdev Apr 22 '14 at 20:16
  • 'sqpqrqtq' 'sprt' 'a' should work as inputs though, right? – Mdev Apr 22 '14 at 20:18
  • 1
    @Human Ok, your first test case is giving blank output, rather than `sprt`. Let me check my algo – Rohit Jain Apr 22 '14 at 20:18
  • @Human Ok, found one issue. The length check in `if` condition is faulty. Should be `(window.getEnd() - window.getStart()) < str2.length()` – Rohit Jain Apr 22 '14 at 20:20
  • 5
    Unrelated to the test failures, but to perhaps simplify the code (if all is alphanumerics or at least chars that don't mess up a simple regex), you _could_ do `str1.split("["+str3+"]")` and work on the resulting strings that no longer contain invalid characters you need to handle. – Joachim Isaksson Apr 22 '14 at 20:21
  • @JoachimIsaksson Good idea. But since it was an interview question online, with exection time limit, I didn't even think of using regex. – Rohit Jain Apr 22 '14 at 20:22
  • 1
    @JoachimIsaksson And now I'm realising that how easy the solution could have been, had I used `splitting` :( – Rohit Jain Apr 22 '14 at 20:24
  • 1
    @JoachimIsaksson you should post that as an answer. – Luiggi Mendoza Apr 22 '14 at 20:26
  • Dude that code is seriously over-engineered. – Niklas B. Apr 23 '14 at 01:55
  • @RohitJain Check this [link](http://stackoverflow.com/questions/23219752/find-minimal-substring-from-an-input-string-containing-all-chars-from-2nd-string). Hope you will get a working code too. Also, suggest for better solutions. – 53by97 Apr 23 '14 at 02:18
  • @RohitJain Is there something you don't understand in my answer? Please accept the answer if you feel it answers your question or please tell how I can answer your question in a better way. – Nikunj Banka Apr 28 '14 at 13:35

3 Answers3

2

Here is working Java code tested on various test cases.

The algorithm basically uses a sliding window to examine different windows within which an answer may lie. Each character in the string str2 is analyzed at most twice. Thus the running time of the algorithm is linear, ie O(N) in the lengths of the three strings. This is infact the most optimal solution for this problem.

String str1 = "spqrstrupvqw";
String str2 = "sprt";
String str3 = "q";
char[] arr = str1.toCharArray();
HashSet<Character> take = new HashSet<Character>();
HashSet<Character> notTake = new HashSet<Character>();
HashMap<Character, Integer> map = new HashMap<Character, Integer>();

void run()throws java.lang.Exception{
    System.out.println(str1 + " " + str2 + " " + str3);

    //Add chars of str2 to a set to check if a char has to be taken in O(1)time.
    for(int i=0; i<str2.length(); i++){
        take.add(str2.charAt(i));
    }

    //Add chars of str3 to a set to check if a char shouldn't be taken in O(1) time.
    for(int i=0; i<str3.length(); i++){
        notTake.add(str3.charAt(i));
    }

    int last = -1;
    int bestStart = -1;
    int bestLength = arr.length+1;

    // The window will be from [last....next]

    for(int next=last+1; next<arr.length; next++){
        if(notTake.contains(arr[next])){ 
            last = initLast(next+1); //reinitialize the window's start. 
            next = last;
        }else if(take.contains(arr[next])){
            // take this character in the window and update count in map.
            if(last == -1){
                last = next;
                map.put(arr[last], 1);
            }else{
                if(!map.containsKey(arr[next])) map.put(arr[next], 1);
                else          map.put(arr[next], map.get(arr[next])+1);
            }
        }

        if(last >= arr.length){ // If window is invalid
            break;
        }

       if(last==-1){
            continue;
        }

        //shorten window by removing chars from start that are already present.
        while(last <= next){
            char begin = arr[last];

            // character is not needed in the window, ie not in set "take"
            if(!map.containsKey(begin)){
                last++;
                continue;
            }

            // if this character already occurs in a later part of the window
            if(map.get(begin) > 1){
                last++;
                map.put(begin, map.get(begin)-1);
            }else{
                break;
            }
        }

        // if all chars of str2 are in window and no char of str3 in window, 
// then update bestAnswer
        if(map.size() == str2.length()){
            int curLength = next - last + 1;
            if(curLength < bestLength){
                bestLength = curLength;
                bestStart  = last;
            }
        }
    }

    if(bestStart==-1){
        System.out.println("there is no such window");
    }else{
        System.out.println("the window is from " + bestStart + " to " + (bestStart + bestLength-1));
        System.out.println("window " + str1.substring(bestStart, bestStart+bestLength));
    }

}

// Returns the first position in arr starting from index 'fromIndex'
// such that the character at that position is in str2.
int initLast(int fromIndex){

    // clear previous mappings as we are starting a new window
    map.clear();
    for(int last=fromIndex; last<arr.length; last++){
        if(take.contains(arr[last])){
            map.put(arr[last], 1);
            return last;
        }
    }
    return arr.length;
}

Moreover, your code fails on many trivial test cases. One of them is when str1 = "abc", str2 = "ab", str3 = "c".

PS. If you are having a hard time understanding this code, first try reading this easier post which is very similar to the problem that has been asked.

Nikunj Banka
  • 11,117
  • 16
  • 74
  • 112
2
str1 = "spqrstrupvqw"
str2 = "sprt"
str3 = "q"

We're looking for the minimum sub-string from str1 that contain all str2 characters (assume ordered) and no characters from str3 ..

i = 1 .. str1.length
cursor = 1 .. str2.length

The solution must be on the form:

str2.first X X .. X X str2.last

So to check for that sub-string we use a cursor over str2, but we also have the constraint of avoiding str3 characters, so we have:

if str3.contain(str1[i])
    cursor = 1
else
    if str1[i] == str2[cursor]
        cursor++

Goal check is:

if cursor > str2.length
    return solution
else
    if i >= str1.length
        return not-found

And for optimization, you can skip to the next look-ahead which is:

look-ahead = { str2[cursor] or { X | X in str3 }}

In case str2 is not ordered:

i = 1 .. str1.length
lookup = { X | X in str2 }

The solution must be on the form:

str2[x] X X .. X X str2[x]

So to check for that sub-string we use a check-list str2, but we also have the constraint of avoiding str3 characters, so we have:

if str3.contain(str1[i])
    lookup = { X | X in str2 }
else
    if lookup.contain(str1[i])
        lookup.remove(str1[i])

Goal check is:

if lookup is empty
    return solution
else
    if i >= str1.length
        return not-found

And for optimization, you can skip to the next look-ahead which is:

look-ahead = {{ X | X in lookup } or { X | X in str3 }}

Code

class Solution
{
    private static ArrayList<Character> getCharList (String str)
    {
        return Arrays.asList(str.getCharArray());
    }

    private static void findFirst (String a, String b, String c)
    {
        int cursor = 0;
        int start = -1;
        int end = -1;

        ArrayList<Character> stream = getCharList(a);
        ArrayList<Character> lookup = getCharList(b);
        ArrayList<Character> avoid = getCharList(c);

        for(Character ch : stream)
        {
            if (avoid.contains(ch))
            {
                lookup = getCharList(b);
                start = -1;
                end = -1;
            }
            else
            {
                if (lookup.contains(ch))
                {
                    lookup.remove(ch)

                    if (start == -1) start = cursor;

                    end = cursor;
                }
            }

            if (lookup.isEmpty())
                break;

            cursor++;
        }

        if (lookup.isEmpty())
        {
            System.out.println(" found at ("+start+":"+end+") ");
        }
        else
        {
            System.out.println(" not found ");
        }
    }
}
Khaled.K
  • 5,828
  • 1
  • 33
  • 51
  • Characters of `str2` need not be in the same order in `str3`. so the solution will not be of the form - `str2.first X X .. X X str2.last`. See the example in the question. The answer there is `strup`, which are not in order. – Rohit Jain Apr 23 '14 at 09:06
  • Yeah, sorry for that. Added it now in the question. – Rohit Jain Apr 23 '14 at 09:21
1

What about using a regular expression?

String regex = ".*((?=[^q]*s)(?=[^q]*p)(?=[^q]*r)(?=[^q]*t)[sprt][^q]+([sprt])(?<!ss|pp|rr|tt))";

Matcher m = Pattern.compile(regex).matcher("spqrstrupvqw");

while (m.find()) {
    System.out.println(m.group(1));
}

This prints out:

strup

This can also be wrapped in a method which generates dynamically the regular expression for variable inputs:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchString {
    public static void main(String[] args) {
        System.out.println(getMinimalSubstrings("spqrstrupvqw", "sprt", "q"));
        System.out.println(getMinimalSubstrings("A question should go inside quotations.", "qtu", "op"));
        System.out.println(getMinimalSubstrings("agfbciuybfac", "abc", "xy"));
    }

    private static List<String> getMinimalSubstrings(String input, String mandatoryChars, String exceptChars) {
        List<String> list = new ArrayList<String>();
        String regex = buildRegEx(mandatoryChars, exceptChars);

        Matcher m = Pattern.compile(regex).matcher(input);

        while (m.find()) {
            list.add(m.group(1));
        }

        return list;
    }

    private static String buildRegEx(String mandatoryChars, String exceptChars) {
        char[] mandChars = mandatoryChars.toCharArray();
        StringBuilder regex = new StringBuilder("[^").append(exceptChars).append("]*(");

        for (char c : mandChars) {
            regex.append("(?=[^").append(exceptChars).append("]*").append(c).append(")");
        }

        regex.append("[").append(mandatoryChars).append("][^").append(exceptChars).append("]+([").append(mandatoryChars).append("])(?<!");

        for (int i = 0; i < mandChars.length; i++) {
            if (i > 0) {
                regex.append("|");
            }

            regex.append(mandChars[i]).append(mandChars[i]);
        }

        regex.append("))");

        return regex.toString();
    }
}

This prints out:

[strup]
[quest]
[agfbc, bfac]
bobbel
  • 3,327
  • 2
  • 26
  • 43