0

So I've been trying to solve the word break Dynamic Programming problem, which basically means that given a dictionary of strings and a string, see if the words in the dictionary can be combined to form the string. For example, given word "applepenapple" and dictionary ["apple","pen"] it should return true.

I have a working java solution, but I'm trying to improve my C++ skills. My problem is even though my code looks extremely similar to the working solution in Java, I am failing a niche test case and I can't figure out why.

C++ code:

bool wordBreak(string s, vector<string> &wordDict) {
    vector<int> bArr(s.length(), -1);
    unordered_set<string> set(wordDict.begin(), wordDict.end());
    return wordBreak(s, bArr, 0, set);
}

bool wordBreak(string s, vector<int> &bArr, int start, unordered_set<string> &set) {
    if (start == s.length())
        return true;
    //If we have a memoized solution to this problem, avoid recurion
    if (bArr[start] != -1)
        return (bArr[start] == 1);
    for (int end = start + 1; end <= s.length(); end++) {
        if (set.count(s.substr(start, end)) && wordBreak(s, bArr, end, set)) {
            bArr[start] = 1;
            return bArr[start] == 1;
        }
    }
    bArr[start] = 0;
    return false;
}

Working code using java:

public boolean wordBreak(String s, List<String> wordDict) {
    Integer[] memo =new Integer[s.length()];
    Arrays.fill(memo,-1);
    return word_Break(s, new HashSet(wordDict), 0, memo);
}

public boolean word_Break(String s, Set<String> wordDict, int start, Integer[] memo) {
    if (start == s.length()) {
        return true;
    }
    if (memo[start] != -1) {
        return memo[start]==1;
    }
    for (int end = start + 1; end <= s.length(); end++) {
        if (wordDict.contains(s.substring(start, end)) && word_Break(s, wordDict, end, memo)) {
            memo[start] = 1;
            return memo[start] == 1;
        }
    }
    memo[start] = 0;
    return false;
}

The C++ code is returning false for "applepenapple" with dictionary ["apple","pen"] and I don't know why since the java return true which is correct. The only major difference (I think) between the two solutions is that my C++ uses a vector instead of a native array in the java code. Initially I thought it might have to do with C++ using automatic storage (stack) vs the free store (heap) which is why I used the vector instead of a C-style array to avoid memory management because of RAII. Despite this change, the bug persists. There is an easier solution avoiding recursion altogether, but I am very curious as to why the C++ is returning different output than the java.

Hisham Hijjawi
  • 1,803
  • 2
  • 17
  • 27
  • It appears to have a compilation error. Are you sure you're not just running an older build of the program? Some IDEs will do that. You've also got some warnings because you're doing comparisons between signed and unsigned ints and you really shouldn't ignore warnings in C++. – Chris Rollins Jan 09 '19 at 11:34
  • *The C++ code is returning false for "applepenapple" with dictionary ["apple","pen"]* -- Please post a [mcve] instead of a description. – PaulMcKenzie Jan 09 '19 at 11:50
  • It isn't a good idea to attempt to do a line-by-line translation from Java to C++ and vice-versa. You are not learning C++ by doing this, instead it is an easy way to totally get things wrong. For example `substr` in C++ is not the same as `substring` in Java. In addition (as the other comment pointed out), you are not heeding the warnings of the compiler about `unsigned`. Also, your usage of the variable name `set` is not a good idea, since there is a `std::set` in the C++ standard library, causing confusing code. – PaulMcKenzie Jan 09 '19 at 12:10

1 Answers1

4

I see a potential problem. From java.lang.String Javadoc (emphasis mine):

public String substring(int beginIndex, int endIndex)

Returns a new string that is a substring of this string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1. Thus the length of the substring is endIndex-beginIndex.

Examples:

"hamburger".substring(4, 8) returns "urge"
"smiles".substring(1, 5) returns "mile"

Parameters:

beginIndex - the beginning index, inclusive.

endIndex - the ending index, exclusive.

From cppreference.com documentation on strings:

basic_string substr( size_type pos = 0, size_type count = npos ) const;

Returns a substring [pos, pos+count). If the requested substring extends past the end of the string, or if count == npos, the returned substring is [pos, size()).

Parameters

pos - position of the first character to include

count - length of the substring

That is, in Java you should pass an index as second parameter to String.substring(...), but in C++ you should pass a length to basic_string::substr(...). However, you are doing:

s.substr(start, end)

and

s.substring(start, end)

in both cases.

Maybe adjusting the C++ invocation to

s.substr(start, end - start)

will work?

gpeche
  • 21,974
  • 5
  • 38
  • 51