Find prefix within a string

Question

I'm currently doing a leetcode question where I have to find a prefix within a sentence and return the word number within the sentence else return -1. I came up with a solution but it crashes with some strings and i dont know why. An example of this is the following:

Input: sentence = "i love eating burger", searchWord = "burg"
Output: 4 (I also get an output of 4)
Explanation: "burg" is prefix of "burger" which is the 4th word in the sentence.

but fails this example:

Input: sentence = "this problem is an easy problem", searchWord = "pro"
Output: 2 ( I get an output of 6)
Explanation: "pro" is prefix of "problem" which is the 2nd and the 6th word in the sentence, but we return 2 as it's the minimal index.

My cout for this one produced a very weird snippet:

 problem is an easy problem
 problem is an easy problem
 problem is an easy problem
 problem is an easy problem
probl
proble
problem
problem
problem i
problem is

it completely ignored the first couple substrings when i increments, this is the only time it happens tho.

int isPrefixOfWord(string sentence, string searchWord)
{
    string sub;
    int count = 1;
    for (int i = 0; i < sentence.length(); i++)
    {
        if (sentence[i] == ' ')
            count++;
        for (int j = i; j < sentence.length(); j++)
        {
            sub = sentence.substr(i, j);
            cout<<sub<<endl;
            if (sub == searchWord)
            {
                return count;
            }
        }
    }
    return -1;
}

Any Ideas?

int isPrefixOfWord(string sentence, string searchWord)
{
    string sub;
    int count = 1;
    for (int i = 0; i < sentence.length() - searchWord.length() - 1; i++)
    {
        if (sentence[i] == ' ')
            count++;
        
        sub = sentence.substr(i,searchWord.length());
        if ( sub == searchWord && (sentence[i-1] == ' ' || i == 0))
        {
            return count;
        }
    
    }
    return -1;
}

You're using `substr` incorrectly. You should add the output of the program to your question as well. — cigien, Oct 17 '20 at 01:12
@cigien How tho, the first parameter is the initial index and the second is the length of the substring. Im checking for every length at that specific initial point. — unity123, Oct 17 '20 at 01:14
@cigien Because that cout has every single possible substring within that string. Ill see if i can cut out a bit of the results. — unity123, Oct 17 '20 at 01:19
Hmm, the dupe was wrong. You should change your loop actually. — cigien, Oct 17 '20 at 01:20
I posted part of the output, seems like theres an issue with that portion of the analysis — unity123, Oct 17 '20 at 01:22
You are using `sub` to see if it's the word you're searching for. Your output shows lots of strings you shouldn't be considering. — cigien, Oct 17 '20 at 01:29
Looks like busywork. You search substrings of all possible lengths for each `i` when you know in advance that you only have to check the *one* that starts at `sentence[i]` and consists of `searchWord.length()` characters? — JaMiT, Oct 17 '20 at 01:43

PaulMcKenzie · Answer 1 · 2020-10-17T03:57:51.543

A very simple C++20 solution using starts_with:

#include <string>
#include <sstream>
#include <iostream>

int isPrefixOfWord(std::string sentence, std::string searchWord)
{
    int count = 1;
    std::istringstream strm(sentence);
    std::string word;
    while (strm >> word)
    {
        if ( word.starts_with(searchWord) )
           return count;
        ++count;
    }
    return -1;        
}

int main()
{
    std::cout << isPrefixOfWord("i love eating burger",  "burg") << "\n";
    std::cout << isPrefixOfWord("this problem is an easy problem", "pro") << "\n";
    std::cout << isPrefixOfWord("this problem is an easy problem", "lo");
}

Output:

4
2
-1

Currently, LeetCode and many other of the online coding sites do not support C++20, thus this code will not compile successfully on those online platforms.

Therefore, here is a live example using a C++20 compiler

Doesn't really matter which stream class to use, as long as it works. I tend to be explicit and use `std::istringstream` for input and `std::ostringstream` for output. — PaulMcKenzie, Oct 17 '20 at 02:54

Elliott · Answer 2 · 2020-10-17T06:40:04.337

The bug that effects the function output is that you aren't handling the increment of i within your inner for loop:

for (int i = 0; i < sentence.length(); i++)
{
    if (sentence[i] == ' ')
        count++;

    for (int j = i; j < sentence.length(); j++)
    {
        sub = sentence.substr(i, j);
        cout<<sub<<endl;
        if (sub == searchWord)
        {
            return count;
        }
    }
}

Notice that once your inner-loop is complete that i always iterates by one. So your next search through a word will incorrectly start at its next character, which incorrectly searches for "sub-words" instead of only prefixes, and so creates false positives (and unnecessary work).

Also note that every time that you do:

(sub == searchWord)

That this checks all j characters, even though we're only interested in whether the new jth character is a match.

Another bug, which effects your performance and your couts is that you're not handling mismatches:

if (sub == searchWord)

...is never false, so the only way to exit the inner loop is to keep increments j till the end of the array, so sub ends up being large.

A way to fix your second bug is to replace your inner loop like so:

    if (sentence.substr(i, i + searchWord.length()) == searchWord)
        return count;

and finally, to fix all bugs:

int isPrefixOfWord (const string & sentence, const string & searchWord)
{
    if (sentence.length() < searchWord.length())
        return -1;

    const size_t i_max = sentence.length() - searchWord.length();

    for (size_t i = 0, count = 1; ; ++count)
    {
        // flush spaces:
        while (sentence[i] == ' ')
        {
            if (i >= i_max)
                  return -1;

            ++i;
        }

        if (sentence.substr(i, searchWord.length()) == searchWord)
            return count;
      
        // flush word:
        while (sentence[i] != ' ')
        {
            if (i >= i_max)
                  return -1;

            ++i;
        }
    }
  
    return -1;
}

Note that substr provides a copy of the object (it's not just a wrapper around a string), so this takes linear time with respect to searchWord.length(), which is particularly bad the word within sentence is smaller.

We can improve the speed by replacing

if (sentence.substr(i, searchWord.length()) == searchWord)
    return count;

...with

    for (size_t j = 0; sentence[i] == searchWord[j]; )
    {
        ++j;
   
        if (j == searchWord.size())
            return count;

        ++i;
    }

Others have shown nice applications of the libraries that help solve the problem.

If you don't have access to those libraries for your assignment, or if you just want to learn how you could modularise a problem like this without loosing efficiency, then here's a way to do it in c++11 without any libraries (except string):

bool IsSpace (char c)
{
    return c == ' ';
}

bool NotSpace (char c)
{
    return c != ' ';
}

class PrefixFind
{
    using CharChecker = bool (*)(char);

    template <CharChecker Condition>
    void FlushWhile ()
    {
        while ((m_index < sentence.size()) 
            && Condition(sentence[m_index]))
            ++m_index;
    }
    
    void FlushWhiteSpaces ()
    {
        FlushWhile<IsSpace>();
    }
        
    void FlushToNextWord ()
    {
        FlushWhile<NotSpace>();
        FlushWhile<IsSpace>();
    }
    
    bool PrefixMatch ()
    {
        // SearchOngoing() must equal `true`
    
        size_t j = 0;
    
        while (sentence[m_index] == search_prefix[j])
        {
            ++j;
            
            if (j == search_prefix.size())
                return true;
            
            ++m_index;
        }
        
        return false;
    }
    
    bool SearchOngoing () const
    {
        return m_index + search_prefix.size() <= sentence.size();
    }
    
    const std::string & sentence;
    const std::string & search_prefix;
    size_t m_index;

public:

    PrefixFind (const std::string & s, const std::string & sw)
        : sentence(s),
        search_prefix(sw)
    {}

    int FirstMatchingWord ()
    {
        const int NO_MATCHES = -1;
    
        if (!search_prefix.length())
            return NO_MATCHES;
    
        m_index = 0;
        FlushWhiteSpaces();

        for (int n = 1; SearchOngoing(); ++n)
        {
            if (PrefixMatch())
                return n;

            FlushToNextWord();
        }

        return NO_MATCHES;
    }
};

In terms of speed: If we consider the length of sentence to be m, and the length of searchWord to be n, then original (buggy) code had O(n*m^2) time complexity. But with this improvement we get O(m).

This makes sense however, it didnt pass all of the tests for some reason. Whilst waiting for someone to help, i took an attempt to try and fix it however my current solution works on 38/39 of tests but with the last one a heap buffer overflow occurs. Ill post my code at the bottom of the post. — unity123, Oct 17 '20 at 01:55
You left one bug: The opening curly brace alignment appears to be off :-) — Mark Moretto, Oct 17 '20 at 02:23
Yeah, that was a little humor. Inline braces > newline braces! Your answer is solid, otherwise! — Mark Moretto, Oct 17 '20 at 02:58
@MarkMoretto, haha. Thanks. Yeah - I knew that you weren't being serious but wasn't sure what you meant. Yes, I'm a big fan of newline formatting. =P — Elliott, Oct 17 '20 at 03:02

score 0 · Answer 3 · answered Oct 17 '20 at 02:03

We can just use std::basic_stringstream for solving this problem. This'll pass through:

// The following block might slightly improve the execution time;
// Can be removed;
static const auto __optimize__ = []() {
    std::ios::sync_with_stdio(false);
    std::cin.tie(nullptr);
    std::cout.tie(nullptr);
    return 0;
}();

// Most of headers are already included;
// Can be removed;
#include <cstdint>
#include <string>
#include <sstream>


static const struct Solution {
    static const int isPrefixOfWord(
        const std::string sentence,
        const std::string_view search_word
    ) {
        std::basic_stringstream stream_sentence(sentence);
        std::size_t index = 1;
        std::string word;

        while (stream_sentence >> word) {
            if (!word.find(search_word)) {
                return index;
            }

            ++index;
        }

        return -1;

    }
};

Find prefix within a string

3 Answers3