1

I am trying to solve the problem of finding the longest substring without a repeating character from a given string.

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        start = 0
        mlen = -1
        cur_ = {}
        cur = 0

        while(start<len(s) and cur<len(s)):
            if s[cur] in cur_ and cur_[s[cur]] >= start:
                if cur - start > mlen:
                    mlen = cur - start
                start = cur_[s[cur]] + 1    
                cur_[s[cur]] = cur            
            else:
                cur_[s[cur]] = cur
                if cur - start > mlen:
                    mlen = cur - start
            cur = cur + 1            
        return mlen

x = Solution()
print(x.lengthOfLongestSubstring("abababcdef"))

I think I am solving it correctly:

Start a new substring when you encounter a repeated character.

But I am not getting the answers right?

In the above example the output is 5 whereas the correct answer is 6.

But for this case:

print(x.lengthOfLongestSubstring("ababa"))

the output is correct i.e. 2.

Not sure why am I failing that case? Thanks.

mourinho
  • 763
  • 6
  • 13
  • 24

3 Answers3

2

I've changed your function a bit to return the longest substring of unique characters instead of just length of it. If you want length - you can always get that from string.

def get_longest_substring(input):
    """
    :type input: str
    :rtype: str
    """

    current = []
    all_substrings = []
    for c in input:
        if c in current:
            all_substrings.append(''.join(current))
            cut_off = current.index(c) + 1
            current = current[cut_off:]
        current += c
    all_substrings.append(''.join(current))

    longest = max(all_substrings, key=len)
    return longest

longest = get_longest_substring("abababcdefc")
print(longest, len(longest))

Code goes through each char building a char array.

If it finds a char already in the array it keeps a copy of the array, cuts off beginning of it up to that character and keeps building it.

At the end it picks longest substring it found and returns it.

Justinas Marozas
  • 2,482
  • 1
  • 17
  • 37
1

You are updating mlen incorrectly in the else branch, you forgot to add current character. Also, you don't need to update mlen when you meet a repetition:

if s[cur] in cur_ and cur_[s[cur]] >= start:
    start = cur_[s[cur]] + 1    
else:
    mlen = max(mlen, cur - start + 1)

cur_[s[cur]] = cur
cur = cur + 1             
DAle
  • 8,990
  • 2
  • 26
  • 45
  • thx, could you explain a bit how you went about writing the algorithm.. why do it this way... i really want to understand and learn, thank you. – mourinho Jan 19 '18 at 14:29
1

I can suggest you this simple algorithm :

1. set all variables to empty.

2. for each letter ch in the string :

2.1. check if ch exists in the dict of the currrent found sub string ?

if it does - check if the cur' sub string longer then the max (maxSubStr initialyzed to "") ?, does - seve the cur' sub string in the max. set the dict of the currrent found sub string with the value ch, and set the current sub string to ch.

if it doen't - add ch to the dict of the currrent found sub string. and concat the cur' substring with ch.

3. return the length of the longest of the current substring and the max.

class Solution(object):

def lengthOfLongestSubstring(self, s):
    """
    :type s: str
    :rtype: int
    """

    curSubStr = ""
    curSubStrDict = {}
    maxSubStr = ""
    for ch in s :
        if ch in curSubStrDict :
            if len(maxSubStr) < len(curSubStr):
                maxSubStr = curSubStr
            curSubStrDict = {}
            curSubStrDict[ch] = ch
            curSubStr = ""+ch

        else :
            curSubStrDict[ch] = ch
            curSubStr += ch

    return len(curSubStr) if len(curSubStr) > len(maxSubStr) else len(maxSubStr)

x = Solution()
print(x.lengthOfLongestSubstring("abcaabccdefgfgh")) # 5 = |cdefg|
print(x.lengthOfLongestSubstring("abababcdef")) # 6 = |abcdef|

just like finding max element in an array, we "iterate over" (not actually iterate) the -substrings without repeating char- and saving the longest.

the iteration happens when we detect a char that contained in the current substring. than we iterate to the next substring.

Community
  • 1
  • 1
Ofir G
  • 736
  • 6
  • 18