
I have a long string from which I want to extract substrings and count how many times each one occurs.

The substring length is supplied by the user and should be used when counting the frequencies.

For example:

S = "ABCDEFGHIJKLMNOPQRSTUVABCSDLSFKJJKLOP"
substringCount = 3

def foo(S):
    pass

The function should return a dictionary that looks like this:

{'ABC':2,'DEF':1,'GHI':1,'JKL':2,'MNO':1,'PQR':1 and so on...}

Each key has length 3, as defined above; this length is user-defined and can be any number.

How do you write such a function? What is the logic for this?

Parzival
  • What have you tried so far? Show us your code! – Klaus D. Jan 05 '22 at 09:43
  • @KlausD. I haven't as I can't think of the logic and how to proceed – Parzival Jan 05 '22 at 09:48
  • I disagree with this being closed as a duplicate of either of those threads. This isn't just splitting, as the OP then wants to count the grouped substrings. For the OP, I'd use something like this. First split the string into groups: ```b = [S[i:i+substringCount] for i in range(0, len(S), substringCount)]``` Then use that result in an iterator to determine the count of each substring: ```dict((x, S.count(x)) for x in set(b))``` – Steve Mapes Jan 05 '22 at 10:07
  • Thank you, you are right. That thread doesn't answer this question – Parzival Jan 05 '22 at 10:14
  • Can the keys of the returned dictionary overlap? In your example you give {'ABC':2,'DEF':1...}. Couldn't we have {'ABC':2,'BCD':...}? – Bill Jan 05 '22 at 16:39
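
The approach from Steve Mapes's comment can be sketched as a function. This is only an illustration under a few assumptions: the name `count_chunks` and its parameters are hypothetical, the keys are the non-overlapping chunks starting at index 0, and a trailing chunk shorter than the requested length is dropped (the one-line version in the comment would keep it).

```python
def count_chunks(s, size):
    """Count how often each fixed-size chunk of s occurs anywhere in s.

    Hypothetical helper: name and signature are not from the thread.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    # Non-overlapping chunks starting at 0, size, 2*size, ...
    # The range stops early so a shorter trailing chunk is skipped (assumption).
    chunks = [s[i:i + size] for i in range(0, len(s) - size + 1, size)]
    # str.count returns the number of non-overlapping occurrences in the whole
    # string, so a chunk is also counted where it is not aligned to a boundary.
    return {chunk: s.count(chunk) for chunk in set(chunks)}


S = "ABCDEFGHIJKLMNOPQRSTUVABCSDLSFKJJKLOP"
print(count_chunks(S, 3))
```

For the example string this gives 'ABC' and 'JKL' a count of 2 and 'DEF' a count of 1, in line with the dictionary in the question, while keys such as 'BCD' never appear because they do not start on a chunk boundary.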

1 Answer


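I'd probably do it through recursion, something along the lines of the code below. Note that it slides the window one character at a time, so overlapping substrings such as 'BCD' are counted as well.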


s = "ABCDEFGHIJKLMNOPQRSTUVABCSDLSFKJJKLOP"

userInput = int(input("Enter substring count: "))


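def get_substring_count(s, userInput, res=None):
    # Create the result dict on the first call only (avoids a shared mutable default).
    if res is None:
        res = {}
    # Base case: fewer characters left than one full substring.
    if len(s) == 0 or len(s) < userInput:
        return res
    # Count the window that starts at the current position.
    tmp_s = s[:userInput]
    res[tmp_s] = res.get(tmp_s, 0) + 1
    # Slide the window one character to the right and recurse.
    return get_substring_count(s[1:], userInput, res)


print(get_substring_count(s, userInput))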
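
Since the question mentions a long string, it may be worth noting that the recursive function above makes one call per character, so with Python's default recursion limit (1000 frames) it raises a RecursionError once the string is longer than roughly a thousand characters. Each `s[1:]` also copies the remaining string. A minimal iterative sketch of the same sliding-window count, using `collections.Counter`, could look like this; the name `count_substrings` is just an illustrative choice.

```python
from collections import Counter


def count_substrings(s, size):
    """Count every window of length `size` in s, overlapping ones included.

    Hypothetical iterative equivalent of the recursive version; the name is
    not from the original answer.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    # One window per start index; the range is empty when s is shorter than size.
    windows = (s[i:i + size] for i in range(len(s) - size + 1))
    return dict(Counter(windows))


S = "ABCDEFGHIJKLMNOPQRSTUVABCSDLSFKJJKLOP"
counts = count_substrings(S, 3)
print(counts["ABC"], counts["JKL"], counts["DEF"])
```

This uses no recursion, so the string length is not bounded by the recursion limit, and slicing by index avoids recopying the tail of the string on every step.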
Asger Weirsøe