I have a string that I'd like to split into new strings containing only letters (no commas, spaces, dots, etc.). Each new string must have length n, where n is a parameter. The slicing must go through each possible combination.
For example, func('banana pack', 3) should return ['ban','ana','nan','ana','pac','ack']. This is what I have managed so far:

def func(text, n):
    text = text.lower()
    text = text.translate(str.maketrans("", "", " .,"))
    remainder = len(text) % n
    split_text = [text[i:i + n] for i in range(0, len(text) - remainder, n)]
    if remainder > 0:
        split_text.append(text[-n:])
    print(split_text, len(split_text))
    result_dict = {}
    for word in split_text:
        if word not in result_dict:
            result_dict[word] = n / len(text)
        else:
            result_dict[word] = (result_dict[word] + 1) / len(text)
    return result_dict

1 Answer

Get rid of the disallowed characters then chunk the string:

def func(text, n):
    text = text.translate(str.maketrans("", "", " .,"))
    return [text[i:i + n] for i in range(0, len(text), n)]
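
As a quick check, here is a small sketch that runs this on the input from the question. The name `chunk_text` is only an illustrative stand-in for `func`. Note that the final chunk comes out shorter than n when the cleaned string's length is not a multiple of n.

```python
def chunk_text(text, n):
    # Drop spaces, dots and commas, then cut into consecutive pieces of length n.
    text = text.translate(str.maketrans("", "", " .,"))
    return [text[i:i + n] for i in range(0, len(text), n)]

# "banana pack" becomes "bananapack" (10 characters), so the last piece is short.
print(chunk_text("banana pack", 3))  # ['ban', 'ana', 'pac', 'k']
```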

If you want the last element to always have n characters (it will overlap the previous chunk when the length is not a multiple of n), use this instead:

def func(text, n):
    text = text.translate(str.maketrans("", "", " .,"))
    extra = len(text) % n
    chunks = [text[i:i + n] for i in range(0, len(text) - extra, n)]
    if extra > 0:
        chunks.append(text[-n:])
    return chunks
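
The sample output in the question, ['ban','ana','nan','ana','pac','ack'], can also be read as every overlapping window of length n inside each word separately. If that is what is wanted, a minimal sketch could look like the following. It assumes words are runs of ASCII letters and that the text should be lowercased, as in the question's attempt. The name `word_windows` is hypothetical.

```python
import re

def word_windows(text, n):
    # Assumption: a "word" is a run of ASCII letters; everything else separates words.
    words = re.findall(r"[a-z]+", text.lower())
    # Slide a window of length n over each word; words shorter than n yield nothing.
    return [w[i:i + n] for w in words for i in range(len(w) - n + 1)]

print(word_windows("banana pack", 3))
# ['ban', 'ana', 'nan', 'ana', 'pac', 'ack']
```

Unlike the chunking versions above, the windows here never span two words.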
Aplet123