Another user already opened a discussion on how to find repeated phrases in Python, but it focused only on three-word phrases.
Robert Rossney's answer was complete and working (it is here: repeated phrases in the text Python), but can I ask for a method that simply finds repeated phrases, regardless of their length? I think it is possible to build on the method from the previous discussion, but I am not sure how to do it.
I think this is the function that would need to be modified in order to return tuples of different lengths:
def phrases(words):
    phrase = []
    for word in words:
        phrase.append(word)
        if len(phrase) > 3:
            phrase.remove(phrase[0])
        if len(phrase) == 3:
            yield tuple(phrase)
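To make the idea concrete, here is a minimal sketch of one direction I have in mind: replace the hard-coded 3 with a parameter and try every length in turn. It is not taken from the earlier answer, and the names `ngrams`, `repeated_phrases`, `min_len` and `max_len` are hypothetical ones I made up for illustration.

```python
from collections import Counter, deque


def ngrams(words, n):
    """Yield every run of n consecutive words as a tuple."""
    window = deque(maxlen=n)  # the oldest word is dropped automatically
    for word in words:
        window.append(word)
        if len(window) == n:
            yield tuple(window)


def repeated_phrases(words, min_len=2, max_len=None):
    """Return {phrase: count} for every phrase that occurs more than once.

    min_len and max_len are assumed bounds on the phrase length in words.
    """
    words = list(words)
    if max_len is None:
        max_len = len(words)
    repeated = {}
    for n in range(min_len, max_len + 1):
        counts = Counter(ngrams(words, n))
        found = {phrase: c for phrase, c in counts.items() if c > 1}
        if not found:
            # If no phrase of length n repeats, no longer phrase can repeat.
            break
        repeated.update(found)
    return repeated


if __name__ == "__main__":
    text = "the cat sat on the mat and the cat sat down"
    for phrase, count in repeated_phrases(text.split()).items():
        print(" ".join(phrase), count)
```

Note that this sketch also reports the shorter phrases contained in a longer repeated one (for example both "the cat" and "the cat sat"), so some filtering would probably still be needed if only the longest repeats are wanted.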