I'm facing this issue:
I need to remove the duplicated substring at the beginning of each word in a text, but only if every word in the text starts with one. The result should then be capitalized.
Example:
text = str("Thethe cacar isis momoving vvery fasfast")
So this text should be treated and printed as:
output:
"The car is moving very fast"
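I have two ways to clean up the text. With a regex:
import re

phrase = "Thethe cacar isis momoving vvery fasfast"
# Uppercase first so that "The" and "the" compare as equal.
phrase_up = phrase.upper()
# Collapse any immediately repeated substring into a single copy.
text = re.sub(r'(.+?)\1+', r'\1', phrase_up)
text_cap = text.capitalize()
output:
"The car is moving very fast"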
Or:
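def remove_duplicates(word):
    # Keep only the first occurrence of each letter, in original order.
    # Note: this also strips legitimate repeats (e.g. "letter" -> "letr").
    unique_letters = set(word)
    sorted_letters = sorted(unique_letters, key=word.index)
    return ''.join(sorted_letters)

# Lowercase first so that "T" and "t" in "Thethe" count as the same letter.
words = phrase.lower().split(' ')
new_phrase = ' '.join(remove_duplicates(word) for word in words).capitalize()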
What I can't work out is how to determine whether a text needs this treatment in the first place. Suppose we get a text such as:
"This meme is funny, said Barbara"
Even though "meme" and "Barbara" (ar - ar) contain repeated substrings, the other words do not, so this text shouldn't be treated.
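To make the condition concrete, this is roughly the kind of check I have in mind. It is only a sketch: `needs_treatment` and `LEADING_REPEAT` are names I made up, and it assumes that "duplicated" means the word starts with a chunk that is immediately repeated, ignoring case.

```python
import re

# Hypothetical pattern: a chunk at the start of a word that is
# immediately followed by a copy of itself (case-insensitive).
LEADING_REPEAT = re.compile(r'(.+?)\1', re.IGNORECASE)

def needs_treatment(text):
    # True only if the text has words and every one of them
    # starts with a repeated chunk.
    words = text.split()
    return bool(words) and all(LEADING_REPEAT.match(w) for w in words)

print(needs_treatment("Thethe cacar isis momoving vvery fasfast"))  # True
print(needs_treatment("This meme is funny, said Barbara"))          # False
```

I suspect this is not robust, since a clean text made up entirely of words like "meme" or "llama" would still pass the check.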
Any pointers here?