
Suppose I have a string that looks like this:

s = "this is a random random this is a random Sentence sentence where phrases and words words repeat. This is the the second sentence sentence of the Same same paragraph"

I want the output to be:

this is a random sentence where phrases and words repeat. This is the second sentence of the same paragraph

This is what I have tried. It handles repeated words and phrases, but it does not handle duplicates that differ only in case, such as "Sentence sentence" and "Same same":

import re

s = "this is a random random this is a random Sentence sentence where phrases and words words repeat. This is the the second sentence sentence of the Same same paragraph"

def postprocess(s):
    # Repeat until no adjacent duplicate word or phrase is left.
    # The match is case-sensitive, so "Sentence sentence" is not collapsed.
    while re.search(r'\b(.+)(\s+\1\b)+', s):
        s = re.sub(r'\b(.+)(\s+\1\b)+', r'\1', s)
    return s

postprocess(s)

The output it returns is:

this is a random this is a random Sentence sentence where phrases and words repeat. This is the second sentence of the Same same paragraph

Can anyone help me here?

Aayush Gupta
I did find the answer; here it goes:

s = "this is a random this is a random Sentence sentence sentence where phrases King king King and words words repeat. This is the the second sentence sentence of the same same paragraph. this is the third third para"

def postprocess(s):
    if re.search(r'\b(.+)(\s+\1\b)+', s, flags=re.IGNORECASE):
        s = re.sub(r'\b(.+)(\s+\1\b)+', r'\1', s, flags=re.IGNORECASE)
        return s
    else:
        return s

postprocess(s)

– Aayush Gupta Aug 18 '20 at 17:56
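For reference, the fix in the comment makes the match case-insensitive, but it substitutes only once and keeps the first spelling of each duplicate ("Sentence"), while the desired output in the question keeps the later, lowercase one. A single pass can also miss a phrase that only becomes adjacent after an inner duplicate is removed. Below is a minimal sketch that combines both ideas, assuming the last occurrence of each repeated run is the one to keep. The name `collapse_repeats` and the second capture group are illustrative choices, not part of the original post.

```python
import re

# Group 1: a word or phrase starting at a word boundary.
# Group 2: the same text repeated after whitespace. With re.IGNORECASE the
# backreference \1 matches case-insensitively. Because group 2 sits inside a
# repeated group, it holds the last repetition when the match ends.
_REPEAT = re.compile(r'\b(.+)(?:\s+(\1)\b)+', flags=re.IGNORECASE)

def collapse_repeats(text):
    """Collapse adjacent repeated words/phrases, keeping the last occurrence."""
    while True:
        # subn returns the new string and the number of substitutions made.
        text, count = _REPEAT.subn(r'\2', text)
        if count == 0:
            return text

s = ("this is a random random this is a random Sentence sentence where "
     "phrases and words words repeat. This is the the second sentence "
     "sentence of the Same same paragraph")
print(collapse_repeats(s))
```

Every substitution removes at least one character, so the loop always terminates. The greedy `.+` backtracks heavily, which is fine for short strings but can get slow on long inputs, and `.` does not match newlines, so repeats that span a line break are not collapsed.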

0 Answers