
I'm trying to remove duplicate lines with this regex that works great:

(.*+)\n*(\1\n+)* 

But when I try to use it in Python it doesn't work:

response1 = re.sub(r'(.*+)\n*', r'(\1\n+)*', response1)

Error:

Exception has occurred: re.error
multiple repeat at position 3

Am I doing something wrong?

Thank you,

Creek Barbara

1 Answer


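The "multiple repeat at position 3" error comes from this part of the pattern:

.*+

Before Python 3.11 the re module treats "*+" as one quantifier stacked on another and rejects it (3.11 added possessive quantifiers such as "*+"). Use either ".*" or ".+" instead. Note also that the second argument of re.sub is the replacement string, not a second piece of regex, so the whole pattern has to go in the first argument. Something like the following should remove consecutive duplicated lines: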

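import re

response = """A
A
A
B
B
A
A
"""
# (.*\n) captures one line; (\1)+ matches one or more immediate repeats of it.
# Each run is replaced by a single copy of the line.
print(re.sub(r'(.*\n)(\1)+', r'\2', response))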

Output

A
B
A
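
One caveat with the pattern above: it is not anchored to line starts, so it can also match when a line is only the tail of the previous one (for example "AB" followed by "B"), and it needs a trailing newline after the last line. The following is a minimal sketch of an anchored variant using re.MULTILINE; the function name collapse_repeated_lines is made up for illustration and is not part of the re module.

```python
import re


def collapse_repeated_lines(text):
    """Collapse runs of identical consecutive lines into one line.

    Hypothetical helper for illustration only.
    """
    # With re.MULTILINE, ^ and $ match at the start and end of each line,
    # so (.*) always captures a whole line and \1 must also be a whole line.
    pattern = re.compile(r'^(.*)(?:\n\1)+$', flags=re.MULTILINE)
    return pattern.sub(r'\1', text)


sample = "A\nA\nA\nB\nB\nA\nA\n"
print(collapse_repeated_lines(sample))
```

For the sample input from this answer it should give the same A, B, A result, while leaving "AB" followed by "B" untouched.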
user2468968