How to remove trailing dots from pandas series?

My attempt

import numpy as np
import pandas as pd

pd.set_option('max_colwidth',1000)

s = pd.Series(["""Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagram.com/p/YGEt5JC6JM/"""])


s.str.replace(r'(\w)\.+',r'\1',regex=True)

My results

Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagramcom/p/YGEt5JC6JM/


Wanted:
Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperia http://instagram.com/p/YGEt5JC6JM/

BhishanPoudel

3 Answers


Those aren't periods; they're the ellipsis character (…), which is Unicode character U+2026 (written \u2026 in a Python string or regex). See "How should I write three dots?"

s.str.replace(r'(\w)\u2026+',r'\1',regex=True)
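
One way to confirm which character is actually in the data is to list every non-ASCII character with its Unicode name. This is a minimal sketch using only the standard library; the helper name `non_ascii_chars` is made up for illustration, and `text` is a shortened copy of the sample string from the question.

```python
import unicodedata

# Shortened copy of the sample string from the question.
text = "#sonyexperias… http://instagram.com/p/YGEt5JC6JM/"


def non_ascii_chars(value):
    """Map each non-ASCII character in value to its Unicode name."""
    # non_ascii_chars is a hypothetical helper, not a pandas or stdlib function.
    return {
        ch: unicodedata.name(ch, "UNKNOWN")
        for ch in set(value)
        if ord(ch) > 127
    }


for ch, name in non_ascii_chars(text).items():
    print(f"U+{ord(ch):04X} {name}")
```

If the only line printed names U+2026 HORIZONTAL ELLIPSIS, a pattern built on `\.` cannot match it.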
Barmar

Could you please try the following, written to match the samples shown.

pd.set_option('max_colwidth',1000)
s = pd.Series(["""Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagram.com/p/YGEt5JC6JM/"""])
# regex=True is needed on pandas 2.0+, where str.replace defaults to literal matching
s.str.replace(r'…+', '', regex=True)
RavinderSingh13

As per Barmar's suggestion:

s = pd.Series(["""Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagram.com/p/YGEt5JC6JM/"""])


s.str.replace(r'(\w)…',r'\1',regex=True)

Gives:
Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias http://instagram.com/p/YGEt5JC6JM/
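
To see why the first attempt changed the URL but left the ellipsis alone, the two patterns can be compared side by side. This is only a sketch: it uses the standard library's `re.sub` on a shortened copy of the sample string, on the assumption that `Series.str.replace(..., regex=True)` behaves the same way for these patterns.

```python
import re

# Shortened copy of the sample string from the question.
text = "#sonyexperias… http://instagram.com/p/YGEt5JC6JM/"

# Original attempt: a word character followed by ASCII periods.
# It strips the "." in "instagram.com" and never matches the ellipsis.
attempt = re.sub(r"(\w)\.+", r"\1", text)

# Fix: match U+2026 HORIZONTAL ELLIPSIS instead of an ASCII period.
fixed = re.sub(r"(\w)\u2026+", r"\1", text)

print(attempt)
print(fixed)
```

The first pattern only ever sees the ASCII period in the URL, while the second removes the ellipsis and leaves the URL intact.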
BhishanPoudel