How do I remove multiple spaces between two strings in Python?

e.g.:

"Bertug 'here multiple blanks' Mete" => "Bertug        Mete"

to

"Bertug Mete" 

Input is read from an .xls file. I have tried using split() but it doesn't seem to work as expected.

import re
import string
import pandas as pd

dataFrame = pd.read_excel("C:\\Users\\Bertug\\Desktop\\example.xlsx")

#names1 =  ''.join(dataFrame.Name.to_string().split()) 

print(type(dataFrame.Name))

#print(dataFrame.Name.str.split())

Let me know what I'm doing wrong.

sai
Bertug

1 Answer

I think you can use replace with a regular expression:

df.Name = df.Name.replace(r'\s+', ' ', regex=True)

Sample:

df = pd.DataFrame({'Name':['Bertug     Mete','a','Joe    Black']})
print (df)
              Name
0  Bertug     Mete
1                a
2     Joe    Black

df.Name = df.Name.replace(r'\s+', ' ', regex=True)
#similar solution
#df.Name = df.Name.str.replace(r'\s+', ' ', regex=True)
print (df)
          Name
0  Bertug Mete
1            a
2    Joe Black
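
The replacement above leaves a single space wherever a name starts or ends with whitespace. A minimal sketch of one way to trim those ends as well, assuming the column is called Name as in the question (the sample rows, including the missing value, are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column name "Name" comes from the question.
df = pd.DataFrame({"Name": ["Bertug     Mete", "  Joe \t Black  ", None]})

# Collapse every run of whitespace to one space, then trim both ends.
# regex=True is passed explicitly because the default of Series.str.replace
# differs between pandas versions.
df["Name"] = df["Name"].str.replace(r"\s+", " ", regex=True).str.strip()

print(df["Name"].tolist())
```

Missing values pass through the .str methods unchanged, so they stay missing instead of raising an error.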
jezrael
  • I'm using Python 3 and this didn't work for me. Perhaps the solution is Python 2. This solution worked for me https://stackoverflow.com/a/5658439/2763005 – Allan Tsai Feb 01 '20 at 13:22
  • Using Python 3, df["Name"].str.replace('\s+', ' ', regex=True) worked. Notice the additional str. – d_gnz Aug 22 '20 at 22:03
  • Remember to add an r before the regex if working with pandas in Python 3: r'\s+' – Krullmizter Mar 25 '22 at 07:13
  • Use df.columns = df.columns.str.replace(r'\s+', ' ', regex=True) to collapse duplicate whitespace in all the column names. – Akhil S Mar 08 '23 at 11:26
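
The last comment targets the column labels rather than the cell values. A small sketch of that distinction, using made-up column labels that are not from the original question:

```python
import pandas as pd

# Hypothetical column labels with repeated whitespace, for illustration only.
df = pd.DataFrame({"First   Name": ["Bertug"], "Last \t Name": ["Mete"]})

# df.columns is an Index; its .str accessor rewrites the labels,
# not the values stored in the cells.
df.columns = df.columns.str.replace(r"\s+", " ", regex=True)

print(list(df.columns))  # ['First Name', 'Last Name']
```

To clean the values of a column, apply the same call to the column itself (for example df["Name"].str.replace(...)) instead of to df.columns.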