How do I pad all punctuation with a whitespace for every row of text in a pandas dataframe?

Question

I have a DataFrame with a text column, df['text'].

A sample value of df['text'] could be:

"The quick red.fox jumped over.the lazy brown, dog."

I want the output to be:

"The quick red . fox jumped over . the lazy brown , dog . "

I've tried using the str.replace() method, but I don't quite understand how to make it do what I'm looking for.

import pandas as pd

# read csv into dataframe
df=pd.read_csv('./data.csv')

#add a space before and after every punctuation
df['text'] = df['text'].str.replace('.',' . ')
df['text'].head()

# write dataframe to csv
df.to_csv('data.csv', index=False)

BENY · Answer 1 · 2019-07-27T16:42:09.267

1

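Try Series.replace with a dict of patterns and regex=True. The dot has to be escaped, otherwise it matches any character:

df['text'] = df['text'].replace({r'\.': ' . ', ', ': ' , '}, regex=True)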

edited Jul 27 '19 at 16:42

answered Jul 27 '19 at 16:31

BENY

score 1 · Answer 2 · answered Jul 27 '19 at 16:33

1

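You have to escape the dot with a backslash to match a literal period, because in a regular expression an unescaped . matches any character. On pandas 2.0 and later, where .str.replace defaults to regex=False, also pass regex=True: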

df['Text'].str.replace(r'\.', ' . ', regex=True).str.replace(',', ' , ')

0    The quick red . fox jumped over . the lazy brown ,  dog . 
Name: Text, dtype: object
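
To see why the escape matters, here is a minimal sketch using Python's built-in re module, whose pattern syntax pandas follows when regex=True. The sample string is made up for illustration and is not from the question's data.

```python
import re

# Hypothetical sample value, shortened from the question's example.
sample = "red.fox"

# Unescaped: "." is a regex wildcard, so every character is replaced.
unescaped = re.sub(".", " . ", sample)

# Escaped: r"\." matches only a literal period.
escaped = re.sub(r"\.", " . ", sample)

print(repr(unescaped))
print(repr(escaped))
```

Only the escaped version leaves the letters intact, which is most likely what went wrong with the attempt in the question on the pandas version of the time.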

answered Jul 27 '19 at 16:33

Erfan

score 1 · Answer 3 · answered Jul 27 '19 at 16:41

1

To replace all punctuation, use a regex with a capture group and refer back to the match with \\1 to add a space before and after it:

df['text'] = df['text'].str.replace(r'([^\w\s]+)', ' \\1 ', regex=True)
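
As a rough end-to-end sketch of this approach, the snippet below builds a one-row DataFrame from the question's sample sentence instead of reading data.csv. The names padded and tidy, and the extra step that collapses double spaces, are my own additions for illustration and not part of the original answer.

```python
import pandas as pd

# One-row frame built from the question's sample sentence (stands in for data.csv).
df = pd.DataFrame({"text": ["The quick red.fox jumped over.the lazy brown, dog."]})

# [^\w\s]+ matches a run of characters that are neither word characters nor
# whitespace; the capture group is re-inserted with a space on each side.
padded = df["text"].str.replace(r"([^\w\s]+)", r" \1 ", regex=True)

# Assumed extra step: collapse the double space left where punctuation
# was already followed by a space (e.g. after the comma).
tidy = padded.str.replace(r" {2,}", " ", regex=True)

print(repr(padded.iloc[0]))
print(repr(tidy.iloc[0]))
```

Note that \w includes the underscore, so _ is not treated as punctuation by this pattern. If that matters for your data, something like r"([^\w\s]|_)+" may be closer to what you want.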

answered Jul 27 '19 at 16:41

jezrael
