I have a pandas DataFrame with a string column, df['text'].

A sample value of df['text'] could be:

"The quick red.fox jumped over.the lazy brown, dog."

I want the output to be:

"The quick red . fox jumped over . the lazy brown , dog . "

I've tried using the str.replace() method, but I don't quite understand how to make it do what I'm looking for.

import pandas as pd

# read csv into dataframe
df=pd.read_csv('./data.csv')

# add a space before and after every punctuation mark
df['text'] = df['text'].str.replace('.',' . ')
df['text'].head()

# write dataframe to csv
df.to_csv('data.csv', index=False)
youngguv

3 Answers

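Try Series.replace with regex=True, escaping the dot so that it matches a literal period rather than any character:

df['text'] = df['text'].replace({r'\.': ' . ', ', ': ' , '}, regex=True)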
BENY

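You have to escape the dot with a backslash to match a literal period, because an unescaped . matches any character in a regex. Using .str.replace (pass regex=True explicitly, since pandas 2.0 and later default to regex=False):

df['Text'].str.replace(r'\.', ' . ', regex=True).str.replace(',', ' , ')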

0    The quick red . fox jumped over . the lazy brown ,  dog . 
Name: Text, dtype: object
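
To see why the escape matters, here is a minimal sketch on the sample string from the question. The Series `s` and the variable names are made up for illustration. It compares an unescaped dot under regex=True with the two ways of matching a literal period.

```python
import pandas as pd

# Hypothetical one-row Series holding the sample value from the question
s = pd.Series(["The quick red.fox jumped over.the lazy brown, dog."])

# Unescaped '.' under regex=True matches every character,
# so each character is replaced by ' . '
wrong = s.str.replace(".", " . ", regex=True)

# Two ways to match a literal period instead
literal = s.str.replace(".", " . ", regex=False)    # plain substring match
escaped = s.str.replace(r"\.", " . ", regex=True)   # escaped regex

print(wrong[0][:12])
print(literal[0])
print(escaped[0])
```

Both `literal` and `escaped` give the same string. Passing `regex` explicitly avoids depending on the default, which differs between pandas versions.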
Erfan

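To replace all punctuation, use a regex with a capture group, and use \\1 as a backreference to add a space before and after each match (regex=True is required in pandas 2.0 and later):

df['text'] = df['text'].str.replace(r'([^\w\s]+)', ' \\1 ', regex=True)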
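
As a rough sketch of the same idea on the sample string, the variant below also swallows any whitespace already next to the punctuation, so "brown, dog" does not end up with a double space. The DataFrame contents and the `spaced` column name are assumptions made for the example.

```python
import pandas as pd

# Hypothetical one-row frame holding the sample value from the question
df = pd.DataFrame({"text": ["The quick red.fox jumped over.the lazy brown, dog."]})

# ([^\w\s]+) captures a run of characters that are neither word characters
# nor whitespace; the \s* on each side consumes existing spaces so that
# exactly one space surrounds the captured punctuation.
df["spaced"] = df["text"].str.replace(r"\s*([^\w\s]+)\s*", r" \1 ", regex=True)

print(df["spaced"][0])
```

Note that the underscore counts as a word character in \w, so it is not treated as punctuation here, and consecutive punctuation such as "..." is kept together as one token.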
jezrael