29
code: df['review'].head()

output: index  review
        0      These flannel wipes are OK, but in my opinion

I want to remove punctuation from a column of the DataFrame and store the result in a new column.

code: import string 
      def remove_punctuations(text):
          return text.translate(None,string.punctuation)

      df["new_column"] = df['review'].apply(remove_punctuations)

Error:
  return text.translate(None,string.punctuation)
  AttributeError: 'float' object has no attribute 'translate'

I am using Python 2.7. Any suggestions would be helpful.

cs95
data_person

3 Answers

65

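Using pandas str.replace with a regex (recent pandas versions need regex=True for the pattern to be treated as a regular expression):

df["new_column"] = df['review'].str.replace(r'[^\w\s]', '', regex=True)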
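
To see what this pattern removes, here is a minimal sketch that uses only the standard re module instead of pandas. The helper name strip_non_word is hypothetical and is used for illustration only.

```python
import re

# [^\w\s] matches any character that is neither a word character
# (letters, digits, underscore) nor whitespace.
PATTERN = re.compile(r'[^\w\s]')

def strip_non_word(text):
    # Delete every character matched by the pattern.
    return PATTERN.sub('', text)

print(strip_non_word('These flannel wipes are OK, but in my opinion'))
print(strip_non_word('snake_case, kebab-case!'))
```

Note that the underscore is kept, because \w includes it. If underscores should go as well, a pattern such as r'[^\w\s]|_' is one option.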
nalzok
Bob Haffner
28

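You can build a regex character class from the string module's punctuation list. Pass the list through re.escape so that characters such as ] and \ are matched literally (this needs import re and import string):

df['review'].str.replace('[{}]'.format(re.escape(string.punctuation)), '', regex=True)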
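
One detail worth checking with this approach: string.punctuation contains characters that are special inside a regex character class. The sketch below uses only the standard re module (the helper names strip_naive and strip_escaped are hypothetical) to compare a pattern built from the raw list with one built from the escaped list.

```python
import re
import string

# Pattern built directly from string.punctuation: the "\]" pair inside it is
# read as an escaped "]", so the backslash itself is not part of the class.
NAIVE = re.compile('[{}]'.format(string.punctuation))

# Pattern built from the escaped list: every punctuation character is literal.
ESCAPED = re.compile('[{}]'.format(re.escape(string.punctuation)))

def strip_naive(text):
    return NAIVE.sub('', text)

def strip_escaped(text):
    return ESCAPED.sub('', text)

sample = 'C:\\temp [draft], v2!'
print(strip_naive(sample))    # the backslash survives
print(strip_escaped(sample))  # the backslash is removed as well
```

In this sketch the only difference between the two results is the backslash, which may or may not matter for review text.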
David C
13

I solved the problem by looping over the characters in string.punctuation:

import string

def remove_punctuations(text):
    # Strip each punctuation character from the text, one at a time.
    for punctuation in string.punctuation:
        text = text.replace(punctuation, '')
    return text

You can call the function the same way you did, and it should work as long as every value in the column is a string:

df["new_column"] = df['review'].apply(remove_punctuations)
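
The AttributeError in the question suggests that at least one value in the column is not a string; in pandas, missing values are typically stored as float NaN. Below is a minimal sketch of a guard, assuming non-string values should pass through unchanged. The name remove_punctuations_safe is hypothetical, and the code targets Python 3 (str.maketrans is not available on Python 2.7).

```python
import string

# Translation table that deletes every character in string.punctuation.
# Assumption: Python 3, where str.maketrans accepts a third "delete" argument.
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def remove_punctuations_safe(text):
    # Leave non-string values (for example float NaN) untouched.
    if not isinstance(text, str):
        return text
    return text.translate(_PUNCT_TABLE)

print(remove_punctuations_safe('These flannel wipes are OK, but in my opinion'))
print(remove_punctuations_safe(float('nan')))
```

It can be applied the same way as above, for example df["new_column"] = df['review'].apply(remove_punctuations_safe). Whether NaN should be kept, dropped, or replaced with an empty string depends on the data.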
Arthur Gouveia