0

I have commas in a column that I want to remove using regex. This link shows how to do so: How to remove commas from ALL the column in pandas at once. The problem is that I am getting the error shown in the image. The documentation says the value must be a string, which mine is, as you can see in the dtypes: "If this is True then to_replace must be a string." Why am I still getting this error? Thanks!

[Image: screenshot of the error and the dtypes]

Greg

2 Answers

0

Your current syntax for calling replace on the entire data frame looks correct to me. The problem may be that the count column is numeric, so calling replace on it makes no sense. Try calling replace only on the tags column:

count_df["tags"] = count_df["tags"].str.replace(',', '')
Tim Biegeleisen
    Thanks Tim. I tried that but got TypeError: 'Column' object is not callable. I then tried an approach with Spark, and this worked: from pyspark.sql.functions import udf, concat, col, lit; import re; commaRep = udf(lambda x: re.sub(',$|^,','', x)); count_df_2=count_df.withColumn('tags',commaRep('tags')); count_df_2.show(3) – Greg Nov 20 '21 at 02:34
0
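import re

from pyspark.sql.functions import udf

# Strip a comma at the very start or very end of each value.
# Nulls are passed through unchanged so re.sub never receives None.
commaRep = udf(lambda x: re.sub(r',$|^,', '', x) if x is not None else None)

# Overwrite the 'tags' column with the cleaned values and preview the result.
count_df_2 = count_df.withColumn('tags', commaRep('tags'))
count_df_2.show(3)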
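
To see what the pattern in the UDF above actually does, here is a minimal plain-Python sketch that needs no Spark session. The helper name strip_edge_commas and the sample tag strings are made up for illustration; only the regex comes from the code above. Note that it removes a comma at the start or end of a value, not commas in the middle.

```python
import re

# Same pattern as in the UDF: a comma at the very end, or at the very start.
EDGE_COMMA = re.compile(r",$|^,")


def strip_edge_commas(value):
    """Remove a leading and/or trailing comma; return None unchanged."""
    if value is None:
        return None
    return EDGE_COMMA.sub("", value)


# Hypothetical sample values, not taken from the original data.
samples = [",python,pandas,", "spark", "a,b", None]
cleaned = [strip_edge_commas(s) for s in samples]
print(cleaned)
```

If the goal is to drop every comma rather than just the ones at the edges, the pattern would simply be ','. On a Spark DataFrame, the built-in pyspark.sql.functions.regexp_replace is likely a lighter alternative to a Python UDF for the same substitution.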
sakeesh