
Please let me know

  1. What is the best way to clean a CSV and load it into MySQL?

I am working on loading a couple of different CSVs into a MySQL database, but the CSVs have some anomalies. Note: I am using pandas read_csv to load each file into a DataFrame and to_sql to load it into MySQL.

I am trying to remove characters like $ and , from the CSV. After getting the data into a DataFrame with pd.read_csv, I try df[col].replace('$', '') on each column, but it does not work on some values and I am unable to find out why. There is no error as such; it simply does not remove these characters.

Also the intention is to remove these special characters so accurate data types can be found using SQLalchemy function below.

for col in df.columns:
    df[col] = df[col].replace('$', '')
    df[col] = df[col].replace(',', '')

To find the data types, I am using SQLAlchemy as described in this question: pandas to_sql all columns as nvarchar

chandana
1 Answer


For a string column, you should use the .str accessor. Try:

df[col] = df[col].str.replace('$', '')
Sreekiran A R
  • If I use str.replace I get this error: AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas – chandana Aug 09 '18 at 10:53
  • What is the datatype of the column? You can use .str only on columns that contain strings, which pandas stores as np.object_ dtype. Check the dtype of each column, and proceed inside the loop only if it is a string, using an if condition. – Sreekiran A R Aug 09 '18 at 11:59
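Putting the answer and the comments together, a minimal sketch might look like the following. The price/qty sample data is hypothetical, standing in for the asker's CSV: guard on the column's dtype before touching .str (which is exactly the AttributeError above), pass regex=False so '$' is treated literally rather than as a regex anchor, and convert the cleaned column to numeric so SQLAlchemy/to_sql can infer a numeric type instead of a string type.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's CSV columns.
df = pd.DataFrame({"price": ["$1,200", "$3,450.50", "$99"],
                   "qty": [1, 2, 3]})

for col in df.columns:
    # .str is only valid on object-dtype (string) columns, hence the guard;
    # numeric columns like "qty" are skipped untouched.
    if df[col].dtype == object:
        df[col] = (df[col]
                   .str.replace("$", "", regex=False)
                   .str.replace(",", "", regex=False))

# Convert the cleaned string column to numbers so to_sql maps it
# to a numeric SQL type rather than text.
df["price"] = pd.to_numeric(df["price"])
```

Note that plain Series.replace('$', '') (without .str) matches the *entire* cell value by default, which is why it silently leaves values like "$1,200" unchanged; .str.replace does substring replacement.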