I have a column in a Pandas dataframe that contains dictionaries of Python native types (strings, integers, floats, etc.).
After saving this dataframe to a CSV file and reloading it, the values come back as strings, and it seems impossible to cast them back to dictionaries. The Python string representation of a dict is not valid JSON (single quotes instead of double quotes, and booleans starting with a capital letter instead of a lowercase one), so json.loads() cannot parse it.
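To make the mismatch concrete, here is a minimal illustration (standalone, not using the dataframe below):

```python
import json

d = {'flag': True, 'name': 'val'}
s = str(d)      # "{'flag': True, 'name': 'val'}" -- single quotes, capitalized True
# json.loads(s) raises json.decoder.JSONDecodeError
json.dumps(d)   # '{"flag": true, "name": "val"}' -- this is what json.loads expects
```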
Consider this example:
import json
import pandas as pd

df = pd.DataFrame({'dict_column': [{'key': 'val'}]})
df.iloc[0, 0]['key']              # returns 'val'
df.to_csv('file.csv', index=False)
df_loaded = pd.read_csv('file.csv')
df_loaded.iloc[0, 0]['key']       # TypeError: the value is now a string
json.loads(df_loaded.iloc[0, 0])  # json.decoder.JSONDecodeError
Because of this, it seems very important to apply json.dumps to a dict column before saving a dataframe to a CSV. But what can be done for existing CSVs? Is there any way to cast the string representation of a dict back to a dict? I have a huge dataframe saved this way and I don't want to lose the data in it.
I hope there is a better solution than replacing the single quotes with double quotes directly in the CSV file.
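One direction I stumbled on but have not fully vetted is ast.literal_eval, which appears to parse Python literal syntax (the column name below is just from my example above):

```python
import ast

# the kind of string read_csv gives back for a dict column
s = "{'key': 'val', 'flag': True, 'count': 3}"
d = ast.literal_eval(s)  # evaluates Python literals only, not arbitrary code
d['key']                 # 'val'

# presumably applicable column-wide like this:
# df_loaded['dict_column'] = df_loaded['dict_column'].apply(ast.literal_eval)
```

Is this a safe and robust way to do it, or are there cases where it breaks?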