2

I have a dataframe, each cell saves a dictionary. Before exporting the dataframe, I could call each cell as an individual dataframe.

However, after saving the dataframe as csv and reopening this each cell became string so I could not turn the cell I called into a dataframe anymore.

The output should look like this

After saving the dataframe as csv, dictionary became string

I was surprising to learn after my research on Stackoverflow, there were not many people experienced same issue as I'm having. I wondered whether my practice is wrong. I only found two posts related to my issue. Here is the one (dict objects converting to string when read from csv to dataframe pandas python).

I basically tried json, ast.literal_eval and yaml but none of these could solve my issue.

This is the first part of my code(I created this four list to store my data which I called from an api)

tickers4 = []
last_1st_bs4 = []
last_2nd_bs4 = []
last_3rd_bs4 = []

for i in range(len(tickers)):
    try:
        ticker = tickers.loc[i, 'ticker']
        ann_yr = 2018

        yr_1st = intrinio.financials_period(ticker, str(ann_yr-1), fiscal_period='FY', statement='balance_sheet')
        yr_2nd = intrinio.financials_period(ticker, str(ann_yr-2), fiscal_period='FY', statement='balance_sheet')
        yr_3rd = intrinio.financials_period(ticker, str(ann_yr-3), fiscal_period='FY', statement='balance_sheet')

        tickers4.append(ticker)
        last_1st_bs4.append(yr_1st)
        last_2nd_bs4.append(yr_2nd)
        last_3rd_bs4.append(yr_3rd)

        print('{} Feeding data {}'.format(i, ticker))

    except:
        tickers4.append(ticker)
        last_1st_bs4.append(0)
        last_2nd_bs4.append(0)
        last_3rd_bs4.append(0)
        print('{} Error {}'.format(i, ticker))

Second part: I put them into a dataframe and saved as csv

BS = pd.DataFrame()
BS['ticker'] = tickers4
BS['BS_2017'] = last_1st_bs4
BS['BS_2016'] = last_2nd_bs4
BS['BS_2015'] = last_3rd_bs4
BS.to_csv('Balance_Sheet_2015_2017.csv')

now, I need read this csv in another notebook

BS = pd.read_csv('./Balance_Sheet_2015_2017.csv', index_col=0)
BS.loc[9, 'BS_2017']

here is the result I got: ' cashandequivalents shortterminvestments notereceivable \\\nyear \n2017 2.028900e+10 5.389200e+10 1.779900e+10 \n\n accountsreceivable netinventory othercurrentassets \\\nyear \n2017 1.787400e+10 4.855000e+09 1.393600e+10 \n\n totalcurrentassets netppe longterminvestments \\\nyear \n2017 1.286450e+11 3.378300e+10 1.947140e+11 \n\n othernoncurrentassets ... \\\nyear ... \n2017 1.817700e+10 ... \n\n commitmentsandcontingencies commonequity retainedearnings \\\nyear \n2017 0.0 3.586700e+10 9.833000e+10 \n\n aoci totalcommonequity totalequity \\\nyear \n2017 -150000000.0 1.340470e+11 1.340470e+11 \n\n totalequityandnoncontrollinginterests totalliabilitiesandequity \\\nyear \n2017 1.340470e+11 3.753190e+11 \n\n currentdeferredrevenue noncurrentdeferredrevenue \nyear \n2017 7.548000e+09 2.836000e+09 \n\n[1 rows x 30 columns]'

Thanks for your help.

Jon Zhao
  • 21
  • 1
  • 3

2 Answers2

2

CSV is not an appropriate format for saving dictionaries (and honestly, putting dictionaries into DataFrames isn't a great data structure). You should try writing the DataFrame to json instead: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html

PMende
  • 5,171
  • 2
  • 19
  • 26
  • Thanks for providing alternative methods. Do you know if there is anyway I can convert strings back to dictionaries? it took me days to collect those data. – Jon Zhao Sep 29 '18 at 04:47
  • @JonZhao you can use json library d = json.loads(j), it will convert your string to dictionary again – Naga kiran Sep 29 '18 at 05:29
  • @NagaKiran I tried json, ast.literal_eval and yaml. however, json and eval throw a indent error. yaml still returns a string but better format. – Jon Zhao Sep 29 '18 at 13:35
  • I should know this earlier. pickle save the dataframe with original format but csv saves your dataframe into comma separated list!. I wish I read this earlier. https://stackoverflow.com/questions/48770542/what-is-the-difference-between-save-a-pandas-dataframe-to-pickle-and-to-csv – Jon Zhao Sep 29 '18 at 13:42
2

I had this same error once. I solved it by using DataFrame.to_pickle() instead of DataFrame.to_csv().

Everything in a CSV file is plain text, even the numerical values. When you load a CSV file into a spreadsheet program, there are parsers which look for strings which are recognizable as numbers, or dates, and convert them accordingly.

A CSV file can't easily hold the more complex Python objects, but Pandas won't throw an error if you place Python objects in a DataFrame. It converts them to their string representations.

John Ladasky
  • 1,016
  • 8
  • 17
  • Hi John, I guess that would work before I saved my dataframe, but my original data format is already changed and I'm trying to convert that back to dictionaries. Thanks for your solution, I'll use in the future. – Jon Zhao Sep 29 '18 at 13:33