I am trying to capture multiple years of daily-updated 2-D tables. I can download them into a dictionary of DataFrames, and I want to write that dictionary to a CSV file so I do not have to download the data every time.

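import ast
import csv
import pandas as pd

def saver(dictex):
    """Write each DataFrame to its own CSV file and record the keys."""
    for key, val in dictex.items():
        val.to_csv("data_{}.csv".format(key))

    with open("keys.txt", "w") as f:  # saving keys to file
        f.write(str(list(dictex.keys())))

def loader():
    """Reading data from keys"""
    with open("keys.txt", "r") as f:
        # literal_eval parses the saved list without executing arbitrary code;
        # it assumes the keys are plain strings or numbers
        keys = ast.literal_eval(f.read())
    dictex = {}
    for key in keys:
        # index_col=0 restores the index that to_csv wrote as the first column
        dictex[key] = pd.read_csv("data_{}.csv".format(key), index_col=0)

    return dictex

dictex = loader()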

This saves the keys to one file and each DataFrame to its own file. My next step is to put all the data in one file.

I tried the following method, but it seems to work only with a one-dimensional dictionary. Reading the file back fails with this error message:

"ValueError: dictionary update sequence element #1 has length 0; 2 is required"

with open('datadict.csv', 'w', encoding='utf-8-sig') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in data.items():
        writer.writerow([key, value])
with open('datadict.csv', encoding='utf-8-sig') as csv_file:
    reader = csv.reader(csv_file)
    mydict = dict(reader)
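
For context on why this approach cannot restore the DataFrames even when the read succeeds: csv.writer stores str(value) for each cell, so a DataFrame is written as its printed display text rather than as data. The sketch below uses a small made-up dictionary and an in-memory buffer (both are assumptions for illustration, not the real data) to show what comes back. The ValueError quoted above is what dict() raises when the reader yields a row with no fields, such as a blank line. One common source of blank rows is opening the file without newline='' on Windows, although whether that applies here is a guess.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for the real dictionary of DataFrames
data = {"xxset": pd.DataFrame({"A": [1, 2], "B": [3, 4]})}

# In-memory replacement for open("datadict.csv", "w", newline="")
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for key, value in data.items():
    writer.writerow([key, value])  # the DataFrame is stored as str(value)

buffer.seek(0)
mydict = dict(csv.reader(buffer))

# The value that comes back is display text, not a DataFrame
print(type(mydict["xxset"]))
print(mydict["xxset"])
```

The round trip runs, but the values are strings that pandas cannot turn back into the original frames reliably.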

Here is a hand-made data set similar to what I am working with. I would like to write dictdf to a CSV file and read it back with the same structure.

import pandas as pd
import numpy as np
dates = pd.date_range('1/1/2000', periods=8)
df1 = pd.DataFrame(np.random.randn(8, 4),
                   index=dates, columns=['A', 'B', 'C', 'D'])

dates2 = pd.date_range('1/1/2000', periods=8)
df2 = pd.DataFrame(np.random.randn(8, 4),
                   index=dates2, columns=['A', 'B', 'C', 'D'])

dictdf={}
dictdf['xxset']=df1
dictdf['yyset']=df2

Thanks for your attention.

chengtah
  • Do you want all the csv's joined in 1 big dataframe? Is the structure (# and names of columns) identical in all csv's? – Niels Henkens Mar 05 '19 at 15:33
  • No, not a big data frame, I would like to put them in a big dictionary of dataframes, if possible. Their names and columns are the same. It looks fine in data[date] before I write it in. – chengtah Mar 05 '19 at 15:45
  • What is `data[date]` in your code? I don't see any `data` object. – Niels Henkens Mar 05 '19 at 15:47
  • Each day's value is a 957x16 dataframe, with the date as the key. I am trying to use two days of data to test it. – chengtah Mar 05 '19 at 16:03
  • So `data` is the same as your `dictex` variable in the code above? – Niels Henkens Mar 05 '19 at 16:08
  • import pandas as pd import numpy as np dates = pd.date_range('1/1/2000', periods=8) df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D']) dates2 = pd.date_range('1/1/2000', periods=8) df2 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D']) dictdf={} dictdf['xxset']=df1 dictdf['yyset']=df2 – chengtah Mar 06 '19 at 12:01

1 Answer

I don't know the exact structure of your keys.txt or your CSV files, but based on your code, I'd expect something like this to join all the CSV files into one DataFrame.

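import ast
import pandas as pd

# Reading data from keys
with open("keys.txt", "r") as f:
    # literal_eval parses the saved list without executing arbitrary code;
    # it assumes the keys are plain strings or numbers
    keys = ast.literal_eval(f.read())
list_of_dfs = []

# Read in all csv files and append to list
for key in keys:
    # based on your example; index_col=0 restores the index written by to_csv
    list_of_dfs.append(pd.read_csv("data_{}.csv".format(key), index_col=0))

# Join all dataframes into 1 big one
big_df = pd.concat(list_of_dfs)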
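
If each row needs to remember which dictionary key it came from, pd.concat can take the dictionary itself and put the keys into an outer index level. The following is only a sketch under assumptions: the helper names dict_to_csv and csv_to_dict and the level names "key" and "date" are made up for illustration, the inner index is assumed to hold dates, and the keys are assumed to be strings (other key types would come back as whatever read_csv infers).

```python
import io

import numpy as np
import pandas as pd


def dict_to_csv(dictdf, target):
    """Stack all frames under an outer index level and write one CSV."""
    combined = pd.concat(dictdf, names=["key", "date"])
    combined.to_csv(target)


def csv_to_dict(source):
    """Read the combined CSV and split it back into one frame per key."""
    combined = pd.read_csv(source, index_col=["key", "date"])
    result = {}
    for key, frame in combined.groupby(level="key"):
        frame = frame.droplevel("key")
        frame.index = pd.to_datetime(frame.index)  # assumes a date index
        result[key] = frame
    return result


# Small example in the shape of the sample data from the question
dates = pd.date_range("2000-01-01", periods=8)
dictdf = {
    "xxset": pd.DataFrame(np.random.randn(8, 4), index=dates, columns=list("ABCD")),
    "yyset": pd.DataFrame(np.random.randn(8, 4), index=dates, columns=list("ABCD")),
}

buffer = io.StringIO()  # a file path such as "datadict.csv" works the same way
dict_to_csv(dictdf, buffer)
buffer.seek(0)
restored = csv_to_dict(buffer)
```

This relies on every frame having the same columns, as stated in the comments on the question. Floats go through text, so the restored values may differ from the originals in the last digits, and index details such as the frequency of a date_range are not kept.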

EDIT

If you want to save the dictionary of DataFrames to one file, saving it as a pickle file might be a better option. See this question.
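
A minimal sketch of the pickle route, using the standard-library pickle module. The helper names save_dict and load_dict and the file name dictdf.pkl are placeholders chosen for this example, and the temporary directory is only there so the snippet cleans up after itself.

```python
import os
import pickle
import tempfile

import numpy as np
import pandas as pd


def save_dict(dictdf, path):
    """Write the whole dictionary of DataFrames to a single pickle file."""
    with open(path, "wb") as f:
        pickle.dump(dictdf, f)


def load_dict(path):
    """Load the dictionary back; only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


dates = pd.date_range("2000-01-01", periods=8)
dictdf = {
    "xxset": pd.DataFrame(np.random.randn(8, 4), index=dates, columns=list("ABCD")),
    "yyset": pd.DataFrame(np.random.randn(8, 4), index=dates, columns=list("ABCD")),
}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "dictdf.pkl")  # placeholder file name
    save_dict(dictdf, path)
    restored = load_dict(path)
```

Unlike CSV, this keeps the keys, dtypes and indexes as they were, at the cost of a binary file that is not human-readable and should only be loaded from a trusted source.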

Niels Henkens
  • The intention was to put a dictionary of dataframes into a CSV file. And I would like to read it back with the same data structure so I can continue the operation later on. I have added a sample data set for testing. – chengtah Mar 06 '19 at 12:12
  • So you want to write the dictionary WITH the data from the dataframes into 1 csv-file? That is not possible. A dictionary with dataframes as values doesn't match the 2D format of a csv-file. If you really want to keep the dictionary format with the underlying dataframes, saving it as a pickle file might be a better way. – Niels Henkens Mar 06 '19 at 21:09
  • Thanks. Saving it as a pickle file is an alternative. – chengtah Mar 07 '19 at 08:30