I'm quite new to Python, especially dictionaries, and can't find anything specific about what I am trying to do.

Essentially, I have an OrderedDict of pandas DataFrames (a bunch of Excel sheets that I read in and converted to DataFrames), and I would like to access those DataFrames individually, modify them, and then update them in the OrderedDict, but I'm not quite sure how to do so.

As stated, I am pretty new to this, so I know how to update the dataframe, but not how to store it back in the dictionary. Currently, my code looks like this:

for sheet in cons_excel_sheets:
    df = cons_excel_sheets[sheet]
    row = df[df['Row Labels'] == 'Grand Total'].index.tolist()[0]
    df = df.iloc[:row - 1]
    cleaned_dataframes_list.update(df)

This returns the following error (this works if I only update one dataframe at a time, i.e. without the for loop):

Traceback (most recent call last):
  File "C:\Users\USER\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-38dbbba27fa0>", line 5, in <module>
    row = df[df['Row Labels'] == 'Grand Total'].index.tolist()[0]
IndexError: list index out of range

Not sure how to fix this error, and also doubt that I am updating the OrderedDict correctly at the end of the for loop.

Any ideas?

RenierMeyer
  • Kindly share your data (DataFrame and dictionary) along with your expected output. Use this as a [guide](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – sammywemmy Mar 28 '20 at 10:59

1 Answer

You can try this out with some sample data:

import numpy as np
import pandas as pd

# DataFrames generated from the Excel files
value_df1 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                         columns=['a', 'b', 'c'])
value_df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                         columns=['a', 'b', 'c'])
value_df3 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                         columns=['a', 'b', 'c'])

# Making a dictionary of the DataFrames
# (named "dataframes" so it does not shadow the built-in dict)
dataframes = {
    "key0": value_df1,
    "key1": value_df2,
    "key2": value_df3
}

# Accessing a column in a particular DataFrame (e.g. value_df2, column b)
tempDataFrame = dataframes["key1"]
tempDataFrame = tempDataFrame['b']

# Iterate through the dictionary and update value(s) in each DataFrame
# (here, setting column a to 0), then store the result back under the same key.
# .items() yields (key, value) pairs; iterating the dict alone yields only keys.
for k, val in dataframes.items():
    tempDataFrame = val
    tempDataFrame['a'] = 0
    dataframes[k] = tempDataFrame

I hope this answers your question.
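
Applied to the loop in the question, a minimal sketch might look like the following. It assumes the IndexError comes from at least one sheet having no 'Grand Total' row, which would make the filtered index list empty, and that each result should be stored under its sheet name by key assignment rather than with `update(df)`. The sample sheets, the `cleaned_dataframes` name and the helper `trim_at_grand_total` are hypothetical and only there for illustration. The `row - 1` slice is kept from the question and assumes a default integer index.

```python
from collections import OrderedDict

import pandas as pd


def trim_at_grand_total(df, label_col="Row Labels", marker="Grand Total"):
    """Hypothetical helper: cut df off before the marker row, if there is one."""
    matches = df[df[label_col] == marker].index.tolist()
    if not matches:
        # No marker row in this sheet; matches[0] would raise IndexError here.
        return df
    row = matches[0]
    # Same slice as in the question (assumes a default 0..n-1 integer index);
    # max() keeps the stop value from going negative.
    return df.iloc[:max(row - 1, 0)]


# Made-up stand-in for the sheets read from Excel.
cons_excel_sheets = OrderedDict([
    ("sheet_a", pd.DataFrame({"Row Labels": ["x", "y", "", "Grand Total"],
                              "Value": [1, 2, 0, 3]})),
    ("sheet_b", pd.DataFrame({"Row Labels": ["p", "q"],
                              "Value": [4, 5]})),
])

# Store each cleaned DataFrame under its own sheet name.
cleaned_dataframes = OrderedDict()
for sheet, df in cons_excel_sheets.items():
    cleaned_dataframes[sheet] = trim_at_grand_total(df)
```

Assigning with `cleaned_dataframes[sheet] = ...` keeps one entry per sheet. If the real sheets always contain a 'Grand Total' row, the cause of the IndexError is something else and the guard would only hide it, so it is worth printing the sheet name for any sheet where `matches` is empty.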

  • Thanks for your feedback. I get the following when trying to loop over the dictionary's keys and values: Traceback (most recent call last): File "C:\Users\USER\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "", line 18, in for k,val in dict: ValueError: too many values to unpack (expected 2) – RenierMeyer Mar 28 '20 at 10:34
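
The ValueError in the comment above is what Python raises when a dictionary is iterated directly: that yields only the keys, so each key string gets unpacked into `k` and `val`. A small sketch with made-up data (plain lists stand in for the DataFrames, and the name `sheets` is hypothetical) shows the failure and the `.items()` form that avoids it:

```python
sheets = {"key0": [1, 2, 3], "key1": [4, 5, 6]}

# Iterating a dict directly yields only its keys, so this tries to unpack
# the four-character string "key0" into two names and fails.
try:
    for k, val in sheets:
        pass
except ValueError as exc:
    error_message = str(exc)

# .items() yields (key, value) pairs, which unpack cleanly into two names.
pairs = [(k, val) for k, val in sheets.items()]
```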