2

How can I work with a dictionary of dataframes please? Or, is there a better way to get an overview of my data? If I have for example:

Fruit   Qty Year
Apple   2   2016
Orange  1   2017
Mango   2   2016
Apple   9   2016
Orange  8   2015
Mango   7   2016
Apple   6   2016
Orange  5   2017
Mango   4   2015

Then I am trying to find out how many in total I get per year, for example:

        2015    2016    2017
Apple   0       11      0
Orange  8       0       6
Mango   4       9       0

I have written some code but it might not be useful:

import pandas as pd

# Fruit Data
df_1 = pd.DataFrame({'Fruit':['Apple','Orange','Mango','Apple','Orange','Mango','Apple','Orange','Mango'], 'Qty': [2,1,2,9,8,7,6,5,4], 'Year': [2016,2017,2016,2016,2015,2016,2016,2017,2015]})

# Create a list of Fruits
Fruits = df_1.Fruit.unique()

# Break down the dataframe by Year
df_2015 = df_1[df_1['Year'] == 2015]
df_2016 = df_1[df_1['Year'] == 2016]
df_2017 = df_1[df_1['Year'] == 2017]

# Create a dataframe dictionary of Fruits
Dict_2015 = {elem : pd.DataFrame for elem in Fruits}
Dict_2016 = {elem : pd.DataFrame for elem in Fruits}
Dict_2017 = {elem : pd.DataFrame for elem in Fruits}

# Store the Qty for each Fruit x each Year
for Fruit in Dict_2015.keys():
    Dict_2015[Fruit] = df_2015[:][df_2015.Fruit == Fruit]
for Fruit in Dict_2016.keys():
    Dict_2016[Fruit] = df_2016[:][df_2016.Fruit == Fruit]
for Fruit in Dict_2017.keys():
    Dict_2017[Fruit] = df_2017[:][df_2017.Fruit == Fruit]
Toto
  • 89,455
  • 62
  • 89
  • 125
R. Cox
  • 819
  • 8
  • 25

2 Answers2

3

You can use pandas.pivot_table.

res = df.pivot_table(index='Fruit', columns=['Year'], values='Qty',
                     aggfunc=np.sum, fill_value=0)

print(res)

Year    2015  2016  2017
Fruit                   
Apple      0    17     0
Mango      4     9     0
Orange     8     0     6

For guidance on usage, see How to pivot a dataframe.

jpp
  • 159,742
  • 34
  • 281
  • 339
2

jpp has already posted an answer in the format you wanted. However, since your question seemed like you are open to other views, I thought of sharing another way. Not exactly in the format you posted but this how I usually do it.

df = df.groupby(['Fruit', 'Year']).agg({'Qty': 'sum'}).reset_index()

This will look something like:

Fruit  Year  Sum
Apple  2015    0
Apple  2016   11
Apple  2017    0
Orange 2015    8
Orange 2016    0
Orange 2017    6
Mango  2015    4
Mango  2016    9
Mango  2017    0
Ankur Sinha
  • 6,473
  • 7
  • 42
  • 73