What is the best way to create multiple dictionaries from a pandas DataFrame based on column values?

My dataframe has this format:

    evtnum    pcode   energy
1   1         a       20.0
2   1         a       30.0
3   1         b       29.0
4   1         a       34.0
5   2         c       20.0
6   2         a       15.0
7   3         a        3.0
8   3         b        2.0 
9   3         c       25.0
10  4         h       28.0
11  5         a       43.6
12  5         c       20.3
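
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample rows above as a DataFrame. The variable name `df` and the 1-based index are assumptions taken from how the table is printed; the real data has many more rows.

```python
import pandas as pd

# Sample rows copied from the table above; the index starts at 1 as printed.
df = pd.DataFrame(
    {
        "evtnum": [1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5],
        "pcode": ["a", "a", "b", "a", "c", "a", "a", "b", "c", "h", "a", "c"],
        "energy": [20.0, 30.0, 29.0, 34.0, 20.0, 15.0,
                   3.0, 2.0, 25.0, 28.0, 43.6, 20.3],
    },
    index=range(1, 13),
)
print(df)
```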

evtnum takes values from 1 to 5000, and pcode is one of 25 different letters. I have a set of these letters:

pcode_set = [a,b,c,d,h,...]

So, I want to obtain one dictionary per evtnum, each of length len(pcode_set), holding the number of occurrences of each letter in that event and the mean energy of that letter in that event. Something like this:

 dict_1 = {a : [timesthat"a"appears in evtnum1, 
                energy mean value of a in evtnum1], 
           b : [timesthat"b"appears in evtnum1, 
                energy mean value of b in evtnum1]  
          ...
          }

 dict_2 = {a : [timesthat"a"appears in evtnum2, 
                energy mean value of a in evtnum2], 
           b : [timesthat"b"appears in evtnum2, 
                energy mean value of b in evtnum2]  
          ...
          }
...

 dict_5000 = {a : [timesthat"a"appears in evtnum5000, 
                energy mean value of a in evtnum5000], 
              b : [timesthat"b"appears in evtnum5000, 
                energy mean value of b in evtnum5000]  
             ...
          }

Please don't tell me how to count the letters' occurrences or how to calculate the mean value; these were just examples. I only want to know how to create multiple dictionaries and fill them based on the column values of the DataFrame.

Laura

1 Answer

Using your example, this script should do the trick:

import sys

thismodule = sys.modules[__name__]

# Count rows and average the energy for each (evtnum, pcode) pair.
# Named aggregation avoids aggregating the grouping column itself.
df1 = (df.groupby(['evtnum', 'pcode'])
         .agg(num_pcode=('energy', 'size'), mean_energy=('energy', 'mean'))
         .reset_index())

# Create one module-level variable dict_<evtnum> per event.
for evt in df1.evtnum.unique():
    name = 'dict_' + str(evt)
    df_ = df1[df1.evtnum == evt].drop(columns='evtnum').set_index('pcode').to_dict('index')
    setattr(thismodule, name, df_)

# Assumes evtnum runs contiguously from 1 to its maximum value.
for number in range(df1.evtnum.max()):
    print(number + 1)
    print(eval('dict_' + str(number + 1)))

Prints this:

1
{'a': {'num_pcode': 3, 'mean_energy': 28.0}, 'b': {'num_pcode': 1, 'mean_energy': 29.0}}
2
{'a': {'num_pcode': 1, 'mean_energy': 15.0}, 'c': {'num_pcode': 1, 'mean_energy': 20.0}}
3
{'a': {'num_pcode': 1, 'mean_energy': 3.0}, 'b': {'num_pcode': 1, 'mean_energy': 2.0}, 'c': {'num_pcode': 1, 'mean_energy': 25.0}}
4
{'h': {'num_pcode': 1, 'mean_energy': 28.0}}
5
{'a': {'num_pcode': 1, 'mean_energy': 43.6}, 'c': {'num_pcode': 1, 'mean_energy': 20.3}}
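
As an alternative sketch, not part of the script above: a single dictionary keyed by evtnum holds the same per-event dictionaries without creating thousands of module-level variables or calling eval. The name `event_dicts` is hypothetical, and the sample rows are the ones from the question.

```python
import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "evtnum": [1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5],
    "pcode": ["a", "a", "b", "a", "c", "a", "a", "b", "c", "h", "a", "c"],
    "energy": [20.0, 30.0, 29.0, 34.0, 20.0, 15.0,
               3.0, 2.0, 25.0, 28.0, 43.6, 20.3],
})

# One row per (evtnum, pcode) pair, indexed by both columns.
stats = df.groupby(["evtnum", "pcode"]).agg(
    num_pcode=("energy", "size"),
    mean_energy=("energy", "mean"),
)

# event_dicts (hypothetical name) maps each evtnum to its pcode dictionary.
event_dicts = {
    evt: group.droplevel("evtnum").to_dict("index")
    for evt, group in stats.groupby(level="evtnum")
}

print(event_dicts[1])
```

Looking up an event is then `event_dicts[evt]` rather than `eval('dict_' + str(evt))`, and it also works when the evtnum values are not contiguous.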
Jorge
  • Amazing! I definitely wouldn't have thought of doing this. Thank you so much. – Laura Feb 14 '19 at 02:00
  • Hi @Jorge, a question: I now want to sum values from different dictionaries that share the same key, so I am using from collections import Counter and then, for example, col_1 = Counter(eval(df_1)), col_2 = Counter(eval(df_2)). If I then try sum_dicts = col_1 + col_2, I get this error: newcount = count + other[elem] TypeError: unsupported operand type(s) for +: 'dict' and 'dict'. Do you know why that is and how to solve it? Thanks again – Laura Feb 16 '19 at 19:57
  • @Laura I am not sure what you are trying to achieve. It seems that what you need can be achieved using groupby on the original dataframe, without going through the dictionaries. Have you tried that? – Jorge Feb 16 '19 at 21:33
  • I haven't tried it, because I have to add more information when I create the dictionaries. I posted an update of my question with the new problem here: https://stackoverflow.com/questions/54727463/sum-values-of-dynamically-created-dictionaries-using-counter-from-collections Thanks! – Laura Feb 16 '19 at 21:38
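
The error in the comments is consistent with how Counter addition works: it applies + to the values stored under each key, and here those values are themselves dictionaries. A minimal sketch of two ways around it, assuming the goal is to total counts per pcode across events (the names `col_1`, `col_2` and `totals` are illustrative, and the data is the sample from the question):

```python
from collections import Counter

import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "evtnum": [1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5],
    "pcode": ["a", "a", "b", "a", "c", "a", "a", "b", "c", "h", "a", "c"],
    "energy": [20.0, 30.0, 29.0, 34.0, 20.0, 15.0,
               3.0, 2.0, 25.0, 28.0, 43.6, 20.3],
})

# Counters with plain numeric values can be added with +.
col_1 = Counter(df.loc[df.evtnum == 1, "pcode"])
col_2 = Counter(df.loc[df.evtnum == 2, "pcode"])
print(col_1 + col_2)

# The groupby route suggested in the comments: aggregate over all events
# directly on the original DataFrame, skipping the dictionaries.
totals = df.groupby("pcode").agg(
    num_pcode=("energy", "size"),
    total_energy=("energy", "sum"),
)
print(totals)
```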