3

The dataframe looks like this:

d = {'ID': [1, 2,3],'V':['F','G','H'],'AAA':[0,1,1],'AA':[0,2,2],'A':[0,3,3],'BBB':[0,4,4]}
df2 = pd.DataFrame(data=d)

and dictionary looks like this:

dct ={1:{'F':[2,3,5],'G':[3,5,6],'H':[6,7,8]},
      2:{'F':[1,3,5],'G':[8,5,6],'H':[9,7,8]},
      3:{'F':[5,3,5],'G':[4,5,6],'H':[10,7,8]}
     }

Based on value of 'ID' and 'V' I can access list from dictionary i.e. dct[2]['G']. How can I apply a merge on this?

In short, I want to append particular list as a row to dataframe.

Intended result should be like following:

op_d = {'ID': [1, 2,3],'V':['F','G','H'],'AAA':[0,1,1],'AA':[0,2,2],'A':[0,3,3],'BBB':[0,4,4],'Q1':[2,8,10],'Q2':[3,5,7],'Q3':[5,6,8]}
output_df = pd.DataFrame(data=op_d )
yatu
  • 86,083
  • 12
  • 84
  • 139
Prish
  • 605
  • 7
  • 21
  • 1
    @anky_91 typo is fixed. basically I am trying to append particular list based on key to the dataframe as a row. something better than iterating over rows, fetching list by keys of 'ID' and 'V' and then merging. Trying to avoid row iteration. – Prish Jan 16 '20 at 16:09
  • What's the use case for this? – AMC Jan 16 '20 at 16:47
  • It's a part of a bigger problem where I calculate probability of default for nested parameters. see my yesterday's question: https://stackoverflow.com/questions/59758331/advice-on-data-structure-for-scaler-operation-between-dictionary-and-dataframe – Prish Jan 16 '20 at 16:49

3 Answers3

7

Use df.lookup after creating a dataframe from the dictionary dct and covert to a dataframe then assign:

m = pd.DataFrame(dct).T
s = df2.set_index('ID')['V']
output = df2.assign(**pd.DataFrame(m.lookup(s.index,s).tolist(),columns=['Q1','Q2','Q3']))

print(output)

   ID  V  AAA  AA  A  BBB  Q1  Q2  Q3
0   1  F    0   0  0    0   2   3   5
1   2  G    1   2  3    4   8   5   6
2   3  H    1   2  3    4  10   7   8

For dynamic renaming use:

df2.assign(**pd.DataFrame(m.lookup(s.index,s).tolist()).rename(columns=lambda x:f"Q{x+1}"))

   ID  V  AAA  AA  A  BBB  Q1  Q2  Q3
0   1  F    0   0  0    0   2   3   5
1   2  G    1   2  3    4   8   5   6
2   3  H    1   2  3    4  10   7   8

Or:

df2.assign(**pd.DataFrame(m.lookup(s.index,s).tolist()).add_prefix('Q'))

   ID  V  AAA  AA  A  BBB  Q0  Q1  Q2
0   1  F    0   0  0    0   2   3   5
1   2  G    1   2  3    4   8   5   6
2   3  H    1   2  3    4  10   7   8
anky
  • 74,114
  • 11
  • 41
  • 70
  • This is perfect. Imagine having 100s of columns like Qn. anyway to add dynamic naming to it? like len(list) and then assign numbers 1 to value of len(list). – Prish Jan 16 '20 at 16:13
  • @Prish sure is it okay if the Q starts from 0? like `df2.assign(**pd.DataFrame(m.lookup(s.index,s).tolist()).add_prefix('Q'))` – anky Jan 16 '20 at 16:15
  • 1
    Yeah, I think it will be alright. I have some idea for it in final staging. Thanks for the help @anky_91 – Prish Jan 16 '20 at 16:18
  • @Prish I have added 2 methods for dynamic renaming. Hope it helps. Pleasure – anky Jan 16 '20 at 16:19
  • 1
    Where can I find documentation for '**' you applied to dataframe? I would like to learn more about it. EDIT: never mind. Figured it out. – Prish Jan 16 '20 at 16:20
  • 1
    @Prish I recommend you [this](https://stackoverflow.com/questions/3394835/use-of-args-and-kwargs) – anky Jan 16 '20 at 16:21
5

You can turn your dict into a DataFrame with a MultiIndex then merge.

d2 = {(k, k2): v2 for k,v in dct.items() for k2,v2 in v.items()}

u = pd.DataFrame(d2.values(), index=d2.keys())
#u = pd.DataFrame(d2).T 
u.columns = [f'Q{i+1}' for i in u.columns]

df2.merge(u, left_on=['ID', 'V'], right_index=True)

#   ID  V  AAA  AA  A  BBB  Q1  Q2  Q3
#0   1  F    0   0  0    0   2   3   5
#1   2  G    1   2  3    4   8   5   6
#2   3  H    1   2  3    4  10   7   8
ALollz
  • 57,915
  • 7
  • 66
  • 89
4

my best with DataFrame.lookup.

df2.assign(**pd.DataFrame(dict(enumerate(pd.DataFrame(dct)
                                           .lookup(df2['V'],
                                                   df2['ID']),1)))
               .add_prefix('Q'))

   ID  V  AAA  AA  A  BBB  Q1  Q2  Q3
0   1  F    0   0  0    0   2   8  10
1   2  G    1   2  3    4   3   5   7
2   3  H    1   2  3    4   5   6   8
ansev
  • 30,322
  • 5
  • 17
  • 31