I have the following DataFrame:

import pandas as pd

df = pd.DataFrame({'A': [['A1', 'A2', 'A3'], ['A1', 'A2']],
                   'B': [['B1', 'B2', 'B3'], ['B1', 'B2']],
                   'C': [['C1', 'C2', 'C3'], ['C1', 'C2']]
                  })

        A            B            C
0  [A1, A2, A3] [B1, B2, B3] [C1, C2, C3]
1  [A1, A2]     [B1, B2]     [C1, C2]

As you can see, the columns 'A', 'B' and 'C' contain lists whose length can vary from row to row (but within any one row, all three lists have the same length).

What I would like to do is add a new column containing a nested dictionary that combines the lists on the same row, element by element. For example, this is the dictionary that should end up in the new column (let's call it 'instance_details') for row 0:

{
    'instance_1': {
        'A': 'A1', 
        'B': 'B1', 
        'C': 'C1'
    },
    'instance_2': {
        'A': 'A2', 
        'B': 'B2', 
        'C': 'C2'
    },
    'instance_3': {
        'A': 'A3', 
        'B': 'B3', 
        'C': 'C3'
    }
}

I tried an intermediate step that merges the lists with zip(), but I couldn't get the result I wanted. I would also need to iterate over the lists to build the final dictionary, and I don't know what the correct approach to this problem is.

Thank you for your help!

Tatsuya

1 Answer

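I would do this by unnesting. Note that unnesting is a custom helper, not a pandas built-in, and its definition is not included here; it expands the list columns to one row per list element while repeating the original index, which is what groupby(level=0) relies on.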
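
Since the helper itself is not shown, here is a minimal stand-in, offered only as a sketch. It assumes pandas 1.3 or later, where DataFrame.explode accepts a list of columns, and it assumes the lists in a row all have the same length. The name and signature mirror the call used in this answer, but the body is my assumption and not the original helper; the axis argument is accepted only so that the call below works unchanged.

```python
import pandas as pd


def unnesting(df, explode, axis=1):
    """Hypothetical stand-in for the undefined helper.

    Returns one row per list element, with the original index repeated.
    `axis` is ignored; it is kept only for call compatibility.
    """
    # Multi-column explode needs pandas >= 1.3 and equal list lengths per row.
    return df.explode(explode)


df = pd.DataFrame({'A': [['A1', 'A2', 'A3'], ['A1', 'A2']],
                   'B': [['B1', 'B2', 'B3'], ['B1', 'B2']],
                   'C': [['C1', 'C2', 'C3'], ['C1', 'C2']]})

# Row 0 becomes three rows labelled 0, row 1 becomes two rows labelled 1.
flat = unnesting(df, list(df), axis=1)
```

With this in place, grouping on the repeated index (level=0) collects the elements that came from the same original row.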

# 'records' is spelled out because the 'r' shorthand was removed in pandas 2.0
s = unnesting(df, list(df), axis=1).groupby(level=0).apply(lambda x: dict(zip(range(len(x)), x.to_dict('records'))))
0    {0: {'A': 'A1', 'B': 'B1', 'C': 'C1'}, 1: {'A'...
1    {0: {'A': 'A1', 'B': 'B1', 'C': 'C1'}, 1: {'A'...
dtype: object

#s.iloc[0]
#{0: {'A': 'A1', 'B': 'B1', 'C': 'C1'}, 1: {'A': 'A2', 'B': 'B2', 'C': 'C2'}, 2: {'A': 'A3', 'B': 'B3', 'C': 'C3'}}
df['row_dict']=s
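
The result above is keyed by 0, 1, 2 rather than by 'instance_1', 'instance_2', and so on. If the exact keys from the question are needed, one possible alternative is to skip unnesting and build the dictionaries directly with zip(). This is only a sketch under the question's assumptions (the lists in a row have equal length); the cols list and the 'instance_details' column name are taken from the question, not from the code above.

```python
import pandas as pd

df = pd.DataFrame({'A': [['A1', 'A2', 'A3'], ['A1', 'A2']],
                   'B': [['B1', 'B2', 'B3'], ['B1', 'B2']],
                   'C': [['C1', 'C2', 'C3'], ['C1', 'C2']]})

cols = ['A', 'B', 'C']

df['instance_details'] = [
    {
        # Inner zip pairs column names with the values at one position.
        f'instance_{i}': dict(zip(cols, values))
        # zip(*lists) walks the row's lists in parallel: ('A1', 'B1', 'C1'), ...
        for i, values in enumerate(zip(*lists), start=1)
    }
    # Outer zip yields, for each row, the tuple of that row's lists.
    for lists in zip(*(df[c] for c in cols))
]
```

Because zip() stops at the shortest input, a row whose lists differ in length would be silently truncated, so it may be worth validating the lengths first if that can happen in your data.
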
BENY