14

Say I have three dictionaries:

dictionary_col2

{'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}

dictionary_col3

{'MOB': ['MOB_L001_R1_001.gz',
         'MOB_L002_R1_001.gz'],
 'ASP': ['ASP_L001_R1_001.gz',
         'ASP_L002_R1_001.gz'],
 'YIP': ['YIP_L001_R1_001.gz',
         'YIP_L002_R1_001.gz']}

dictionary_col4

{'MOB': ['MOB_L001_R2_001.gz',
         'MOB_L002_R2_001.gz'],
 'ASP': ['ASP_L001_R2_001.gz',
         'ASP_L002_R2_001.gz'],
 'YIP': ['YIP_L001_R2_001.gz',
         'YIP_L002_R2_001.gz']}

I want to convert the above dictionaries into a data frame. I have tried the following:

df = pd.DataFrame([dictionary_col2, dictionary_col3, dictionary_col4])

The df data frame looks like this:

                ASP MOB YIP
0   [1, 2]  [1, 2]  [1, 2]
1   [ASP_L001_R1_001.gz, ASP_L002_R1_001.gz]    [MOB_L001_R1_001.gz, MOB_L002_R1_001.gz]    [YIP_L001_R1_001.gz, YIP_L002_R1_001.gz]
2   [ASP_L001_R2_001.gz, ASP_L002_R2_001.gz]    [MOB_L001_R2_001.gz, MOB_L002_R2_001.gz]    [YIP_L001_R2_001.gz, YIP_L002_R2_001.gz]

My aim is to have a data frame with the following columns:

    col1  col2 col3              col4 
    MOB   1   MOB_L001_R1_001.gz MOB_L001_R2_001.gz      
    MOB   2   MOB_L002_R1_001.gz MOB_L002_R2_001.gz 
    ASP   1   ASP_L001_R1_001.gz ASP_L001_R2_001.gz 
    ASP   2   ASP_L002_R1_001.gz ASP_L002_R2_001.gz 
    YIP   1   YIP_L001_R1_001.gz YIP_L001_R2_001.gz
    YIP   2   YIP_L002_R1_001.gz YIP_L002_R2_001.gz

Any help/suggestions are appreciated!!

ARJ
  • Possible duplicate of [Convert list of dictionaries to a pandas DataFrame](https://stackoverflow.com/questions/20638006/convert-list-of-dictionaries-to-a-pandas-dataframe) – michjnich Oct 04 '19 at 06:37

4 Answers

7
pd.DataFrame({'col2': pd.DataFrame(dictionary_col2).unstack(),
              'col3': pd.DataFrame(dictionary_col3).unstack(),
              'col4': pd.DataFrame(dictionary_col4).unstack()}).reset_index(level=0)

returns

  level_0  col2                col3                col4
0     ASP     1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
1     ASP     2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
0     MOB     1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
1     MOB     2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
0     YIP     1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1     YIP     2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
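
The key column in the output above is called level_0 and the row labels repeat (0, 1, 0, 1, ...). A minimal sketch of one way to get the col1 name and a fresh row index, assuming the three dictionaries from the question; the rename and the second reset_index are additions for illustration, not part of the original answer:

```python
import pandas as pd

dictionary_col2 = {'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3 = {'MOB': ['MOB_L001_R1_001.gz', 'MOB_L002_R1_001.gz'],
                   'ASP': ['ASP_L001_R1_001.gz', 'ASP_L002_R1_001.gz'],
                   'YIP': ['YIP_L001_R1_001.gz', 'YIP_L002_R1_001.gz']}
dictionary_col4 = {'MOB': ['MOB_L001_R2_001.gz', 'MOB_L002_R2_001.gz'],
                   'ASP': ['ASP_L001_R2_001.gz', 'ASP_L002_R2_001.gz'],
                   'YIP': ['YIP_L001_R2_001.gz', 'YIP_L002_R2_001.gz']}

# Each unstack() yields a Series indexed by (dictionary key, list position).
df = pd.DataFrame({'col2': pd.DataFrame(dictionary_col2).unstack(),
                   'col3': pd.DataFrame(dictionary_col3).unstack(),
                   'col4': pd.DataFrame(dictionary_col4).unstack()})

# Move the key level into a column, give it the requested name,
# then drop the leftover list-position index.
df = (df.reset_index(level=0)
        .rename(columns={'level_0': 'col1'})
        .reset_index(drop=True))

print(df)
```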
eumiro
  • This solution best works for my question as it generates the column name as well. Thank you! – ARJ Sep 17 '19 at 14:21
4

IIUC, you can do:

# d1, d2, d3 are dictionary_col2, dictionary_col3, dictionary_col4 from the question
pd.concat([pd.DataFrame(d).stack() for d in (d1,d2,d3)], axis=1)

Output:

       0                   1                   2
0 MOB  1  MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
  ASP  1  ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
  YIP  1  YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1 MOB  2  MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
  ASP  2  ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
  YIP  2  YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
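
The result above has integer column labels (0, 1, 2) and keeps the key in the index. A rough sketch of how the same stack approach could produce the col1 to col4 names from the question, assuming the question's dictionaries; the keys argument and the index level names 'pos' and 'col1' are choices made here for illustration:

```python
import pandas as pd

dictionary_col2 = {'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3 = {'MOB': ['MOB_L001_R1_001.gz', 'MOB_L002_R1_001.gz'],
                   'ASP': ['ASP_L001_R1_001.gz', 'ASP_L002_R1_001.gz'],
                   'YIP': ['YIP_L001_R1_001.gz', 'YIP_L002_R1_001.gz']}
dictionary_col4 = {'MOB': ['MOB_L001_R2_001.gz', 'MOB_L002_R2_001.gz'],
                   'ASP': ['ASP_L001_R2_001.gz', 'ASP_L002_R2_001.gz'],
                   'YIP': ['YIP_L001_R2_001.gz', 'YIP_L002_R2_001.gz']}

dicts = (dictionary_col2, dictionary_col3, dictionary_col4)

# keys= labels the concatenated Series, so the columns are named
# col2, col3, col4 instead of 0, 1, 2.
wide = pd.concat([pd.DataFrame(d).stack() for d in dicts],
                 axis=1, keys=['col2', 'col3', 'col4'])

# stack() produces an index of (list position, dictionary key).
wide.index.names = ['pos', 'col1']

# Turn the key level into a regular column and discard the position level.
out = wide.reset_index(level='col1').reset_index(drop=True)

print(out)
```

With stack, the rows are grouped by list position rather than by key, as in the output shown in this answer.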
Quang Hoang
4
dict_list = [dictionary_col2, dictionary_col3, dictionary_col4]

df = pd.concat([pd.DataFrame.from_dict(x, orient='index').unstack() for x in dict_list], axis=1)

output:

>>> df

        0   1                   2
0   MOB 1   MOB_L001_R1_001.gz  MOB_L001_R2_001.gz
    ASP 1   ASP_L001_R1_001.gz  ASP_L001_R2_001.gz
    YIP 1   YIP_L001_R1_001.gz  YIP_L001_R2_001.gz
1   MOB 2   MOB_L002_R1_001.gz  MOB_L002_R2_001.gz
    ASP 2   ASP_L002_R1_001.gz  ASP_L002_R2_001.gz
    YIP 2   YIP_L002_R1_001.gz  YIP_L002_R2_001.gz
Brian
3

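You can use concat with explode. Note that explode requires pandas 0.25.0 or later.

# d1, d2, d3 are dictionary_col2, dictionary_col3, dictionary_col4 from the question
pd.concat([pd.Series(x).explode() for x in [d1,d2,d3]],axis=1)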
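
For completeness, here is a self-contained sketch of the explode approach applied to all three dictionaries from the question. Passing a dict to concat to name the columns, the rename_axis call, and the astype(int) cast are assumptions added for illustration, not part of the answer above:

```python
import pandas as pd

dictionary_col2 = {'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3 = {'MOB': ['MOB_L001_R1_001.gz', 'MOB_L002_R1_001.gz'],
                   'ASP': ['ASP_L001_R1_001.gz', 'ASP_L002_R1_001.gz'],
                   'YIP': ['YIP_L001_R1_001.gz', 'YIP_L002_R1_001.gz']}
dictionary_col4 = {'MOB': ['MOB_L001_R2_001.gz', 'MOB_L002_R2_001.gz'],
                   'ASP': ['ASP_L001_R2_001.gz', 'ASP_L002_R2_001.gz'],
                   'YIP': ['YIP_L001_R2_001.gz', 'YIP_L002_R2_001.gz']}

named = {'col2': dictionary_col2,
         'col3': dictionary_col3,
         'col4': dictionary_col4}

# explode() turns each list element into its own row and repeats the key
# in the index; the dict keys passed to concat become the column names.
df = pd.concat({name: pd.Series(d).explode() for name, d in named.items()},
               axis=1)

# Name the index and move it into a regular column.
df = df.rename_axis('col1').reset_index()

# explode() returns object dtype, so cast the numeric column back to int.
df['col2'] = df['col2'].astype(int)

print(df)
```

This relies on all three dictionaries having the same keys, in the same order, with lists of equal length, so that the exploded Series share an identical index.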
BENY