I have a nested Python dictionary:

d = {
    'CON-2': {'gene-ODF3': [2.0, 44474], 'gene-SCGB1C1': [0.184937, 36615], 'gene-TRNAN-GUU-19': [32.0, 443]},
    'CON-1': {'gene-ODF3': [10.00, 44474], 'gene-SCGB1C1': [0.184937, 36615], 'gene-TRNAN-GUU-19': [30.0, 443], 'gene-LOC103247846': [20.0, 22111]},
}

I would like to plot the FPKM of each gene (the first value) against its DNA transcript abundance (the second value) on a scatterplot. I have tried a few different things, such as:

CON_1=pd.DataFrame(d['CON-1'].items(),columns=['FPKM','Fraction-0'])
CON_2=pd.DataFrame(d['CON-2'].items(),columns=['FPKM','Fraction-0'])

df=pd.DataFrame.from_dict({(i,j): d[i][j]
                           for i in d.keys()
                           for j in d[i].keys()},
                           orient='index')

But I cannot separate the two values from each other. I would like to generate a separate data frame for each condition (CON-1 and CON-2), like this:

gene       FPKM    DNA-abundance
gene-ODF3  2.0     44474
Sarah
  • Does this answer your question? [Nested Dictionary to MultiIndex pandas DataFrame (3 level)](https://stackoverflow.com/questions/30384581/nested-dictionary-to-multiindex-pandas-dataframe-3-level) – Joe Ferndz Dec 08 '20 at 21:38

1 Answer

import pandas as pd
# Expand each [FPKM, abundance] list into two columns, then name them
pd.DataFrame(d)['CON-1'].apply(pd.Series)\
                        .rename(columns={0:'FPKM',1:'DNA-abundance'})
#                        FPKM  DNA-abundance
#gene-ODF3          10.000000        44474.0
#gene-SCGB1C1        0.184937        36615.0
#gene-TRNAN-GUU-19  30.000000          443.0
#gene-LOC103247846  20.000000        22111.0

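The same works for the other condition ('CON-2'). Note that pd.DataFrame(d) aligns both conditions on the union of all genes, so a gene missing from one condition (here gene-LOC103247846 in CON-2) shows up as an all-NaN row, which you can remove with .dropna().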
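
As an alternative, here is a minimal sketch that builds one data frame per condition directly from the inner dictionaries with pd.DataFrame.from_dict(..., orient='index'), so genes absent from a condition never appear. The name `frames` and the index label 'gene' are my own choices for illustration, not something from the question.

```python
import pandas as pd

d = {
    'CON-2': {'gene-ODF3': [2.0, 44474], 'gene-SCGB1C1': [0.184937, 36615],
              'gene-TRNAN-GUU-19': [32.0, 443]},
    'CON-1': {'gene-ODF3': [10.00, 44474], 'gene-SCGB1C1': [0.184937, 36615],
              'gene-TRNAN-GUU-19': [30.0, 443], 'gene-LOC103247846': [20.0, 22111]},
}

# One DataFrame per condition: each gene's [FPKM, abundance] list becomes a row.
# `frames` is a hypothetical name chosen for this sketch.
frames = {
    cond: pd.DataFrame.from_dict(genes, orient='index',
                                 columns=['FPKM', 'DNA-abundance'])
            .rename_axis('gene')  # label the index so it reads like a 'gene' column
    for cond, genes in d.items()
}

print(frames['CON-2'])
```

If matplotlib is installed, something like frames['CON-1'].plot.scatter(x='DNA-abundance', y='FPKM') should then give the scatterplot for that condition.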

DYZ