Here's my example data:

import pandas as pd

df = pd.DataFrame(
    [[4058.502197, 105987.898438, 106104.445312, 122042.820312, 'CTRL'],
     [44900.929688, 434754.15625, 217806.3125, 416195.5, 'CTRL'],
     [2196.992188, 155825.703125, 59135.789062, 87463.296875, 'CTRL2']],
    columns=['TNKS1BP1-492-520', 'TNKS1BP1-663-683', 'TNKS1BP1-1023-1043', 'TNKS1BP1-1664-1676', 'sample'])

Here's a solution that gets what I want (for a violin plot):

tmp = []
for col in df.columns:
    if col != 'sample':
        tmp_df = df[[col, 'sample']].copy()
        tmp_df.columns = ['abun', 'sample']
        tmp_df['prot_loc'] = col
        tmp.append(tmp_df)

df2 = pd.concat(tmp)
df2

            abun  sample            prot_loc
0    4058.502197    CTRL    TNKS1BP1-492-520
1   44900.929688    CTRL    TNKS1BP1-492-520
2    2196.992188   CTRL2    TNKS1BP1-492-520
0  105987.898438    CTRL    TNKS1BP1-663-683
1  434754.156250    CTRL    TNKS1BP1-663-683
2  155825.703125   CTRL2    TNKS1BP1-663-683
0  106104.445312    CTRL  TNKS1BP1-1023-1043
1  217806.312500    CTRL  TNKS1BP1-1023-1043
2   59135.789062   CTRL2  TNKS1BP1-1023-1043
0  122042.820312    CTRL  TNKS1BP1-1664-1676
1  416195.500000    CTRL  TNKS1BP1-1664-1676
2   87463.296875   CTRL2  TNKS1BP1-1664-1676

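Is there a way to get there with stack (or another built-in function)? The main issue with stack is that it returns a Series with a MultiIndex, and it stacks the 'sample' column along with the values, whereas I want the column names in a categorical column.

For example: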

df3 = df.stack()
df3

0  TNKS1BP1-492-520        4058.502197
   TNKS1BP1-663-683      105987.898438
   TNKS1BP1-1023-1043    106104.445312
   TNKS1BP1-1664-1676    122042.820312
   sample                         CTRL
1  TNKS1BP1-492-520       44900.929688
   TNKS1BP1-663-683       434754.15625
   TNKS1BP1-1023-1043      217806.3125
   TNKS1BP1-1664-1676         416195.5
   sample                         CTRL

Thanks!

Liam McIntyre
`df.melt(value_vars=['TNKS1BP1-492-520', 'TNKS1BP1-663-683', 'TNKS1BP1-1023-1043', 'TNKS1BP1-1664-1676'], id_vars='sample', var_name='prot_loc', value_name='abun')` – Nick Apr 12 '23 at 05:51
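
Following Nick's comment, here is a minimal sketch of the melt approach on the example data, plus a stack-based variant for comparison. The names `value_cols`, `melted` and `stacked` are just illustrative. Leaving out `value_vars` relies on melt unpivoting every column not listed in `id_vars`, and the explicit `pd.Categorical` step is only an assumption about what "categorical column" should mean here (it keeps the original column order as the category order).

```python
import pandas as pd

df = pd.DataFrame(
    [[4058.502197, 105987.898438, 106104.445312, 122042.820312, 'CTRL'],
     [44900.929688, 434754.15625, 217806.3125, 416195.5, 'CTRL'],
     [2196.992188, 155825.703125, 59135.789062, 87463.296875, 'CTRL2']],
    columns=['TNKS1BP1-492-520', 'TNKS1BP1-663-683',
             'TNKS1BP1-1023-1043', 'TNKS1BP1-1664-1676', 'sample'])

# Hypothetical helper list: every column except the identifier column.
value_cols = [c for c in df.columns if c != 'sample']

# melt: 'sample' is kept as an identifier, all other columns are unpivoted
# into a (prot_loc, abun) pair of columns.
melted = df.melt(id_vars='sample', var_name='prot_loc', value_name='abun')

# Optional: make prot_loc a real categorical, ordered as in the original frame.
melted['prot_loc'] = pd.Categorical(melted['prot_loc'], categories=value_cols)

# stack variant: move 'sample' into the index first so it is not stacked
# together with the numeric values, then turn the index levels back into columns.
stacked = (
    df.set_index('sample', append=True)
      .rename_axis(columns='prot_loc')   # becomes the name of the stacked level
      .stack()
      .rename('abun')
      .reset_index(level=['sample', 'prot_loc'])
)
```

Both results hold the same 12 values in `sample`, `prot_loc` and `abun` columns. The melt result is ordered column by column with a fresh 0..11 index, while the stack result is ordered row by row and keeps the original row labels. For a violin plot the row order should not matter, so melt is probably the simpler choice.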