I have a data frame created this way:
import pandas as pd
d = {'gene': ['foo', 'qux', 'bar', 'bin'],
     'one': [1., 2., 3., 1.],
     'two': [4., 3., 2., 1.],
     'three': [1., 2., 20., 1.],
     }
df = pd.DataFrame(d)
# List the top 5 values:
# ndf = df[['one','two','three']]
# top = ndf.values.flatten().tolist()
# top.sort(reverse=True)
# top[0:5]
# [20.0, 4.0, 3.0, 3.0, 2.0]
It looks like this:
In [58]: df
Out[58]:
  gene  one  three  two
0  foo    1      1    4
1  qux    2      2    3
2  bar    3     20    2
3  bin    1      1    1
What I want to do is pool all the values from the 2nd column onwards, take the top 5 scores, and identify the row (gene) and column that each of those scores came from.
The desired dictionary, keyed by gene with the matching column names as values, would look like this:
{'foo':['two'],
'qux':['one','two','three'],
'bar':['one','two','three']}
How can I achieve that?
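To make the goal concrete, here is a minimal sketch of one way it might be done. It assumes that any cell tied with the fifth-largest score should be included, which is what the desired dictionary above implies. The names `score_cols`, `scores`, `threshold` and `result` are made up for illustration.

```python
import pandas as pd

d = {'gene': ['foo', 'qux', 'bar', 'bin'],
     'one': [1., 2., 3., 1.],
     'two': [4., 3., 2., 1.],
     'three': [1., 2., 20., 1.]}
df = pd.DataFrame(d)

# Fix the column order explicitly so the output does not depend on
# how pandas orders the dictionary keys.
score_cols = ['one', 'two', 'three']
scores = df.set_index('gene')[score_cols]

# Pool every score, sort descending, and take the 5th-largest value
# as the cut-off (assumption: ties with the cut-off are kept).
flat = sorted(scores.values.flatten().tolist(), reverse=True)
threshold = flat[4]

# For each gene, collect the columns whose score reaches the cut-off;
# genes with no qualifying column are left out of the result.
result = {}
for gene, row in scores.iterrows():
    cols = [c for c in score_cols if row[c] >= threshold]
    if cols:
        result[gene] = cols

print(result)
```

For the sample data the cut-off is 2.0, so 'bin' (all scores 1.0) is dropped. If ties should instead be cut so that exactly five cells are returned, a different selection rule would be needed.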