2

I am new to python. I want to find the top 2 largest values from all the columns for repetitive row elements (i.e. 5 to 4100), and also show its respective row and column index label in output.The largest value should be absolute. (Irrespective of + or - sign) My data structure is like image shown:

My Dataset

df = pd.DataFrame({'E_at_0': [43, -53, 45, -17, 45, 19, 11, 32, 36, 32], 
                   'E_at_10': [-47, 47, 46, -18, 53, 16, 12, 34, -71, -34], 
                   'E_at_20': [56, 43, 41, 29, 36, 14, 13, -37, 43, 38], 
                   'E_at_30': [-46, 16, -40, 31, 42, 15, 63, -39, 52, 39]}, index=[5, 10, 12, 101, 4100, 5, 10, 12, 101, 4100])

df.index.name='Ele_Num'

Ele_num column is my index column.

The row indexs will keep on repeating for 'n'number of times. For every 'nth' data set, I want 2 largest values and their index positions.

code is using pandas

df = pd.read_csv ('trial.csv')

df = df.set_index('Ele_Num')

s = df.abs().stack()

mask = s == s.max()

df1 = df.stack()[mask].reset_index()

df1.columns = ['Element No','Column','Values']

print (df1)

df1.to_csv('trial_output.csv', encoding='utf-8', index=True)

Expected Results:

For 1st set of rows (i.e. for index 5,10,12,101,4100)

                    Ele_Num   E_at_20
                      5          56

For 2nd set of rows (i.e.again for index 5,10,12,101,4100)

                    Ele_Num   E_at_10
                      101       -71

The row indexs will keep on repeating for 'n'number of times. For every 'nth' data set, I want 2 largest values and their index positions.

Actual Results:

What I am getting

  Ele_Num   E_at_30
    101       -71
axay
  • 437
  • 5
  • 19

1 Answers1

1

Use argsort in descending order for positions with absolute values and use it for select stacked values without abs:

N = 4
s = df.abs().stack()
df1 = df.stack().iloc[np.argsort(-s)].head(N).reset_index()
df1.columns = ['Element No','Column','Values']
print (df1)
   Element No   Column  Values
0         101  E_at_10     -71
1          10  E_at_30      63
2           5  E_at_20      56
3          10   E_at_0     -53
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • The problem with code is that it is showing 4 largest values from the whole array.If all 4 largest values belong to the 2nd group of 'Ele_num' row indexes, output does not show 1st group values. I want separate values for each group of Row Index values. – axay May 14 '19 at 11:35
  • 1
    If 2 values are not possible, Could you only provide code to show the largest value fro each group of repetitive row indexes. – axay May 14 '19 at 11:38
  • Not easy, but I try it. – jezrael May 14 '19 at 11:39
  • Thankyou. Instead of top 2, Please only provide the largest value for each group of row indexes. Eg. The Row indexes(5,10,12,101,4100) repeat '15' times in my original dataset. So I want '15' largest values(1 of each group). This **(5,10,12,101,4100)** repeats 15 times in same sequence. – axay May 14 '19 at 11:47
  • @AkshayK. - Sorry, now I am confused. Can you change data for show what you need? – jezrael May 14 '19 at 11:49
  • I have shown only 2 groups of my data set, for which I want 1 max. value for each group of row (5,10,12,101,4100). If you could help me with this, I could apply the logic for my remaining group of rows, and find the max from each. – axay May 14 '19 at 11:55
  • Please Check my Expected Result. – axay May 14 '19 at 12:03