I have a data frame like the following:

df:

 Name  S1  S2  S3  S4  Symbol  length
 ENG2  22  34 356 1321  TSPN    820
 ENG2  22  34 356 1321  TSPN    2206
 ENG2  22  34 356 1321  TSPN    3796
 ENG2  22  34 356 1321  TSPN    1025
 ENG5  12  54 876 3421  TNMD    542
 ENG5  12  54 876 3421  TNMD    1339
 ENG6   7  94 456 8261  DPM1    1097
 ENG6   7  94 456 8261  DPM1    1073
 ENG6   7  94 456 8261  DPM1    1161
 ENG6   7  94 456 8261  DPM1    672

In the table above, each name appears several times with different lengths. For each name, I want to keep only the row with the highest length. The result should look like the following:

Result:

 ENG2  22  34 356 1321  TSPN    3796
 ENG5  12  54 876 3421  TNMD    1339
 ENG6   7  94 456 8261  DPM1    1161

Can anyone tell me how to do this? Thank you!

beginner
    Here, you can use the dot notation in a formula, to make it a bit shorter: `aggregate(length ~ ., data=df, max)`. A longer equivalent would be to list variables on the right hand side: `aggregate(length ~ Name + S1 + S2 + S3 + S4 + Symbol, data=df, max)` – lmo Aug 11 '17 at 15:52
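The suggestion in the comment above can be tried on the example data with a minimal sketch like the one below. It assumes the language is R and rebuilds `df` by hand from the table in the question; the name `result` is only illustrative. Note that `aggregate(length ~ ., ...)` groups by every other column, so it gives one row per name here only because `S1` to `S4` and `Symbol` are constant within each `Name`.

```r
# Rebuild the example data from the question (values copied from the table).
reps <- c(4, 2, 4)  # number of rows for ENG2, ENG5, ENG6
df <- data.frame(
  Name   = rep(c("ENG2", "ENG5", "ENG6"), times = reps),
  S1     = rep(c(22, 12, 7), times = reps),
  S2     = rep(c(34, 54, 94), times = reps),
  S3     = rep(c(356, 876, 456), times = reps),
  S4     = rep(c(1321, 3421, 8261), times = reps),
  Symbol = rep(c("TSPN", "TNMD", "DPM1"), times = reps),
  length = c(820, 2206, 3796, 1025, 542, 1339, 1097, 1073, 1161, 672)
)

# Group by every column except `length` and keep the largest length per group.
result <- aggregate(length ~ ., data = df, FUN = max)

# The row order of the output follows the grouping variables,
# so sort by Name explicitly before displaying.
result <- result[order(result$Name), ]
print(result)
```

If the other columns could differ within a name, grouping on `Name` alone (for example `aggregate(length ~ Name, data = df, max)`) would return only `Name` and `length`, and the remaining columns would have to be merged back in separately.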