n'th largest value of rows in a data frame and related column name

Question

Suppose I have a data frame (df1):

df1:

v1    v2    v3    v4
--    --    --    --
4.1   1.2   12    1.4
14    18.4  15.1  6.9

I want to find the nth largest value of each row, along with the name of the column that holds it.

For example, say I want the second largest value of each row and its column name. The output (df2) should be:

df2:

value   col_name
---     --------
4.1     v1
15.1    v3

How can I do that using R? I would be very grateful for any help. Thanks a lot.

@JasonAizkalns, thanks for the warning. I edited the question. — oercim, Apr 12 '16 at 20:00
Would you mind sharing the data in a friendlier format? `dput(df1)` would be great. — Gregor Thomas, Apr 12 '16 at 20:04
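In the spirit of that request, here is a sketch of how the example data could be shared reproducibly. The values are copied from the table in the question; treating every column as plain numeric is an assumption.

```r
# Values copied from the table in the question
df1 <- data.frame(
  v1 = c(4.1, 14),
  v2 = c(1.2, 18.4),
  v3 = c(12, 15.1),
  v4 = c(1.4, 6.9)
)

# Print a copy-pasteable representation of the data frame
dput(df1)
```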

JasonAizkalns · Accepted Answer · 2016-04-12T20:17:32.430

This is rough, but gets the job done:

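# Second-largest value in each row
second_largest <- apply(df1, 1, FUN = function(x) tail(sort(x), 2)[1])
# which() reports matches in column-major order, so sort them by row
# (assumes the value occurs only once in its row)
hits <- which(df1 == second_largest, arr.ind = TRUE)
cols <- hits[order(hits[, "row"]), "col"]

df2 <- data.frame(value = second_largest,
                  col_name = colnames(df1)[cols])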

# df2
#   value col_name
# 1   4.1       v1
# 2  15.1       v3
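The same idea can be generalised to any n. The following is only a sketch: `nth_largest_by_row` is a hypothetical helper name, not something from base R or from this answer, and it swaps the value lookup for `order()`, which picks the column index directly and so sidesteps matching values back to columns.

```r
# Hypothetical helper: nth largest value per row plus the column it came from.
nth_largest_by_row <- function(df, n = 2L) {
  mat <- as.matrix(df)
  stopifnot(is.numeric(mat), n >= 1, n <= ncol(mat))
  # For each row, the column index holding the nth largest value.
  # order() returns a full permutation, so exactly one column is
  # picked per row even when values are tied.
  ind <- apply(mat, 1, function(x) order(x, decreasing = TRUE)[n])
  data.frame(
    value    = mat[cbind(seq_len(nrow(mat)), ind)],
    col_name = colnames(mat)[ind]
  )
}

# Example data from the question
df1 <- data.frame(v1 = c(4.1, 14), v2 = c(1.2, 18.4),
                  v3 = c(12, 15.1), v4 = c(1.4, 6.9))

nth_largest_by_row(df1, n = 2)
#   value col_name
# 1   4.1       v1
# 2  15.1       v3
```

Calling it with `n = 1` would give the row maxima instead.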

dplyr and tidyr alternative:

library(dplyr)
library(tidyr)

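df1 %>%
  mutate(row = row_number()) %>%   # remember the original row
  gather(col, val, -row) %>%       # reshape to long format: one row per cell
  group_by(row) %>%
  arrange(val) %>%                 # ascending, so the smaller of the top 2 comes first
  top_n(2, val) %>%                # keep the two largest values per row
  do(head(., 1))                   # the first of those is the second largest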

score 1 · Answer 2 · answered Apr 12 '16 at 20:15

A similar but slightly different approach. If your data is large, this might be somewhat faster; otherwise I doubt any real difference will be noticeable.

n = 2L
mat = as.matrix(df1)
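# ties.method = "first" guarantees exactly one match per row, even with tied values
ind = apply(mat, 1, FUN = function(x) which(rank(-x, ties.method = "first") == n))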
data.frame(value = mat[cbind(1:nrow(mat), ind)], col_name = colnames(mat)[ind])
#   value col_name
# 1   4.1       v1
# 2  15.1       v3
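Ties are worth a quick check with any rank-based lookup: by default `rank()` averages tied ranks, so `which(rank(-x) == n)` can come back empty. A minimal sketch on a made-up vector `x` (not from the question), showing how `ties.method = "first"` avoids that:

```r
x <- c(5, 5, 1)   # two values tied for largest

rank(-x)                          # default ties.method = "average"
# [1] 1.5 1.5 3.0
which(rank(-x) == 2)              # no element has rank exactly 2
# integer(0)

rank(-x, ties.method = "first")   # ties broken by position
# [1] 1 2 3
which(rank(-x, ties.method = "first") == 2)
# [1] 2
```

Whether the second of two tied maxima should count as the "second largest" is a judgment call that depends on the data.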
