In the example below, the indices returned by the order function
are used to sort the entries within each group created by by():
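set.seed(123)
ex.df <- data.frame(
  group = sample(LETTERS[1:4], 20, replace = TRUE),
  score1 = sample(1:10),  # 10 values, recycled to fill the 20 rows
  score2 = sample(1:10)
)
# For each group, order() ranks that group's rows by combined score
sortedOrderings <- by(ex.df, ex.df$group, function(df) order(df$score1 + df$score2))
# Keep the first position from each ordering; sapply() returns a plain
# integer vector, which can be used as a row index
bestIndices <- sapply(sortedOrderings, FUN = function(lst) lst[1])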
The trouble is that order() sees only the subset of the data frame that by() passes to it, not ex.df itself. The indices it returns are positions within each group's subset, so using them to extract rows from ex.df pulls out the wrong rows:
> print(sortedOrderings)
ex.df$group: A
[1] 2 3 4 1
---------------------------------------------------------------
ex.df$group: B
[1] 5 3 2 4 1
---------------------------------------------------------------
ex.df$group: C
[1] 2 1 3 4
---------------------------------------------------------------
ex.df$group: D
[1] 3 7 4 6 1 2 5
> print(ex.df[bestIndices,])
    group score1 score2
2       D      7      9
5       D      4      1
2.1     D      7      9
3       B      6      6
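To make the mismatch concrete, here is a minimal sketch on a much smaller data frame. The names toy.df, sub.df, local.idx and wrong.row are made up for illustration and are not part of the example above. It only demonstrates the problem: order() answers with positions inside the subset it was given, and those positions point at unrelated rows of the full data frame.

```r
# Hypothetical toy data, not the ex.df from above
toy.df <- data.frame(
  group = c("A", "B", "A", "B"),
  score = c(5, 1, 2, 3)
)

# Group A occupies rows 1 and 3 of toy.df
sub.df <- toy.df[toy.df$group == "A", ]

# order() sees only the two-row subset, so it answers with subset positions
local.idx <- order(sub.df$score)
print(local.idx)

# Using the first subset position on the full frame picks row 2,
# which belongs to group B rather than group A
wrong.row <- toy.df[local.idx[1], ]
print(wrong.row)
```

Here the lowest-scoring group A entry is really row 3 of toy.df, but order() reports it as position 2 because it is the second row of the subset.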
Is there a way to pull out the "best" row from each group in ex.df, or at least have the indices reference ex.df?