I think Owe Jessen's comment is right and this is not the answer to the problem. So here is my shot with the help of data.table
.
First, let's use an example that is a little bit easier to comprehend:
library(data.table)
DT <- data.table(AccountID = rep(1:3, each=4),
V1 = sample(1:100, 12, replace=FALSE),
Month = rep(1:4, times=3))
AccountID V1 Month
[1,] 1 81 1
[2,] 1 23 2
[3,] 1 72 3
[4,] 1 36 4
[5,] 2 22 1
[6,] 2 13 2
[7,] 2 50 3
[8,] 2 40 4
[9,] 3 74 1
[10,] 3 83 2
[11,] 3 4 3
[12,] 3 3 4
So here we have 3 accounts and four months and for every account/month combination, we have a V1. So, finding the maximum V1 for each account, I do the following:
setkey(DT, AccountID)
DT <- DT[, list(maxV1=max(V1)), by="AccountID"][DT]
DT[maxV1==V1]
AccountID maxV1 V1 Month
[1,] 1 81 81 1
[2,] 2 50 50 3
[3,] 3 83 83 2
This is a little hard to understand, so let me try to explain it a little: I set AccountID as the key for DT. Now, I basically do two steps in DT[, list(maxV1=max(V1)), by="AccountID"][DT]
. First, I compute the maximum V1 value for each account (DT[, list(maxV1=max(V1)), by="AccountID"]
) and then, by calling [DT]
right after it, I add this new column maxV1
to the old DT
. Obviously, then I only have to get all the rows for which maxV1==V1
holds.
Applying this solution to Nico's more advanced example and showing you how to convert a data.frame
to a data.table
:
library(data.table)
DT <- as.data.table(m)
#Note that this line is only necessary if there are more than one rows per Month/AccountID combination
DT <- DT[, sum(V1), by="Month,AccountID"]
setkey(DT, AccountID)
DT <- DT[, list(maxV1=max(V1)), by="AccountID"][DT]
DT[maxV1==V1]
AccountID maxV1 Month V1
1 24660 1 24660
2 22643 2 22643
3 23642 3 23642
4 22766 5 22766
5 22445 12 22445
...
And this gives exactly 50 rows.
EDIT:
And here is a base-R solution:
df <- data.frame(AccountID = rep(1:3, each=4),
V1 = sample(1:100, 12, replace=FALSE),
Month = rep(1:4, times=3))
df$maxV1 <- ave(df$V1, df$AccountID, FUN = max)
df[df$maxV1==df$V1, ]
I took my inspiration from here.