Here is the code to use for this question:
set.seed(1337)
myDT <- data.table(Key1 = sample(letters, 500, replace = TRUE),
Key2 = sample(LETTERS[1:5], 500, TRUE),
Data = sample(1:26, 500, replace = TRUE))
setkey(myDT, Key1, Key2)
# showing what myDT looks like
> myDT
Key1 Key2 Data
1: a A 6
2: a A 3
3: a B 2
4: a B 20
5: a B 13
---
496: z D 23
497: z E 3
498: z E 18
499: z E 11
500: z E 2
I would like to pair down myDT
to take only the largest Data values for each Key1, Key2 pair. E.g. (using (Key1,Key2) to denote a pair) for (a,A) I would like to get rid of the row where Data is 3 and keep the row where Data is 6. For (z,E) I would like to keep only the row where Data is 18.
While typing out this question, a solution came to me (which I'll post below) but please help me know how you would approach this problem.