I was wondering whether it is possible to get, for each group, the row with the earliest year.

library(data.table)
dt <- data.table(Group = c(rep("A", 4), rep("B", 3), rep("C", 3)), 
                 A = c(1:10), 
                 B = c(10:1), 
                 Year = c(2003:2006, 2004:2006, 2007, 2008, 2009))

The data looks as follows:

    Group  A  B Year
 1:     A  1 10 2003
 2:     A  2  9 2004
 3:     A  3  8 2005
 4:     A  4  7 2006
 5:     B  5  6 2004
 6:     B  6  5 2005
 7:     B  7  4 2006
 8:     C  8  3 2007
 9:     C  9  2 2008
10:     C 10  1 2009

What I would like to get is the row with the earliest year per group, but I can't seem to get it right:

dt[min(Year) == Year, by = Group]

How should I do this selection?

Snowflake

2 Answers

Try:

dt[, .SD[which.min(Year)], by = Group]
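
For context, the attempt in the question filters in `i`, which is evaluated on the whole table before any grouping, so `min(Year)` there is the overall minimum rather than a per-group one. The sketch below rebuilds the question's `dt` and shows the `.SD` approach next to two variants; the names `first_rows`, `idx`, `first_rows_idx` and `all_min_rows` are only illustrative, and the `.I` form is a commonly used alternative rather than something taken from this answer.

```r
library(data.table)

# Same example table as in the question.
dt <- data.table(Group = c(rep("A", 4), rep("B", 3), rep("C", 3)),
                 A = 1:10,
                 B = 10:1,
                 Year = c(2003:2006, 2004:2006, 2007, 2008, 2009))

# .SD is the sub-table for the current group; which.min(Year) returns the
# position of the first minimum, so exactly one row is kept per group.
first_rows <- dt[, .SD[which.min(Year)], by = Group]

# Alternative: collect the matching row numbers with .I, then subset once.
idx <- dt[, .I[which.min(Year)], by = Group]$V1
first_rows_idx <- dt[idx]

# Variant that keeps ties, i.e. every row sharing the group's minimum year.
all_min_rows <- dt[, .SD[Year == min(Year)], by = Group]
```

With the example data each group has a unique earliest year, so all three results contain the same three rows (Year 2003 for A, 2004 for B, 2007 for C).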
arg0naut91

You can also use dplyr directly. Note that this returns only the earliest year per group, not the full row:

library(dplyr)
dt_min <- dt %>% group_by(Group) %>% summarise(new_year = min(Year))
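
If the whole row is wanted rather than just the year, one possible dplyr variant is `slice_min()`. This is a minimal sketch, assuming dplyr 1.0.0 or later; the data frame `df` and the result name `first_rows` are illustrative and not part of the original answer.

```r
library(dplyr)

# Same example data as in the question, as a plain data frame.
df <- data.frame(Group = c(rep("A", 4), rep("B", 3), rep("C", 3)),
                 A = 1:10,
                 B = 10:1,
                 Year = c(2003:2006, 2004:2006, 2007, 2008, 2009))

# Keep, within each group, the single row with the smallest Year.
# with_ties = FALSE guarantees one row per group even if years repeat.
first_rows <- df %>%
  group_by(Group) %>%
  slice_min(Year, n = 1, with_ties = FALSE) %>%
  ungroup()
```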
adjustedR2