This is probably a silly question, but I cannot figure out what I am doing wrong.
I have a dataframe with multiple individuals, each of which may have data recorded over multiple years. I am trying to create a second dataframe that summarizes the year each individual entered my dataset (and ideally the year they left, i.e. the first and last year I have data for them):
df1<-data.frame(ID=c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
year=c("2021","2021","2022","2023","2021","2021","2022","2023","2021","2021","2022","2023",
"2021","2021","2022","2023","2021","2021","2022","2023"),
x=c(2,4,5,9,9,7,5,3,2,4,5,9,9,7,5,3,6,8,3,4),
y=c(2,4,5,9,9,7,5,3,2,4,5,9,9,7,5,3,6,8,3,4))
I have tried to group_by(ID) and then summarize the minimum year the following way:
library(dplyr)
IDs<-df1 %>%
  group_by(ID) %>%
  summarise(strYear=min(year))
This ends up giving me only one row with the minimum year. I would like a row for each unique ID and then the minimum year corresponding to that ID.
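To make the target concrete, here is a minimal sketch of the result I am after, assuming dplyr (version 1.0.0 or later, for the `.groups` argument) is installed. The column names `startYear` and `endYear` are just illustrative, and I have left out `x` and `y` since they do not matter here. Calling the verbs as `dplyr::...` is an assumption on my part: it rules out another loaded package (plyr, for example) masking `summarise`, which is one known way to end up with a single row, though I have not confirmed that this is what happens in my session.

```r
library(dplyr)

# Same IDs and years as the example data above (x and y omitted);
# year is stored as character strings, as in the original.
df1 <- data.frame(
  ID = rep(1:5, each = 4),
  year = rep(c("2021", "2021", "2022", "2023"), times = 5)
)

IDs <- df1 %>%
  # Convert year to integer so min/max compare numbers, not strings
  dplyr::mutate(year = as.integer(year)) %>%
  dplyr::group_by(ID) %>%
  # Explicit dplyr:: prefix avoids a masked summarise() from another package
  dplyr::summarise(
    startYear = min(year),  # first year with data for this ID
    endYear = max(year),    # last year with data for this ID
    .groups = "drop"
  )

print(IDs)
```

With this data every ID should get its own row, with 2021 as the first year and 2023 as the last.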
Thanks in advance!