I have a data frame of vegetation metrics collected over multiple years at several units, with multiple sampling stations within each unit. For each unit, I want to select all the vegetation data from the most recent year in which data were collected for that unit. Here is an example of my data frame:
veg <- c("tree","grass","tree","grass","tree","grass","tree","grass")
cover <- c(0.97,0.21,0.35,0.67,0.45,0.72,0.27,0.67)
unit <- c("U1","U1","U1","U1","U2","U2","U2","U2")
station <- c("A1","A1","A2","A2","A3","A3","A4","A4")
year <- c(2015,2015,2014,2014,2013,2013,2014,2014)
df <- data.frame(veg,cover,unit,station,year)
The data frame looks like this:
    veg cover unit station year
1  tree  0.97   U1      A1 2015
2 grass  0.21   U1      A1 2015
3  tree  0.35   U1      A2 2014
4 grass  0.67   U1      A2 2014
5  tree  0.45   U2      A3 2013
6 grass  0.72   U2      A3 2013
7  tree  0.27   U2      A4 2014
8 grass  0.67   U2      A4 2014
I want it to look like this:
    veg cover unit station year
1  tree  0.97   U1      A1 2015
2 grass  0.21   U1      A1 2015
3  tree  0.27   U2      A4 2014
4 grass  0.67   U2      A4 2014
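To make the selection rule concrete, here is a minimal base R sketch of what I am after. It assumes "most recent year" means the maximum value of year within each unit; the names latest and recent are just illustrative, and there may well be a cleaner way to do this.

```r
# Rebuild the example data from above.
df <- data.frame(
  veg     = rep(c("tree", "grass"), times = 4),
  cover   = c(0.97, 0.21, 0.35, 0.67, 0.45, 0.72, 0.27, 0.67),
  unit    = rep(c("U1", "U2"), each = 4),
  station = rep(c("A1", "A2", "A3", "A4"), each = 2),
  year    = c(2015, 2015, 2014, 2014, 2013, 2013, 2014, 2014)
)

# For every row, ave() gives the maximum year found within that row's unit
# (assumption: "most recent" is defined per unit, not per station).
latest <- ave(df$year, df$unit, FUN = max)

# Keep only the rows recorded in their unit's most recent year.
recent <- df[df$year == latest, ]
rownames(recent) <- NULL  # renumber the rows from 1
recent
```

If two stations in the same unit were both sampled in that unit's latest year, this would keep the rows from both stations, which is the behavior I want.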
Any help would be much appreciated.