I have searched high and low for a solution to this but cannot find one.
My data frame (essentially a table of the no. 1 sports team by date) has numerous occasions where one or more teams "reappear" in the data. I want to pull out the start (or end) date of each separate period at no. 1 for each team.
An example of the data could be:
library(lubridate)  # provides year()
library(plyr)       # provides ddply()
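# teams1 is not defined in the original post; the value below is an assumption
# chosen only to be consistent with the 2014 output shown further down
teams1 <- c(rep("c", 8), rep("k", 4), rep("c", 3))
x1 <- as.Date("2013-12-31")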
adddate1 <- 1:length(teams1)
dates1 <- x1 + adddate1
teams2 <- c(rep("w", 3), rep("c", 8), rep("w", 4))
x2 <- as.Date("2012-12-31")
adddate2 <- 1:length(teams2)
dates2 <- x2 + adddate2
dates <- c(dates2, dates1)
teams <- c(teams2, teams1)
df <- data.frame(dates, teams)
df$year <- year(df$dates)
which for 2013 looks like:
dates teams year
1 2013-01-01 w 2013
2 2013-01-02 w 2013
3 2013-01-03 w 2013
4 2013-01-04 c 2013
5 2013-01-05 c 2013
6 2013-01-06 c 2013
7 2013-01-07 c 2013
8 2013-01-08 c 2013
9 2013-01-09 c 2013
10 2013-01-10 c 2013
11 2013-01-11 c 2013
12 2013-01-12 w 2013
13 2013-01-13 w 2013
14 2013-01-14 w 2013
15 2013-01-15 w 2013
However, ddply groups identically named teams together within each year, so only each team's first appearance per year is returned:
split <- ddply(df, .(year, teams), head, 1)
split <- split[order(split$dates), ]
dates teams year
2 2013-01-01 w 2013
1 2013-01-04 c 2013
3 2014-01-01 c 2014
4 2014-01-09 k 2014
Is there a more elegant way to do this than writing a function that walks through the original df, assigns a unique value to each consecutive run of the same team, adds that value to the df, and then passes it to ddply as an extra grouping variable to return what I want?
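For reference, the run-labelling idea described above could be sketched in base R with rle(), which finds runs of consecutive identical values. This is only a rough sketch under some assumptions: the data are already sorted by date, the teams column is character rather than factor, and the names run_id and periods are made up for illustration. It uses just the 2013 part of the example so that it runs on its own.

```r
# Stand-in for the 2013 rows of df (same values as teams2 / dates2 above)
teams <- c(rep("w", 3), rep("c", 8), rep("w", 4))
dates <- as.Date("2012-12-31") + seq_along(teams)
df <- data.frame(dates, teams, stringsAsFactors = FALSE)

# rle() returns the length and value of each run of consecutive identical teams
runs <- rle(df$teams)

# Label every row with the index of the run it belongs to
# (run_id is a hypothetical column name; it could serve as a grouping variable)
df$run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# Row positions of the last and first row of each run
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

# One row per period at no. 1, with its start and end date
periods <- data.frame(
  team  = runs$values,
  start = df$dates[starts],
  end   = df$dates[ends],
  stringsAsFactors = FALSE
)
periods
```

On this data, periods has three rows (w, c, w), so the second "w" period starting 2013-01-12 is kept separate from the first. The run_id column could also be handed to ddply alongside teams if a plyr-based workflow is preferred.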