Problem
I have a data frame where each row marks an exchange between companies, where companies give and receive something on a given date (they can give to a different company or to themselves). From that, I'd like to create a new data frame with columns indicating when a company first started giving, when it first stopped giving, when it first started receiving, and when it first stopped receiving. Here is a sample data frame of what I'm starting with:
Sample Starting Data
samp <- structure(list(giver = structure(c(1L, 2L, 6L, 3L, 1L, 3L, 4L, 1L, 6L, 1L, 5L), .Label = c("A", "B", "C", "X", "Y", "Z"), class = "factor"), receiver = structure(c(1L, 2L, 2L, 3L, 1L, 3L, 3L, 1L, 2L, 1L, 2L), .Label = c("A", "B", "C"), class = "factor"), date = structure(c(1L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 9L), .Label = c("2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07", "2000-01-08", "2000-01-09"), class = "factor")), .Names = c("giver", "receiver", "date"), class = "data.frame", row.names = c(NA, -11L))
samp$date <- as.Date(samp$date, "%Y-%m-%d") # Format date variable
samp
giver receiver date
A A 2000-01-01
B B 2000-01-01
Z B 2000-01-02
C C 2000-01-03
A A 2000-01-04
C C 2000-01-05
X C 2000-01-06
A A 2000-01-07
Z B 2000-01-08
A A 2000-01-09
Y B 2000-01-09
However, I am having trouble figuring out how to scan one column for the first and last occurrences of each company and return the date value of a different column. I found similar questions here and here using match
, duplicated
, or tapply
but can't quite get them to fit with what I am trying to do. Here is a sample data frame of what I'm hoping to end up with:
Desired Ending Data
desire <- structure(list(company = structure(1:6, .Label = c("A", "B", "C", "X", "Y", "Z"), class = "factor"), start.giving = structure(c(1L, 1L, 3L, 4L, 5L, 2L), .Label = c("2000-01-01", "2000-01-02", "2000-01-03", "2000-01-05", "2000-01-09"), class = "factor"), stop.giving = structure(c(5L, 1L, 2L, 3L, 5L, 4L), .Label = c("2000-01-01", "2000-01-05", "2000-01-06", "2000-01-08", "2000-01-09"), class = "factor"), start.receiving = structure(c(1L, 1L, 2L, NA, NA, NA), .Label = c("2000-01-01", "2000-01-03"), class = "factor"), stop.receiving = structure(c(2L, 2L, 1L, NA, NA, NA), .Label = c("2000-01-06", "2000-01-09"), class = "factor")), .Names = c("company", "start.giving", "stop.giving", "start.receiving", "stop.receiving"), class = "data.frame", row.names = c(NA, -6L))
desire
company start.giving stop.giving start.receiving stop.receiving
A 2000-01-01 2000-01-09 2000-01-01 2000-01-09
B 2000-01-01 2000-01-01 2000-01-01 2000-01-09
C 2000-01-03 2000-01-05 2000-01-03 2000-01-06
X 2000-01-05 2000-01-06 <NA> <NA>
Y 2000-01-09 2000-01-09 <NA> <NA>
Z 2000-01-02 2000-01-08 <NA> <NA>