I have an external list of names that I would like to keep:
list <- c("Google", "Yahoo", "Amazon")
For each name in that list, I want to keep only the row with its first (oldest) timestamp within each id, in a data frame like this:
dframe <- structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L), name = c("Google",
"Google", "Yahoo", "Amazon", "Amazon", "Google", "Amazon"), date = c("2008-11-01",
"2008-11-02", "2008-11-01", "2008-11-04", "2008-11-01", "2008-11-02",
"2008-11-03")), class = "data.frame", row.names = c(NA, -7L))
The expected output is this:
id name   date
1  Google 2008-11-01
1  Yahoo  2008-11-01
1  Amazon 2008-11-04
2  Amazon 2008-11-01
2  Google 2008-11-02
How can I do this?
My attempt below keeps only the first record for each id, rather than the first record of each listed name within each id:
library(data.table)
setDT(dframe)
date_list_first = dframe[order(date)][!duplicated(id)]
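For comparison, here is a minimal sketch of one possible approach, assuming the goal is the earliest row per (id, name) pair rather than per id. The names `keep_names` and `first_seen` are made up for illustration (`keep_names` stands in for `list` above, to avoid masking the base function), and the sketch relies on the dates being ISO-formatted strings, so sorting them as text matches chronological order.

```r
library(data.table)

# Hypothetical name for the external list of names to keep
keep_names <- c("Google", "Yahoo", "Amazon")

# Same data as in the question, built directly as a data.table
dframe <- data.table(
  id   = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  name = c("Google", "Google", "Yahoo", "Amazon", "Amazon", "Google", "Amazon"),
  date = c("2008-11-01", "2008-11-02", "2008-11-01", "2008-11-04",
           "2008-11-01", "2008-11-02", "2008-11-03")
)

# 1. Keep only rows whose name is in the external list.
# 2. Sort by id, then by date (ISO strings sort chronologically).
# 3. Keep the first row of every (id, name) combination.
first_seen <- unique(
  dframe[name %in% keep_names][order(id, date)],
  by = c("id", "name")
)

print(first_seen)
```

The only change from the attempt above is that duplicates are judged on `id` and `name` together instead of on `id` alone. If `date` were not in a sortable format, it would probably need converting with `as.Date()` first.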