To answer your question: you can achieve this with lapply. For instance, consider the following.
Create some sample data:
df <- data.frame(Day = rep(seq.Date(from = as.Date('2010-01-01'), to = as.Date('2010-01-30'), by =1), 5))
df$somevar <- rnorm(nrow(df))
head(df)
Day somevar
1 2010-01-01 -0.946059466
2 2010-01-02 0.005897001
3 2010-01-03 -0.297566286
4 2010-01-04 -0.637562495
5 2010-01-05 -0.549800912
6 2010-01-06 0.287709994
Now, observe that unique gives you a vector with all unique dates:
unique(df$Day)
[1] "2010-01-01" "2010-01-02" "2010-01-03" "2010-01-04" "2010-01-05" "2010-01-06" "2010-01-07" "2010-01-08" "2010-01-09" "2010-01-10"
[11] "2010-01-11" "2010-01-12" "2010-01-13" "2010-01-14" "2010-01-15" "2010-01-16" "2010-01-17" "2010-01-18" "2010-01-19" "2010-01-20"
[21] "2010-01-21" "2010-01-22" "2010-01-23" "2010-01-24" "2010-01-25" "2010-01-26" "2010-01-27" "2010-01-28" "2010-01-29" "2010-01-30"
This you can pass to lapply to be used for subsetting:
lapply(unique(df$Day), function(x) df[df[,"Day"]==x,])
[[1]]
Day somevar
1 2010-01-01 -0.9460595
31 2010-01-01 -0.3434005
61 2010-01-01 -1.5463641
91 2010-01-01 -0.5192375
121 2010-01-01 -1.1780619
[[2]]
Day somevar
2 2010-01-02 0.005897001
32 2010-01-02 -1.346336688
62 2010-01-02 -0.321702391
92 2010-01-02 -0.384277955
122 2010-01-02 0.058906305
... (output omitted)
where the output of lapply is a list with one data frame per date.
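As a side note, base R's split produces the same kind of list in one call and names each element after its date. The following is a minimal sketch on data built like the sample above; the name by_day and the seed are only illustrative choices.

```r
# Rebuild sample data like the one above (the seed is an arbitrary choice)
set.seed(1)
df <- data.frame(Day = rep(seq.Date(from = as.Date("2010-01-01"),
                                    to = as.Date("2010-01-30"),
                                    by = "day"), 5))
df$somevar <- rnorm(nrow(df))

# split() returns a list with one data frame per distinct Day,
# and the list elements are named after the dates
by_day <- split(df, df$Day)

length(by_day)        # number of distinct dates
names(by_day)[1:3]    # names of the first three elements
by_day[["2010-01-01"]]
```

Because the elements are named, a single date can be looked up directly with by_day[["2010-01-01"]] instead of by position.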
You would normally assign the result to a name to keep all data frames in one list, as in mylist <- lapply(...). However, if you want to have them in your global environment, you can first give each data frame a name, for instance using setNames as in mylist <- setNames(mylist, paste0("df", format(unique(df$Day), format = "%Y%m%d"))), and then use list2env(mylist, envir = .GlobalEnv) to push each list element into the global environment. Note that setNames returns a new list, so its result has to be assigned, and that list2env without an envir argument creates a fresh environment instead of writing to the global one.
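Put together, the naming and exporting steps could look like the sketch below. It writes into a separate environment called target, which is a hypothetical name chosen here to keep the workspace clean; passing envir = .GlobalEnv instead would put the objects into the global environment as described above.

```r
# Sample data as above (the seed is an arbitrary choice)
set.seed(1)
df <- data.frame(Day = rep(seq.Date(from = as.Date("2010-01-01"),
                                    to = as.Date("2010-01-30"),
                                    by = "day"), 5))
df$somevar <- rnorm(nrow(df))

# One data frame per date, collected in a list
days <- unique(df$Day)
mylist <- lapply(days, function(x) df[df$Day == x, ])

# setNames() returns a renamed copy, so the result must be assigned
mylist <- setNames(mylist, paste0("df", format(days, format = "%Y%m%d")))

# 'target' is an illustrative environment; use .GlobalEnv to export globally
target <- new.env()
list2env(mylist, envir = target)

head(ls(target))
get("df20100101", envir = target)
```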
However, as mentioned in the comments, this is probably not a good idea. If you want to do something for each date, consider a group-by solution with dplyr. For instance, imagine you want the mean by date:
library(dplyr)
df %>% group_by(Day) %>% summarize(mean_var = mean(somevar))
# A tibble: 30 x 2
Day mean_var
<date> <dbl>
1 2010-01-01 -0.907
2 2010-01-02 -0.398
3 2010-01-03 0.213
4 2010-01-04 -0.142
5 2010-01-05 -0.377
6 2010-01-06 0.404
7 2010-01-07 -0.634
8 2010-01-08 1.00
9 2010-01-09 0.378
10 2010-01-10 -0.0863
# ... with 20 more rows
where each row holds the mean for one date. This pattern is called split-apply-combine and is worth looking up; it comes up again and again.
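To make the three steps visible, here is a rough base R sketch that performs them one at a time on data like the sample above. The names pieces, means and result are illustrative only, and this is not what dplyr does internally.

```r
# Sample data as above (the seed is an arbitrary choice)
set.seed(1)
df <- data.frame(Day = rep(seq.Date(from = as.Date("2010-01-01"),
                                    to = as.Date("2010-01-30"),
                                    by = "day"), 5))
df$somevar <- rnorm(nrow(df))

# Split: one numeric vector of somevar values per date
pieces <- split(df$somevar, df$Day)

# Apply: compute the mean of each piece
means <- vapply(pieces, mean, numeric(1))

# Combine: put the per-date results back into a single data frame
result <- data.frame(Day = as.Date(names(means)),
                     mean_var = unname(means))
head(result)
```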
Just for reference, in base R you could achieve this using e.g. by, as in
by(df$somevar, df$Day, FUN = mean)
though dplyr or data.table is probably more user-friendly.
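For comparison, the same group-wise mean might be written with data.table roughly as follows. This sketch assumes the data.table package is installed; the names dt and means_dt are illustrative.

```r
library(data.table)

# Sample data as above (the seed is an arbitrary choice)
set.seed(1)
df <- data.frame(Day = rep(seq.Date(from = as.Date("2010-01-01"),
                                    to = as.Date("2010-01-30"),
                                    by = "day"), 5))
df$somevar <- rnorm(nrow(df))

# Convert to a data.table, then compute the mean of somevar within each Day
dt <- as.data.table(df)
means_dt <- dt[, .(mean_var = mean(somevar)), by = Day]
head(means_dt)
```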