I am trying to write a loop that subsets a data frame by year and stores each subset in its own data frame.
df1<-data.frame(ID=c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
year=c("2021","2021","2022","2023","2021","2021","2022","2023","2021","2021","2022","2023",
"2021","2021","2022","2023","2021","2021","2022","2023"),
x=c(2,4,5,9,9,7,5,3,2,4,5,9,9,7,5,3,6,8,3,4))
I know I can do this easily with the subset() function. However, I will eventually be working with a large dataset spanning 30+ years (let's say 1990-2020), so I assume there is an easier way than writing a separate subset() call for each year. Basically, I would like the result to be similar to:
d2021<-subset(df1,year=="2021")
d2022<-subset(df1,year=="2022")
d2023<-subset(df1,year=="2023")
but using a loop, so that I do not have to type the above out for each of the 30 years in my actual dataset. I have tried the following, based on something I found online, but it is not working:
for(i in unique(df1$year)){
if(any(variable.names(df1)==i)){
assign(i,df1[,c(i)])
}
}
which does not create any new data frames. After it runs, i simply holds the last year:
> i
[1] "2023"
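As far as I can tell, the attempt fails because variable.names(df1) returns the column names ("ID", "year", "x"), and none of them ever equals a year value such as "2021", so the if() condition is never TRUE and assign() is never called. A minimal sketch of a loop that filters on the values of the year column instead is below. The loop variable yr and the "d" name prefix are my own choices for illustration, picked to match the d2021/d2022/d2023 names above.

```r
# Same toy data as in the question
df1 <- data.frame(
  ID   = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
  year = c("2021","2021","2022","2023","2021","2021","2022","2023",
           "2021","2021","2022","2023","2021","2021","2022","2023",
           "2021","2021","2022","2023"),
  x    = c(2,4,5,9,9,7,5,3,2,4,5,9,9,7,5,3,6,8,3,4)
)

for (yr in unique(df1$year)) {
  # Keep the rows whose year matches, and store them under a
  # constructed name such as "d2021" (the prefix is an assumption)
  assign(paste0("d", yr), df1[df1$year == yr, ])
}

# Each year now has its own data frame in the current environment
nrow(d2021)
nrow(d2022)
```

This creates one object per year, so with 30 years the workspace fills up with 30 separately named data frames that later code has to look up by name.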
I need to subset and store the data for each year because I will be doing further analysis (MCPs and RSF functions) in which the data have to be split by year, and I will need to refer to each year's data frame to run different types of analyses on different years.
I also know the split() function will split my data by year. However, this returns a single list containing one data frame per year, rather than separate data frame objects.
x<-split(df1,df1$year)
str(x)
List of 3
$ 2021:'data.frame': 10 obs. of 3 variables:
..$ ID : num [1:10] 1 1 2 2 3 3 4 4 5 5
..$ year: chr [1:10] "2021" "2021" "2021" "2021" ...
..$ x : num [1:10] 2 4 9 7 2 4 9 7 6 8
$ 2022:'data.frame': 5 obs. of 3 variables:
..$ ID : num [1:5] 1 2 3 4 5
..$ year: chr [1:5] "2022" "2022" "2022" "2022" ...
..$ x : num [1:5] 5 5 5 5 3
$ 2023:'data.frame': 5 obs. of 3 variables:
..$ ID : num [1:5] 1 2 3 4 5
..$ year: chr [1:5] "2023" "2023" "2023" "2023" ...
..$ x : num [1:5] 9 3 9 3 4
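Note that the str() output above shows each element of the list is itself a 'data.frame', so the per-year data can be pulled out by name and used like any other data frame. A minimal sketch, assuming the list is kept as it is; the names by_year and totals, and the sum of x per year, are only placeholders for the real per-year analysis.

```r
# Same toy data as in the question
df1 <- data.frame(
  ID   = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
  year = c("2021","2021","2022","2023","2021","2021","2022","2023",
           "2021","2021","2022","2023","2021","2021","2022","2023",
           "2021","2021","2022","2023"),
  x    = c(2,4,5,9,9,7,5,3,2,4,5,9,9,7,5,3,6,8,3,4)
)

by_year <- split(df1, df1$year)

# Extract one year's data frame by name
d2021 <- by_year[["2021"]]
is.data.frame(d2021)

# Run the same step on every year; sum(x) stands in for a real analysis
totals <- sapply(by_year, function(d) sum(d$x))
totals
```

Different analyses for different years can then be run by indexing the list, for example by_year[["2022"]], without creating 30 separate objects.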