I have a data frame with 5 columns. The function below creates 5 small datasets, one per column, each featuring the first two columns of my dataset ("country" and "year") plus the column it was created for.
library(dplyr)
# My data (sample)
country <- c("GR", "GR", "GR", "AL", "AL", "AL", "GE", "GE", "GE")
year <- c(1990, 1991, 1992, 1994, 1997, 1996, 1991, 1992, 1993)
pop <- c("i", "i", "j", "j", "j", "i", "i", "i", "i")
category <- c("1", "2", "2", "2", "2", "2", "1", "1", "2")
age <- c(14, 13, 12, 18, 19, 17, 20, 21, 19)
sample_data <- data.frame(country, year, pop, category, age)
rm(country, year, pop, category, age)
# My function
new.datasets <- function(df, na.rm = TRUE, ...){
  i = 1
  for (c in df){
    new_df <- select(df, country, year, i)
    assign(paste("df_new_", i), new_df, envir = globalenv())
    i = i + 1
  }
}
new.datasets(sample_data)
With my current function, the first two datasets produced contain only two columns: "country" and "year". The next three contain "country", "year" and one of the remaining columns ("pop", "category" or "age").
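A likely explanation, sketched here rather than confirmed: the counter `i` reaches `select()` as a column position, so positions 1 and 2 point back at "country" and "year", which are already selected by name. The sketch assumes that dplyr keeps a column only once when it is selected twice. The data frame `toy` is a made-up stand-in for my data.

```r
library(dplyr)

# Hypothetical toy data standing in for sample_data
toy <- data.frame(country = c("GR", "AL"),
                  year    = c(1990, 1994),
                  pop     = c("i", "j"))

# Position 1 is "country", which is already selected by name,
# so no third column is added
cols_dup <- names(select(toy, country, year, 1))
print(cols_dup)   # expected: "country" "year"

# Position 3 is "pop", a genuinely new column
cols_new <- names(select(toy, country, year, 3))
print(cols_new)   # expected: "country" "year" "pop"
```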
I would like to modify my function so that it DOES NOT produce the first two datasets, which only contain "country" and "year". Rather than creating these first two and then removing them, I'd like them to never be produced at all, if possible. Can you help me out?
(Unfortunately, I can't take any easy workarounds like using rm() to remove these datasets afterward, because this is a very simplified version of my actual problem/code, which requires that these datasets never be created in the first place.)
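For illustration, one possible shape of the behavior described above is to loop only over the columns that are not "country" or "year", so the two-column datasets are never built. This is a minimal sketch, not a definitive fix: the function name `new_datasets_sketch`, its `keys` and `envir` arguments, and the `demo_data` frame are all hypothetical, and the datasets are named after their column rather than numbered.

```r
library(dplyr)

# Hypothetical stand-in for sample_data
demo_data <- data.frame(country = c("GR", "AL", "GE"),
                        year    = c(1990, 1994, 1991),
                        pop     = c("i", "j", "i"),
                        age     = c(14, 18, 20))

# Hypothetical variant of new.datasets(); `keys` and `envir` are
# illustrative assumptions, not part of the original function
new_datasets_sketch <- function(df, keys = c("country", "year"),
                                envir = globalenv()) {
  # Loop only over the non-key columns, so no key-only dataset is created
  other_cols <- setdiff(names(df), keys)
  for (col in other_cols) {
    new_df <- select(df, all_of(keys), all_of(col))
    # paste0() avoids the space that paste() would insert into the name
    assign(paste0("df_new_", col), new_df, envir = envir)
  }
  invisible(other_cols)
}

new_datasets_sketch(demo_data)
```

With `demo_data`, this should leave `df_new_pop` and `df_new_age` in the global environment, each with three columns, and nothing for "country" or "year".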
Thanks! -- New R User