I'm struggling to write a function in R that takes a dataset and a set of column names, and returns a filtered dataset for every combination of the unique values in those columns (3 columns in my case).
My dataset looks like this:
df <- structure(list(name = c("Peter Doe", "John Gary", "Elsa Johnson",
"Mary Poppins", "Jesse Bogart"), sex = c("Male", "Male", "Female",
"Female", "Male"), class = c("Honors", "Core", "Core", "Honors",
"Honors"), grade = c("A", "A", "A", "B", "C")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -5L))
I tried to visualize my goal here:
I was hoping to create a new variable for each path through this map (e.g. male_honors_a <- the dataset filtered by those column values). I think I could build the names with the paste function, but I'm not sure about that either. More importantly, I'm struggling with how to nest for loops inside the function so that they filter on the unique values of each column.
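To make the naming idea concrete, here is a minimal sketch of how the combination names could be generated with expand.grid and paste. It assumes a plain data.frame copy of the example data, and the names `filters`, `combos`, and `combo_names` are hypothetical and chosen only for illustration.

```r
# Toy copy of the example data (only the filter columns are needed here).
df <- data.frame(
  sex   = c("Male", "Male", "Female", "Female", "Male"),
  class = c("Honors", "Core", "Core", "Honors", "Honors"),
  grade = c("A", "A", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical choice of filter columns.
filters <- c("sex", "class")

# One row per combination of the unique values in each filter column.
combos <- expand.grid(lapply(df[filters], unique), stringsAsFactors = FALSE)

# Paste each row's values together to get names such as "male_honors".
combo_names <- tolower(do.call(paste, c(combos, sep = "_")))
print(combo_names)
```

Note that expand.grid lists every possible combination, including ones that may have no matching rows in the data.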
I got as far as coding up a function that creates each subgroup individually (one per unique value of a single column), but I was not able to figure out how to combine the filters.
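subgroups <- function(df, filters, group = "none", name = "") {
  listofdfs <- list()
  for (i in filters) {
    # Unique values of the current filter column
    values <- unique(df[[i]])
    for (j in values) {
      # Keep the rows where column i equals j
      x <- df[df[[i]] == j, ]
      listofdfs[[paste(name, j, sep = "")]] <- x
    }
  }
  # Return a single subgroup if requested, otherwise the whole list
  if (group != "none") {
    return(listofdfs[[group]])
  } else {
    return(listofdfs)
  }
}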
subgroups(df, c("sex", "class", "grade"))
By running subgroups(df, c("sex", "class")), I would hope to get a list of data frames as output:
list(male_honors, male_core, female_honors, female_core)
in which the male_honors element is
# A tibble: 2 × 4
  name         sex   class  grade
1 Peter Doe    Male  Honors A
2 Jesse Bogart Male  Honors C
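For reference, the output described above could be sketched with base R's interaction and split instead of nested for loops. This is only one possible approach, not necessarily the best one. The function name `subgroups_by` is hypothetical, a plain data.frame stands in for the tibble, and the lower-cased names assume that no two values differ only by case.

```r
# Plain data.frame copy of the example data.
df <- data.frame(
  name  = c("Peter Doe", "John Gary", "Elsa Johnson", "Mary Poppins", "Jesse Bogart"),
  sex   = c("Male", "Male", "Female", "Female", "Male"),
  class = c("Honors", "Core", "Core", "Honors", "Honors"),
  grade = c("A", "A", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one data frame per observed combination of the filter columns.
subgroups_by <- function(df, filters) {
  # Cross the filter columns into a single grouping factor;
  # drop = TRUE removes combinations that have no rows.
  key <- interaction(df[filters], sep = "_", drop = TRUE)
  out <- split(df, key)
  # Lower-case the names to get e.g. "male_honors".
  names(out) <- tolower(names(out))
  out
}

result <- subgroups_by(df, c("sex", "class"))
print(result$male_honors)
```

Passing c("sex", "class", "grade") should work the same way and give names such as male_honors_a, with one list element per combination that actually occurs in the data.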
Would really appreciate any help!