I'm having issues with mapply, lists, and user generated functions.

Specifically, I want to turn specific variables (cyl, vs, and carb) into factors in the mtcars dataset, which for this example is stored as mtcars_example_df1.

I can do this the long way:

mtcars_example_df1$cyl <- as.factor(as.character(mtcars_example_df1$cyl))
mtcars_example_df1$vs <- as.factor(as.character(mtcars_example_df1$vs))
mtcars_example_df1$carb <- as.factor(as.character(mtcars_example_df1$carb))

I want to create a function that builds a version of the mtcars_example_df1 data frame with the factored variables. Here is my plan:

  1. Add a merging_column, a column that is just a numeric index of the rows.
  2. Use mapply and the user generated function called function_merging_data_frame__turn_dataset_variable_into_factor_form_with_premade_merging_column_b to create a list of data frames, each containing only the factor-transformed variable and its corresponding merging_column.
  3. Turn the list into 1 data frame with just 1 merging_column.
  4. Remove the non-factored variables from the mtcars_example_df1 example data.
  5. Merge the list data frame and the mtcars_example_df1 data to create the desired dataset.

Unfortunately, I cannot get the user generated function to work.

### creates function to turn into factor form, with merging dataset and premade_merging_column
function_merging_data_frame__turn_dataset_variable_into_factor_form_with_premade_merging_column_b <- 
  # ---- NOTE: turns variable into factor version of variable
  # ---- NOTE: variable_name ==  variable to be turned into a factor
  # ---- NOTE: dataset_name == dataset that contains variable name
  # ---- NOTE: returns data frame with the factor column and merging column only
  function(variable_name, dataset_name)
  {
    # ---- NOTE: # changes variable_name and dataset_name to object
    colmn1 <- variable_name
    nm1 <- dataset_name
    # ---- NOTE: inserts dataset into function
    dataset_funct_object_A <- 
      data.frame(
        get(nm1)
      )
    # ---- NOTE: transforms data into factor form
    dataset_funct_object_A$factor_variable <- as.factor(as.character(dataset_funct_object_A[[colmn1]]))
    # ---- NOTE: selects specific variables
    dataset_funct_object_B <- 
      dataset_funct_object_A %>% 
      select(dataset_funct_object_A$factor_variable, 
             dataset_funct_object_A$merging_column)
    # ---- NOTE: ## changes colnames
    names(dataset_funct_object_B)[names(dataset_funct_object_B) == "factor_variable"] <- paste(colmn1)
    names(dataset_funct_object_B)[names(dataset_funct_object_B) == "merging_column"] <- paste("merging_column",
                                                                                              colmn1,
                                                                                              sep="__")
    # ---- NOTE: returns appropriate object
    return(dataset_funct_object_B)
  }

### adds merging_column to data
mtcars_example_df1$merging_column <- seq.int(nrow(mtcars_example_df1))

### uses mapply to run factor 
# ---- NOTE: applies functions to appropriate variables
freq_checking_mlm_poisson_follow_up_test_marginal_emmeans_IV_condition_c <- 
  mapply(function_merging_data_frame__turn_dataset_variable_into_factor_form_with_premade_merging_column_b, 
         mtcars_example_df1_factor_variables_df$variable_factor, 
         mtcars_example_df1_factor_variables_df$dataset, 
         SIMPLIFY = FALSE)

Any advice to fix this problem is greatly appreciated.


Here is the full code for the practice example:

# stack overflow example

## loads appropriate packages
library(tidyverse)

## creates mtcars_example_df1
mtcars_example_df1 <- data.frame(mtcars)

## mtcars_example_df1 data
head(mtcars_example_df1)
colnames(mtcars_example_df1)
str(mtcars_example_df1)

## df with data about data frames and variables to turn into factors
mtcars_example_df1_factor_variables_df <- 
  data.frame(
    variable_factor = c("cyl", "vs", "carb"),
    dataset = "mtcars_example_df1"
  )

## long way of turning into factor

### uses manual input to complete task
mtcars_example_df1$cyl <- as.factor(as.character(mtcars_example_df1$cyl))
mtcars_example_df1$vs <- as.factor(as.character(mtcars_example_df1$vs))
mtcars_example_df1$carb <- as.factor(as.character(mtcars_example_df1$carb))

### checks results
str(mtcars_example_df1$cyl)
str(mtcars_example_df1$vs)
str(mtcars_example_df1$carb)

## short way

### creates function to turn into factor form, with merging dataset and premade_merging_column
function_merging_data_frame__turn_dataset_variable_into_factor_form_with_premade_merging_column_b <- 
  # ---- NOTE: turns variable into factor version of variable
  # ---- NOTE: variable_name ==  variable to be turned into a factor
  # ---- NOTE: dataset_name == dataset that contains variable name
  # ---- NOTE: returns data frame with the factor column and merging column only
  function(variable_name, dataset_name)
  {
    # ---- NOTE: # changes variable_name and dataset_name to object
    colmn1 <- variable_name
    nm1 <- dataset_name
    # ---- NOTE: inserts dataset into function
    dataset_funct_object_A <- 
      data.frame(
        get(nm1)
      )
    # ---- NOTE: transforms data into factor form
    dataset_funct_object_A$factor_variable <- as.factor(as.character(dataset_funct_object_A[[colmn1]]))
    # ---- NOTE: selects specific variables
    dataset_funct_object_B <- 
      dataset_funct_object_A %>% 
      select(dataset_funct_object_A$factor_variable, 
             dataset_funct_object_A$merging_column)
    # ---- NOTE: ## changes colnames
    names(dataset_funct_object_B)[names(dataset_funct_object_B) == "factor_variable"] <- paste(colmn1)
    names(dataset_funct_object_B)[names(dataset_funct_object_B) == "merging_column"] <- paste("merging_column",
                                                                                              colmn1,
                                                                                              sep="__")
    # ---- NOTE: returns appropriate object
    return(dataset_funct_object_B)
  }

### adds merging_column to data
mtcars_example_df1$merging_column <- seq.int(nrow(mtcars_example_df1))

### uses mapply to run factor 
# ---- NOTE: applies functions to appropriate variables
freq_checking_mlm_poisson_follow_up_test_marginal_emmeans_IV_condition_c <- 
  mapply(function_merging_data_frame__turn_dataset_variable_into_factor_form_with_premade_merging_column_b, 
         mtcars_example_df1_factor_variables_df$variable_factor, 
         mtcars_example_df1_factor_variables_df$dataset, 
         SIMPLIFY = FALSE)
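
A likely cause of the failure, offered as a hedged sketch rather than a confirmed diagnosis: inside the function, `select()` is given column values (`dataset_funct_object_A$factor_variable`) where it expects column names. The version below passes bare column names instead. The shorter name `make_factor_column` and the result name `factor_columns` are hypothetical, chosen only for readability, and the variable names are passed directly instead of through the lookup data frame.

```r
library(dplyr)

# Hypothetical shorter name for the function in the question
make_factor_column <- function(variable_name, dataset_name) {
  # look up the data frame by its name, as the original does with get()
  dataset <- get(dataset_name)
  # add the factor version of the requested variable
  dataset$factor_variable <- as.factor(as.character(dataset[[variable_name]]))
  # select() takes column names, not column values such as dataset$factor_variable
  result <- dataset %>% select(factor_variable, merging_column)
  # rename the two columns after the variable they belong to
  names(result)[names(result) == "factor_variable"] <- variable_name
  names(result)[names(result) == "merging_column"] <-
    paste("merging_column", variable_name, sep = "__")
  result
}

# same setup as in the question
mtcars_example_df1 <- data.frame(mtcars)
mtcars_example_df1$merging_column <- seq.int(nrow(mtcars_example_df1))

# one data frame per variable; the dataset name is recycled across the calls
factor_columns <- mapply(make_factor_column,
                         c("cyl", "vs", "carb"),
                         "mtcars_example_df1",
                         SIMPLIFY = FALSE)

str(factor_columns$cyl)
```

Because the first argument to `mapply` is an unnamed character vector, the resulting list is named after the variables, so each piece can be reached as `factor_columns$cyl`, `factor_columns$vs`, and `factor_columns$carb`.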

  • I'm confused... so you just want to extract variables, convert them to factors, and merge them back in? Why not just manipulate the original data frame? – slamballais May 17 '21 at 19:33
  • The function seems a bit extensive for a simple conversion process. See this [post](https://stackoverflow.com/q/22772279/1422451) on converting multiple columns to a specific type: `mtcars_example_df1[c("cyl", "vs", "carb")] <- lapply(mtcars_example_df1[c("cyl", "vs", "carb")], function(col) as.factor(as.character(col)))` – Parfait May 17 '21 at 19:33
  • You're loading `tidyverse`, so this is all a really long way to write `mtcars_example_df1 %>% mutate(across(all_of(mtcars_example_df1_factor_variables_df$variable_factor), ~ factor(as.character(.))))`. – Gregor Thomas May 17 '21 at 19:36
  • Or perhaps `mtcars_example_df1 %>% select(all_of(mtcars_example_df1_factor_variables_df$variable_factor)) %>% mutate(across(everything(), ~ factor(as.character(.))))`? – Gregor Thomas May 17 '21 at 19:41
  • @slamballais, I do want to do that using a list. The reason is that I have a script that I want to use with multiple datasets. If I use a list, it could reduce the individual code changes I need to make when adapting a script from one dataset to another. – Mel May 17 '21 at 20:46
  • To Parfait and Gregor Thomas, I think your methods should work. Will try them and give an update. It is long. I'm still not good at R, so much of my code is longer than necessary. – Mel May 17 '21 at 20:47
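
The `across()` suggestion from the comments can be wrapped in a small helper so the same script works with several datasets. This is a minimal sketch, assuming dplyr 1.0 or later; the names `convert_to_factors`, `factor_columns`, and `mtcars_factored` are hypothetical and do not appear in the original code.

```r
library(dplyr)

# columns to convert; swap this vector when moving to another dataset
factor_columns <- c("cyl", "vs", "carb")

# hypothetical helper: converts the named columns of any data frame to factors
convert_to_factors <- function(dataset, columns) {
  dataset %>%
    mutate(across(all_of(columns), ~ factor(as.character(.x))))
}

mtcars_factored <- convert_to_factors(mtcars, factor_columns)

# the three chosen columns are now factors; all other columns are unchanged
str(mtcars_factored[factor_columns])
```

Only the vector of column names and the data frame passed in need to change between datasets, which seems to cover the reuse goal without building and merging a list of data frames.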