
This is a follow-up to a previous question.

In my function, I would like to:

  1. create a new data frame with no missing values
  2. center some variables (i.e., iv1 and iv2), add them to the new data frame, and give the centered variables the prefix "centered_", i.e. "centered_man" and "centered_woman"

But when I run the code below, I get this error message: Error in `[.data.frame`(dataset, , c(iv1, iv2, dv)) : object 'man' not found. Can you help me?


# create example data
testData <- data.frame(man = c(9, 8, 3, 4, NA, 8),         
                       woman = c(5, 4, NA, NA, 1, 1),
                       love = c(1, 2, 3, 4, 5, NA))


# define the function

polynomial <- function(iv1, iv2, dv, dataset){
  # create a new data frame with no missing values in iv1, iv2, and dv
  dataTemp <- na.omit(dataset[, c(iv1, iv2, dv)])

  # add the centered variables to the new data frame - dataTemp
  dataTemp[, centered_iv1] <- scale(dataTemp[, iv1], center = TRUE, scale = FALSE)
  dataTemp[, centered_iv2] <- scale(dataTemp[, iv2], center = TRUE, scale = FALSE)

  # define the formula
  formula <- substitute(dv ~ centered_iv1 + centered_iv2 + I(centered_iv1^2) + I(centered_iv1 * centered_iv2) + I(centered_iv2^2))

  # run the formula
  model <- lm(formula = formula, data = dataset)
  return(summary(model))
}

# use the function

polynomial(iv1 = man,
           iv2 = woman, 
           dv = love,
           dataset = testData)
wh41e

2 Answers

The following assumes that you don't want to change how you call the function. See inline comments:

polynomial <- function(iv1, iv2, dv, dataset){

  ## turn the argument symbols into character strings:
  deparsed_iv1 <- deparse(substitute(iv1)) 
  deparsed_iv2 <- deparse(substitute(iv2))
  deparsed_dv <- deparse(substitute(dv))

  # create a new data frame with no missing values in iv1, iv2, and dv
  dataTemp <- na.omit(dataset[, c(deparsed_iv1, deparsed_iv2, deparsed_dv)])

  # add the centered variables to the new data frame - dataTemp
  ## use proper quoting to define new variables
  dataTemp[, paste0("centered_", deparsed_iv1)] <- scale(dataTemp[, deparsed_iv1], center = TRUE, scale = FALSE)
  dataTemp[, paste0("centered_", deparsed_iv2)] <- scale(dataTemp[, deparsed_iv2], center = TRUE, scale = FALSE)


  # define the formula
  ## fix the substitution 
  formula <- substitute(dv ~ civ1_symbol + civ2_symbol + I(civ1_symbol^2) + I(civ1_symbol * civ2_symbol) + I(civ2_symbol^2), 
                        list(dv = match.call()[["dv"]],
                             civ1_symbol = as.name(paste0("centered_", deparsed_iv1)),
                             civ2_symbol = as.name(paste0("centered_", deparsed_iv2))))

  # run the formula
  model <- lm(formula = formula, data = dataTemp)
  return(summary(model))
}

# use the function

polynomial(iv1 = man,
           iv2 = woman, 
           dv = love,
           dataset = testData)

#love ~ centered_man + centered_woman + I(centered_man^2) + I(centered_man * 
#    centered_woman) + I(centered_woman^2)
#
#Call:
#lm(formula = formula, data = dataTemp)
#
#Residuals:
#ALL 2 residuals are 0: no residual degrees of freedom!
#
#Coefficients: (4 not defined because of singularities)
#                                 Estimate Std. Error t value Pr(>|t|)
#(Intercept)                           1.5         NA      NA       NA
#centered_man                         -1.0         NA      NA       NA
#centered_woman                         NA         NA      NA       NA
#I(centered_man^2)                      NA         NA      NA       NA
#I(centered_man * centered_woman)       NA         NA      NA       NA
#I(centered_woman^2)                    NA         NA      NA       NA
#
#Residual standard error: NaN on 0 degrees of freedom
#Multiple R-squared:      1,    Adjusted R-squared:    NaN 
#F-statistic:   NaN on 1 and 0 DF,  p-value: NA
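
To make the deparse(substitute(...)) step easier to follow in isolation, here is a minimal sketch. The helper show_name is hypothetical and exists only for illustration; it is not part of the function above.

```r
# Hypothetical helper, for illustration only
show_name <- function(x) {
  # substitute(x) returns the unevaluated expression supplied as x;
  # deparse() turns that expression into a character string
  deparse(substitute(x))
}

# 'man' is never evaluated here, so no object called man has to exist
col <- show_name(man)

# The resulting string can then be used for ordinary column indexing
smallData <- data.frame(man = c(9, 8, 3), woman = c(5, 4, NA))
man_values <- smallData[, col]   # equivalent to smallData[, "man"]
```

This is why the rewritten function can index dataset with deparsed_iv1 and friends even though the caller passes bare names.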
Roland

In this line, man is undefined, because no object with that name exists where the function is called:

polynomial(iv1 = man,

Pass the column name as a string instead:

polynomial(iv1 = "man",
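
Quoting the arguments alone is probably not enough, since the body of the original function also refers to centered_iv1 and centered_iv2 as if they were objects. Below is a rough sketch of how the whole function might look if all three names are passed as strings. The name polynomial_chr and the use of reformulate() to build the formula are my own choices for illustration, not something taken from the question.

```r
testData <- data.frame(man = c(9, 8, 3, 4, NA, 8),
                       woman = c(5, 4, NA, NA, 1, 1),
                       love = c(1, 2, 3, 4, 5, NA))

# Hypothetical string-based variant: iv1, iv2 and dv are column names
polynomial_chr <- function(iv1, iv2, dv, dataset) {
  # keep only rows that are complete in the three variables
  dataTemp <- na.omit(dataset[, c(iv1, iv2, dv)])

  # build the names of the centered columns, e.g. "centered_man"
  c1 <- paste0("centered_", iv1)
  c2 <- paste0("centered_", iv2)

  # scale() returns a one-column matrix; as.numeric() makes it a plain vector
  dataTemp[[c1]] <- as.numeric(scale(dataTemp[[iv1]], center = TRUE, scale = FALSE))
  dataTemp[[c2]] <- as.numeric(scale(dataTemp[[iv2]], center = TRUE, scale = FALSE))

  # assemble the right-hand-side terms as strings, then build the formula
  rhs <- c(c1, c2,
           sprintf("I(%s^2)", c1),
           sprintf("I(%s * %s)", c1, c2),
           sprintf("I(%s^2)", c2))
  model_formula <- reformulate(rhs, response = dv)

  # fit on the cleaned data, not on the original dataset
  model <- lm(model_formula, data = dataTemp)
  summary(model)
}

fit <- polynomial_chr(iv1 = "man", iv2 = "woman", dv = "love",
                      dataset = testData)
```

Note that the example data has only two complete rows, so most coefficients cannot be estimated; a real data set would need more complete cases than model terms.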
Niels Holst