
standardize() in the arm package fails for me when I define the formula object using as.formula and use it in lm(formula, data = df).

Option A (which I don't want) standardizes inputs outside of lm. Option B tries (and fails) to standardize the lm object.

(Note: I am keeping my loop structure because my actual use case is more complicated.)

# create data
  library(arm)
  set.seed(324)
  df <- data.frame(y=sample(0:50, 100, replace=TRUE),
                   x1=sample(0:1, 100, replace=TRUE),
                   x2=sample(0:50, 100, replace=TRUE))

# rescale outside of lm for comparison
  df$x1Z <- rescale(df$x1, binary.inputs = "0/1")
  df$x2Z <- rescale(df$x2, binary.inputs = "0/1")

# actual use case has more vars
  var <- c("x1", "x2")
  varZ <- c("x1Z", "x2Z")

# Option A: lm on rescaled
  a <- data.frame(matrix(NA, nrow = 0, ncol = 6))
  for (i in seq_along(var)) {
    formula <- as.formula(paste("y ~", varZ[i])) # use standardized
    m1 <- lm(formula, data = df)
    ms1 <- summary(m1)
    a[i, 1] <- var[i]
    a[i, 2] <- coefficients(ms1)[1,1]  
    a[i, 3] <- coefficients(ms1)[2,1]  
    a[i, 4] <- coefficients(ms1)[2,4]  
    a[i, 5] <- confint(m1)[2,1]        
    a[i, 6] <- confint(m1)[2,2]        
  }
  names(a) <- c("predictor", "intercept", "est", "p", "L95CI", "U95CI")

# Option B: lm, rescaling within lm
  b <- data.frame(matrix(NA, nrow = 0, ncol = 6))
  for (i in seq_along(var)) {
    formula <- as.formula(paste("y ~", var[i]))  # use raw
    m2 <- lm(formula, data = df)
    m2Z <- standardize(m2, binary.inputs="0/1")  # error
    ms2 <- summary(m2Z)
    b[i, 1] <- var[i]
    b[i, 2] <- coefficients(ms2)[1,1]  
    b[i, 3] <- coefficients(ms2)[2,1]  
    b[i, 4] <- coefficients(ms2)[2,4]  
    b[i, 5] <- confint(m2Z)[2,1]  # CI from the standardized model, not the raw one
    b[i, 6] <- confint(m2Z)[2,2]
  }
  names(b) <- c("predictor", "intercept", "est", "p", "L95CI", "U95CI")

Just to show that standardize works when the formula is written directly in the lm call:

m3 <- lm(y ~ x2, data=df)
standardize(m3, binary.inputs="0/1")
Zheyuan Li
Eric Green

1 Answer


Use

m2 <- do.call("lm", list(formula = formula, data = quote(df)))

in your loop for Option B.

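Your issue is similar to this one: "Showing string in formula and not as variable in lm fit". With lm(formula, data = df), m2$call records only the symbol formula, not the formula it refers to. With do.call, the formula itself is passed, so m2$call contains the actual y ~ x1 (or y ~ x2). You want to keep a proper formula in m2$call.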
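To see the difference in the stored call, here is a minimal base-R sketch (arm is not needed for this part). The data frame `toy` and the names `f`, `m_sym` and `m_lit` are made up for illustration and are not from the question.

```r
# Toy data; all names here are illustrative
set.seed(1)
toy <- data.frame(y = rnorm(20), x1 = rbinom(20, 1, 0.5))
f <- as.formula("y ~ x1")

# Formula passed through a variable: the call records the symbol `f`
m_sym <- lm(f, data = toy)

# Call built with do.call: the call records the formula itself
m_lit <- do.call("lm", list(formula = f, data = quote(toy)))

print(m_sym$call)
print(m_lit$call)

# The fitted coefficients are the same; only the recorded call differs
stopifnot(isTRUE(all.equal(coef(m_sym), coef(m_lit))))
```

Note the quote(toy): it keeps the data argument as a symbol in the call. Without it, the whole data frame would be embedded in the call.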

If you want to know why this is important, see the source code of standardize:

getMethod("standardize", "lm")

This function works by extracting and analyzing the $call component of an lm object.
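As a rough illustration of what any code that reads $call has to work with, the base-R sketch below looks at the formula argument of the stored call for both ways of fitting. It does not reproduce the internals of standardize, and the names `toy`, `f`, `m_sym` and `m_lit` are hypothetical.

```r
# Toy data; all names here are illustrative
set.seed(1)
toy <- data.frame(y = rnorm(20), x1 = rbinom(20, 1, 0.5))
f <- as.formula("y ~ x1")

m_sym <- lm(f, data = toy)                                    # lm(formula, ...) style
m_lit <- do.call("lm", list(formula = f, data = quote(toy)))  # do.call style

# The formula argument as stored in the call
class_sym <- class(m_sym$call$formula)  # a bare symbol
class_lit <- class(m_lit$call$formula)  # a real formula

# Variable names recoverable from the call alone
vars_sym <- all.vars(m_sym$call$formula)  # only the name of the variable `f`
vars_lit <- all.vars(m_lit$call$formula)  # the model variables y and x1

print(list(class_sym = class_sym, vars_sym = vars_sym))
print(list(class_lit = class_lit, vars_lit = vars_lit))
```

So code that takes the variables from m$call$formula finds the name of the formula variable in the first case and the model variables in the second.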

Zheyuan Li