
Note: the formulas are rendered in black, so they are difficult to read in dark mode.

I would like to create a matrix of point estimates of \hat{R}_i(Y=y, X)

where Y_i|X_i\sim N(\beta_0+X_i\beta_1,\sigma^2)

and \hat{R}_i=\frac{1}{\sqrt{2\pi\hat{\sigma}^2}}\exp\Big(-\frac{1}{2\hat{\sigma}^2}(Y_i-\hat{\beta}_0-X_i\hat{\beta}_1)^2\Big)

for certain set values of Y (e.g. the percentiles of Y).

I have been trying to adapt code found here and there to my needs, starting from a function that is supposed to do something similar, but I haven't been successful.

Ideally I would also like to give the new variables (the point estimates of R for each level of Y) names that make sense.

# creating the dataset
set.seed(2020)
x.Gender <- rep(0:1, c(4000,6000)) 
x.Age <- round(abs(rnorm(10000, mean=45, sd=10))) 
Y <- x.Gender * 5 + x.Age / 5 + round(rnorm(10000, mean=0, sd=2)) 
data <- data.frame(x.Age, x.Gender, Y)
head(data)

#estimating the parameters of the normal function
lm.Y <- lm(formula = Y~x.Age+x.Gender, data=data) 

#I can compute point estimate of the values of the normal function at Y for each observation
data$GPS = dnorm(Y, mean = lm.Y$fitted, sd=summary(lm.Y)$sigma)

# I now would like to compute point estimates for a set of values of Y for all observations and store them in a matrix

# I store the quantiles of Y (the values at which I would like to compute the point estimates)
grid_val <- quantile(data$Y, probs = seq(0, .99, by = 0.01))


#this is where I fail:

GPS_grid <- for (i in 1:length(grid_val)) 
{cbind(mutate(sweep(data, 1, STATS=grid_val[i], FUN=dnorm(grid_val[i], mean = lm.Y$fitted, 
                                        sd=summary(lm.Y)$sigma))))}
#here the error message says that dnorm is not a function, character or symbol

Thanks in advance
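A minimal sketch of one way to build such a matrix, assuming the goal is an n-by-100 matrix whose entry [i, j] is the fitted normal density of observation i evaluated at the j-th grid value. Two things in the attempt above stand out: a `for` loop returns `NULL`, so assigning it to `GPS_grid` stores nothing, and `FUN` must be a function rather than the result of calling `dnorm(...)`. The names `mu_hat` and `sigma_hat` and the `GPS_p` column prefix are hypothetical choices for illustration, not part of the original code.

```r
# Recreate the data and the model from the question
set.seed(2020)
x.Gender <- rep(0:1, c(4000, 6000))
x.Age <- round(abs(rnorm(10000, mean = 45, sd = 10)))
Y <- x.Gender * 5 + x.Age / 5 + round(rnorm(10000, mean = 0, sd = 2))
data <- data.frame(x.Age, x.Gender, Y)
lm.Y <- lm(Y ~ x.Age + x.Gender, data = data)

mu_hat <- fitted(lm.Y)            # one fitted mean per observation
sigma_hat <- summary(lm.Y)$sigma  # residual standard error

# Grid of Y values at which the density is evaluated
grid_val <- quantile(data$Y, probs = seq(0, 0.99, by = 0.01))

# One column per grid value: density of N(mu_hat[i], sigma_hat^2) at that value
GPS_grid <- sapply(grid_val, function(y) dnorm(y, mean = mu_hat, sd = sigma_hat))

# Hypothetical naming scheme built from the quantile labels ("0%", "1%", ...)
colnames(GPS_grid) <- paste0("GPS_p", sub("%", "", names(grid_val), fixed = TRUE))

dim(GPS_grid)
```

If the estimates are wanted as columns of `data`, something like `cbind(data, GPS_grid)` should attach them under these names.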


  • perhaps `stats::dnorm` instead of `dnorm` will help the compiler, and I could be completely wrong. Nope, doesn't help. – Chris Apr 26 '22 at 13:41
  • Could be wrong again, but it looks like you have `STATS` and `FUN` in swapped positions for `sweep(`, and you're using the sweep function positionally, rather than named `STATS=dnorm(` and ... – Chris Apr 26 '22 at 13:51
  • Thanks. Actually the idea (I might be very wrong) was to use the quantile `grid_val[i]` as `STATS`, while the function I want `sweep` to apply is `dnorm`, which computes the point estimate of R for the value of Y given by `grid_val[i]` and the covariate values X in each row. – Luisa Collina Apr 26 '22 at 13:59
  • `sweep` finalizes after dnorm so `grid_val[i]),`, still naming them will help in the future, now understanding better, `STATS = grid_val[i], FUN = dnorm(grid_val[i])`, then `lm.Y`, perhaps... and just noticed your formulas above, but I'm in 'dark' mode so black on black. – Chris Apr 26 '22 at 14:29
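To illustrate the `STATS`/`FUN` point discussed in these comments, here is a small sketch on toy values (the vectors `mu_hat`, `sigma_hat`, `grid_val` and the matrix `Ymat` are hypothetical stand-ins, not the data from the question). It contrasts passing a call, which R evaluates before `sweep` ever sees it, with passing a function, which `sweep` then applies to the matrix and the expanded `STATS`.

```r
# Toy values standing in for the fitted means, the residual SD and the grid
mu_hat <- c(1, 2, 3)
sigma_hat <- 1
grid_val <- c(low = 0, high = 2)

# Each row repeats the grid, so entry [i, j] starts out as grid_val[j]
Ymat <- matrix(grid_val, nrow = length(mu_hat), ncol = length(grid_val),
               byrow = TRUE, dimnames = list(NULL, names(grid_val)))

# Passing a call: dnorm(...) is evaluated first, so FUN receives numbers
# and sweep() stops with an error; the message is captured instead of thrown
bad <- tryCatch(
  sweep(Ymat, 1, STATS = mu_hat, FUN = dnorm(0, mean = mu_hat, sd = sigma_hat)),
  error = function(e) conditionMessage(e)
)

# Passing a function: sweep() calls it with the matrix and with STATS
# expanded along the rows, so row i is evaluated with mean mu_hat[i]
good <- sweep(Ymat, 1, STATS = mu_hat,
              FUN = function(y, m) dnorm(y, mean = m, sd = sigma_hat))
```

Here `bad` holds the captured error message, while `good` holds the densities, with one row per mean and one column per grid value.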
