I want to estimate predicted values for the mean (mu) and standard deviation (sigma) based on a gamlss model. However, it is not clear to me how to extract a standard deviation for given values of x
The data frame I am using looks like this:
#> head(abdom)
# y x
# 59 12.29
# 64 12.29
# 56 12.29
Here is the code to fit a gamlss model:
library(gamlss)
fit = gamlss(y ~ cs(x), sigma.formula = ~ cs(x), data = abdom, family = BCPE)
I want to calculate z-scores based on this model using the following approach: z = (y - mu)/sigma . Therefore, I use this code to calculate mu and sigma for each value of y and calculate the z scores. 95% of the z scores should lie between -2 and 2.
using predict function
mu = predict(fit, newdata = abdom, type = "response", what = "mu")
si = predict(fit, newdata = abdom, type = "response", what = "sigma")
z_score1 = (abdom$y - mu) / si
hist(z_score1)
using centiles.pred function
z_score2 = centiles.pred(fit, xname = "x", xvalues = abdom$x, yval = abdom$y, type = "z-scores")
hist(z_score2)
This leads to the following plots:
for z_score1, most scores are not even close to lie between -2 and 2.
Another way to approach this is by plotting the mean and standard deviation:
# calculating mu +/- 2*sigma
pred_dat = data.frame(x = 10:45)
mu = predict(fit, newdata = pred_dat, type = "response", what = "mu")
si = predict(fit, newdata = pred_dat, type = "response", what = "sigma")
hi = mu + (2 * si)
lo = mu - (2 * si)
pred_dat$mu = mu
pred_dat$hi = hi
pred_dat$lo = lo
# plotting
ggplot(data = pred_dat, aes(x = x)) +
geom_point(data = abdom, aes(x = x, y = y)) +
geom_line(aes(y = mu), colour = "red") +
geom_line(aes(y = hi), colour = "blue") +
geom_line(aes(y = lo), colour = "blue")
yielding the following plot:
Again, 95% of the values should lie between the two blue lines (hi and lo). But the values of the standard deviations are so low that there seems to be only one line.
So my questions are: first question: what do the values derived from predict represent if not the standard deviation conditional to x?
second question: how can I predict the standard deviation for a given x-value?