I'm using the betareg function in R to fit a model, and then I need to predict a new Y from a new X. This is the data I read from Excel and use to train the model:

y   x1  x2  x3  x4
0,419634945 1,014238952 1,011464532 1,192017359 1,191387415
0,361534322 1,636566118 1,485164213 1,460187325 1,597295162
0,486509921 1,000498651 1,328485546 1,894474004 1,3618722
0,580568633 1,238241644 1,15981677  1,038092521 1,594942532
0,478963289 1,434048004 1,079663345 1,144157369 1,009562412
0,852646616 1,235856992 1,385227035 1,133296831 1,160178886
0,767787659 1,14211818  1,864746836 1,170483824 1,183169424
0,43807267  1,04318058  1,109918772 1,104738116 1,311819983
0,301183957 1,495354353 1,190626472 1,338857694 1,083106967
0,445263455 1,22351354  1,777189298 1,085002195 1,159102384

and this is the code used:

library(betareg)
library(openxlsx)

input_data<- read.xlsx("dati.xlsx")
Y <- input_data[,1]
X <- input_data[,2:ncol(input_data)]

beta_reg_fit <- betareg(formula = Y ~ data.matrix(X), link = "logit", link.phi = NULL, model = TRUE, y = TRUE, x = FALSE)

new_data <- data.frame(cbind(1.1, 1.2, 1.4, 1.3))

predictions <- predict(beta_reg_fit, new_data)

The variable new_data holds my new observation, but I get the following warning message when calling predict:

Warning message: 'newdata' had 1 row but variables found have 10 rows

Can anybody help me? I can't find the correct usage of the predict function. I need to train my model on some data and forecast the dependent variable for a single given set of independent variables.

Hack-R
user3758182

1 Answer

When you fit a model with the betareg function and then call predict, predict looks up the variables named in the model formula in newdata (the argument of predict, not your variable new_data). Your formula refers to X, but new_data has no column called X, so predict falls back on the X in your workspace, which still holds the 10 training rows. That mismatch (1 row in newdata, 10 rows found) is what the warning reports.

To solve the problem, overwrite X with the new observation before calling predict:

library(betareg)
library(openxlsx)

input_data<- read.xlsx("dati.xlsx")
Y <- input_data[,1]
X <- input_data[,2:ncol(input_data)]

beta_reg_fit <- betareg(formula = Y ~ data.matrix(X), link = "logit", link.phi = NULL, model = TRUE, y = TRUE, x = FALSE)

# Overwrite X with the new observation so that data.matrix(X) in the formula resolves to it
X <- data.frame(cbind(1.1, 1.2, 1.4, 1.3))
predictions <- predict(beta_reg_fit, X)
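
If you would rather not overwrite X, another option is to fit the model on named columns through the data argument, so that predict can match the columns of newdata by name. The following is only a sketch under some assumptions: the names train, fit, new_obs, pred, y and x1 to x4 are my own, and the training data is typed in from the question (with decimal points) instead of being read from dati.xlsx.

```r
library(betareg)

# Training data copied from the question, decimal commas replaced by points.
# The column names y, x1..x4 are assumed to match the header row of the sheet.
train <- data.frame(
  y  = c(0.419634945, 0.361534322, 0.486509921, 0.580568633, 0.478963289,
         0.852646616, 0.767787659, 0.43807267, 0.301183957, 0.445263455),
  x1 = c(1.014238952, 1.636566118, 1.000498651, 1.238241644, 1.434048004,
         1.235856992, 1.14211818, 1.04318058, 1.495354353, 1.22351354),
  x2 = c(1.011464532, 1.485164213, 1.328485546, 1.15981677, 1.079663345,
         1.385227035, 1.864746836, 1.109918772, 1.190626472, 1.777189298),
  x3 = c(1.192017359, 1.460187325, 1.894474004, 1.038092521, 1.144157369,
         1.133296831, 1.170483824, 1.104738116, 1.338857694, 1.085002195),
  x4 = c(1.191387415, 1.597295162, 1.3618722, 1.594942532, 1.009562412,
         1.160178886, 1.183169424, 1.311819983, 1.083106967, 1.159102384)
)

# Name each predictor in the formula and pass the data frame via `data`
fit <- betareg(y ~ x1 + x2 + x3 + x4, data = train, link = "logit")

# The new observation uses the same column names as the training data
new_obs <- data.frame(x1 = 1.1, x2 = 1.2, x3 = 1.4, x4 = 1.3)

# Predicted mean of y for the single new observation
pred <- predict(fit, newdata = new_obs, type = "response")
print(pred)
```

Because the formula now refers to x1 to x4 and new_obs has columns with exactly those names, predict takes the values from new_obs and nothing in the workspace has to be replaced.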
Erdi