I'm building a linear model using ols() from the rms package in R with:
model <- ols(nallSmells ~ rcs(size, 5) + rcs(minor, 5) + rcs(change_churn, 3) +
             rcs(review_rate, 0), data = quality, x = TRUE, y = TRUE)
When I want to validate my model using:
validate(model,B=100)
I get the following error:
Error in lsfit(x, y) : only 0 cases, but 2 variables
In addition: Warning message:
In lsfit(x, y) : 1164 missing values deleted
But if I decrease B, e.g., B=10, it works. Why can't I run more bootstrap iterations? I also noticed that the seed has an effect when I use this method. Can someone give me some advice?
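To make the seed dependence concrete, here is a minimal base-R sketch. It is not the rms internals: the helper name boot_indices is my own, and I am assuming that validate() draws its resamples with R's random number generator. Under that assumption, calling set.seed() right before validate() should make a run repeatable:

```r
# Hypothetical helper: draw the row indices for B bootstrap resamples of n rows.
boot_indices <- function(n, B) {
  replicate(B, sample.int(n, size = n, replace = TRUE), simplify = FALSE)
}

set.seed(123)
first_run <- boot_indices(n = 1164, B = 10)

set.seed(123)
second_run <- boot_indices(n = 1164, B = 10)

# Same seed, same resamples.
identical(first_run, second_run)

set.seed(456)
third_run <- boot_indices(n = 1164, B = 10)

# A different seed gives different resamples, so a run that fails under
# one seed may succeed under another.
identical(first_run, third_run)
```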
UPDATE:
I'm using rcs(review_rate,0) because I want to assign 0 knots to this predictor, according to my degrees-of-freedom budget. I noticed that the problem is with the data in review_rate: even if I drop rcs() and just use the name of the predictor, I get errors. This is the frequency of the values in review_rate:
count(quality$review_rate)
          x freq
1 0.8571429    1
2 0.9483871    1
3 0.9789474    1
4 0.9887640    1
5 0.9940476    1
6 1.0000000 1159
I wonder whether the problem is related to the values of this vector, because when I build the OLS model I get the following warning:
Warning message:
In rcspline.eval(x, nk = nknots, inclx = TRUE, pc = pc, fractied = fractied) :
5 knots requested with 6 unique values of x. knots set to 4 interior values.
The values of the other predictors are positive real numbers, and if I omit the review_rate predictor I don't get any warning or error.
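One check I can do with the frequencies above is to estimate how often a bootstrap resample would contain only review_rate == 1, which would make the predictor constant in that resample. This is a rough base-R sketch; that a constant predictor is what actually triggers the lsfit error is my guess, not something I have confirmed, and the names p_constant and p_any_constant are my own:

```r
# Frequencies reported by count(quality$review_rate).
freq <- c(1, 1, 1, 1, 1, 1159)
n <- sum(freq)                     # 1164 rows in total
p_one <- freq[length(freq)] / n    # share of rows with review_rate == 1

# Probability that one bootstrap resample of n rows contains only
# review_rate == 1, i.e. the predictor is constant in that resample.
p_constant <- p_one^n

# Probability that at least one of B resamples is degenerate in this way.
p_any_constant <- function(B) 1 - (1 - p_constant)^B

p_constant
p_any_constant(10)
p_any_constant(100)
```

If I have the arithmetic right, a single resample is degenerate a bit less than 1% of the time, which makes a failure unlikely with B=10 but roughly a coin flip with B=100. That would also be consistent with the seed mattering.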
Thanks for your support.
Here is a link to a sample of 100 rows of my data for replication:
https://www.dropbox.com/s/oks2ztcse3l8567/examplestackoverflow.csv?dl=0
X represents the dependent variable and Y4 is the predictor that is giving me problems.
library(rms)
Data <- read.csv("examplestackoverflow.csv")
testmodel <- ols(X ~ rcs(Y1) + rcs(Y2) + rcs(Y3) + rcs(Y4), data = Data, x = TRUE, y = TRUE)
validate(testmodel, B = 1000)
Kind regards,