
I am trying to fit a naive Bayes classifier and get the following error:

Error in density.default(x, na.rm = TRUE, ...) : 
  need at least 2 points to select a bandwidth automatically

I only get this error when I include a numeric variable. Oddly enough, I can produce a density plot of that same variable. The classifier works until I include the numeric variable, and I have no missing data.

Below is my implementation of the predictor:

pred1 <- naive_bayes(x = reg_pred_train[1:4], y = reg_pred_train$WeightedScore, 
                     usekernel = TRUE, laplace = 5)

The packages loaded are dplyr and naivebayes. Below is a summary of the data:

Gender   Coalesce Race  AgeBracket  VP                WeightedScore      
 F:755   Asian   :  15   18-25: 13   Min.   :0.1162    Min.   : 0.000   
 M:878   Black   :  91   25-35: 68   1st Qu.:0.8905    1st Qu.: 5.000    
 U:  5   Hispanic:  24   35-50:258   Median :0.9379    Median : 6.000    
         Other   :   6   50-65:449   Mean   :0.8970    Mean   : 6.145    
         Unknown :  49   65-75:461   3rd Qu.:0.9618    3rd Qu.: 8.000    
         White   :1453   75+  :389   Max.   :0.9933    Max.   :10.000    
  • [See here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on making an R question that folks can help with. That includes a sample of data and all necessary code including packages used. – camille Dec 27 '19 at 19:09
  • If you do not respond to reasonable requests for clarification, your questions will be closed. – IRTFM Dec 28 '19 at 02:15
  • It looks like the reason for the error is the following: the "U" class (undefined gender) has only five instances. Given undefined gender, at least one numeric variable has at least 4 missing values, which are removed because na.rm = TRUE. Density estimation is therefore not possible, and an error is raised. – Michal Majka Jan 03 '20 at 14:44
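
The mechanism described in the last comment can be checked with a small sketch in base R. It assumes the classifier estimates one kernel density per class for each numeric predictor, so any class left with fewer than 2 non-missing values triggers the error. The data frame `toy` and its columns `cls` and `num` are hypothetical stand-ins, not the poster's data.

```r
# Hypothetical toy data: class "c" has only one value of the numeric predictor.
toy <- data.frame(
  cls = factor(c("a", "a", "a", "b", "b", "c")),
  num = c(0.91, 0.88, 0.95, 0.93, 0.90, 0.97)
)

# Count non-missing values of the numeric predictor within each class.
n_per_class <- tapply(toy$num, toy$cls, function(v) sum(!is.na(v)))

# Classes with fewer than 2 points cannot get an automatic bandwidth.
too_small <- names(n_per_class)[n_per_class < 2]
print(too_small)

# Calling density() on that single point reproduces the error message.
msg <- tryCatch(
  {
    density(toy$num[toy$cls == "c"], na.rm = TRUE)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

On the poster's data, the same count would be taken over the class variable passed as `y` (here `WeightedScore`), for example with `table(reg_pred_train$WeightedScore)`, to see whether any level has fewer than 2 rows.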
