I am receiving this error when fitting a zero-inflated negative binomial model: `Error in quantile.default(x$residuals) : missing values and NaN's not allowed if 'na.rm' is FALSE`

I read other posts where this occurred; some suggested it might be because of NAs in the data, others suggested that the variable in question was not a numeric variable.

I confirmed that the variable avgcontractvalue has no NAs and is numeric. Here is a sample of its values:

avgcontractvalue=c(0, 0.01, 0.015, 0.02, 0.025, 0.03, 0.0333333333333333, 0.035, 0.036, 0.0366666666666667, 0.038, 0.04)
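
For reference, a minimal sketch of the kind of check described above, run on the sample values. The helper `check_numeric` is a hypothetical name written for this post, not a function from any package. It also counts infinite values, which `is.na()` does not flag.

```r
# Sample values from the question; the real column lives in df
avgcontractvalue <- c(0, 0.01, 0.015, 0.02, 0.025, 0.03, 0.0333333333333333,
                      0.035, 0.036, 0.0366666666666667, 0.038, 0.04)

# Hypothetical helper (not from any package): count problematic values
check_numeric <- function(x) {
  c(is_numeric = is.numeric(x),
    n_na       = sum(is.na(x)),        # is.na() is TRUE for both NA and NaN
    n_nan      = sum(is.nan(x)),
    n_infinite = sum(is.infinite(x)))  # e.g. log(0) gives -Inf
}

checks <- check_numeric(avgcontractvalue)
print(checks)
```

The same helper could be applied to the transformed terms in the model, for example `check_numeric(log10(1 + df$bc))` and `check_numeric(log(df$offset))`, since a logarithm can produce `NaN` or `-Inf` even when the raw column contains no NAs.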

I tried to create sample data that reproduces the error, but so far have not been able to.

Here is the command I am using:

fit.zinb <- zeroinfl(y ~ log10(1 + bc) * iscovid + avgcontractvalue | 1,
                     dist = "negbin", link = "logit", offset = log(offset),
                     data = df)

Could the issue be perfect separation? No other variable gives this error when included in the model, but I'm not sure how to test whether perfect separation is the cause. Any ideas are much appreciated. Apologies for not being able to provide sample data.

str(df) yields:


grouped_df[,6] [2,097,277 x 6] (S3: grouped_df/tbl_df/tbl/data.frame)
 $ tendereren      : chr [1:2097277] "<U+041C><U+041E><U+0417> <U+0423><U+043A><U+0440><U+0430><U+0457><U+043D><U+0438> | 00012925" "<U+041C><U+041E><U+0417> <U+0423><U+043A><U+0440><U+0430><U+0457><U+043D><U+0438> | 00012925" "<U+041A><U+041E><U+041D><U+0421><U+0422><U+0418><U+0422><U+0423><U+0426><U+0406><U+0419><U+041D><U+0418><U+0419"| __truncated__ "<U+0420><U+0410><U+0425><U+0423><U+041D><U+041A><U+041E><U+0412><U+0410> <U+041F><U+0410><U+041B><U+0410><U+042"| __truncated__ ...
 $ bc              : num [1:2097277] 1508804 1508804 1464341 1011615 3110461 ...
 $ iscovid_month   : num [1:2097277] 1 1 1 1 0 1 1 1 1 1 ...
 $ avgcontractvalue: num [1:2097277] 334.22 324.38 0.03 4710.63 256.11 ...
 $ y     : num [1:2097277] 0 0 0 0 0 0 0 0 0 0 ...
 $ offset   : int [1:2097277] 1 2 1 1 1 1 1 1 1 1 ...
 - attr(*, "groups")= tibble[,2] [354,112 x 2] (S3: tbl_df/tbl/data.frame)
  ..$ tendereren: chr [1:354112] "' Elena Mikhailovna | 2349816526" "'kyi municipal enterprise \"BERAPROST\" | 31273334" "'menets'kyi REM | 130760" "'Phase-N' | 40161563" ...
  ..$ .rows     : list<int> [1:354112] 
  .. ..$ : int 492652
  .. ..$ : int [1:7] 541217 639048 651626 938829 1185960 1628479 1679886
  .. ..$ : int [1:3] 256340 658815 679255
  .. ..$ : int [1:9] 265208 289648 339159 417554 649852 1180562 1611103 1878222 1980464
  .. ..$ : int 462567
  .. ..$ : int [1:3] 463915 467303 541675
  .. ..$ : int [1:29] 233664 297616 336391 370997 466842 499574 530920 641250 741655 785447 ...
  .. ..$ : int [1:4] 625705 1814029 1873246 2089453
  .. ..$ : int 972790
  .. ..$ : int [1:2] 1686500 1844516
  .. ..$ : int [1:7] 292888 635219 761081 945664 1011539 1701894 1744966
  .. ..$ : int [1:2] 524536 577889
  .. ..$ : int [1:9] 742395 788185 888699 1135612 1265377 1383289 1804695 1881300 2050082
  .. ..$ : int [1:2] 891246 987278
  .. ..$ : int [1:7] 386067 759192 825774 988015 1362075 1389220 1895858
  .. ..$ : int 937678
  .. ..$ : int 544483
  .. ..$ : int [1:2] 932144 1375352
  .. ..$ : int 926588
  .. ..$ : int [1:28] 224624 333342 382296 501539 536447 556983 799902 832753 853716 892521 ...
  .. ..$ : int 775963
  .. ..$ : int [1:2] 1126090 1157531
  .. ..$ : int 997924
  .. ..$ : int 934959
  .. ..$ : int [1:4] 912765 970666 1861143 1977875
  .. ..$ : int [1:7] 300433 412062 745511 1620143 1895861 1954472 2080261
  .. ..$ : int 744439
  .. ..$ : int [1:24] 288457 344392 405003 482949 740512 787971 822717 831155 976478 1210026 ...
  .. ..$ : int [1:16] 210023 358825 389807 538839 771104 828757 993872 1218611 1222058 1304384 ...
  .. ..$ : int [1:3] 279805 331496 1701169
  .. ..$ : int [1:12] 218928 277545 408455 770974 800538 974122 1388265 1398658 1823772 1883440 ...
  .. ..$ : int 958735
  .. ..$ : int [1:6] 740403 836297 918292 1262856 1263329 1385681
  .. ..$ : int 1001355
  .. ..$ : int [1:3] 752477 778214 1094179
  .. ..$ : int [1:3] 988025 1507189 2017487
  .. ..$ : int [1:2] 1282416 1961778
  .. ..$ : int 1346301
  .. ..$ : int [1:4] 1226928 1753208 2018328 2089892
  .. ..$ : int 1525208
  .. ..$ : int [1:11] 826653 1180482 1192868 1201883 1211342 1217351 1217661 1221824 1475010 1499849 ...
  .. ..$ : int [1:12] 244847 281370 405932 748532 1619008 1690741 1734935 1769544 1857874 1915638 ...
  .. ..$ : int 901251
  .. ..$ : int [1:8] 507746 776781 828212 1267231 1275363 1280279 1297113 1910354
  .. ..$ : int [1:6] 579416 590198 609925 616994 644882 661202
  .. ..$ : int 1939687
  .. ..$ : int 1842098
  .. ..$ : int 1917404
  .. ..$ : int 1837756
  .. ..$ : int 1889649
  .. ..$ : int 298180
  .. ..$ : int 1317058
  .. ..$ : int 302138
  .. ..$ : int 1495113
  .. ..$ : int 2089454
  .. ..$ : int 389229
  .. ..$ : int 1869762
  .. ..$ : int 312236
  .. ..$ : int 2094540
  .. ..$ : int 1856886
  .. ..$ : int [1:2] 299692 390017
  .. ..$ : int 1779820
  .. ..$ : int 1610124
  .. ..$ : int 1943638
  .. ..$ : int 1797556
  .. ..$ : int 298178
  .. ..$ : int 301403
  .. ..$ : int 1975049
  .. ..$ : int 1919491
  .. ..$ : int [1:2] 256104 2088455
  .. ..$ : int 230740
  .. ..$ : int 255871
  .. ..$ : int [1:2] 250898 300869
  .. ..$ : int 419992
  .. ..$ : int 427909
  .. ..$ : int 428759
  .. ..$ : int 410116
  .. ..$ : int 1656987
  .. ..$ : int 301266
  .. ..$ : int 1665254
  .. ..$ : int 864668
  .. ..$ : int 527916
  .. ..$ : int [1:21] 32498 32499 194559 497299 507773 581148 610145 688240 707126 774309 ...
  .. ..$ : int 546047
  .. ..$ : int [1:12] 32500 1025155 1037153 1060193 1064161 1068560 1229433 1438222 1453252 1454780 ...
  .. ..$ : int [1:2] 1109930 1200560
  .. ..$ : int [1:11] 32501 32502 32503 32504 432717 435909 690565 694298 723476 772270 ...
  .. ..$ : int 1386396
  .. ..$ : int [1:6] 32505 32506 699138 704411 711937 758688
  .. ..$ : int [1:7] 358277 391510 1111715 1135284 1331873 1520200 1588250
  .. ..$ : int [1:10] 192583 207459 237364 437979 444860 504226 759684 974135 1175305 1417777
  .. ..$ : int [1:7] 260859 333979 391394 1238986 1597551 1854801 1884383
  .. ..$ : int [1:2] 1084803 1639862
  .. ..$ : int 1156839
  .. ..$ : int [1:2] 1161397 1259682
  .. ..$ : int 1382213
  .. ..$ : int [1:5] 32507 196818 1090314 1100185 1119835
  .. ..$ : int [1:7] 32508 1022205 1025244 1167778 1242423 1417892 1946858
  .. ..$ : int 1858251
  .. .. [list output truncated]
  .. ..@ ptype: int(0) 
  ..- attr(*, ".drop")= logi TRUE

(tendereren is the id variable)

statnerd
  • You have an unmatched quote in that code for `var-that-gives-error` – IRTFM May 04 '21 at 03:57
  • Nice catch, that was a typo from me copying from the print function. The real data doesn't have quotes in it, it's numeric – statnerd May 04 '21 at 04:20
  • `var-that-gives-error` is not a valid variable name in R. Are you using it just as an example? If you can't provide sample data can you show what does `str(df)` return? What is actual name of `var-that-gives-error` ? – Ronak Shah May 04 '21 at 04:47
  • Updated with the `str(df)` output. Yes, apologies for the confusion; `var-that-gives-error` was an example. I updated the question with the actual names – statnerd May 04 '21 at 04:55
  • It's not at all clear why you are presenting a value for `avgcontractvalue`. The error concerns `x$residuals`. Using log functions is a typical way of generating NA values. I would have examined `summary(df)` and especially `summary( df$bc )`. AND I don't see any `y` column in the df object. – IRTFM May 04 '21 at 13:34
  • Searching SO on the error message gives: https://stackoverflow.com/questions/36312771/making-zero-inflated-or-hurdle-model-with-r – IRTFM May 04 '21 at 13:48
  • @IRTFM ah, apologies, the `y` is updated. I had run `summary(df)` and `summary(df$bc)`, and there are many 0s in the independent variable (99%), which would lend credibility to the idea of perfect separation, as in the post you linked. But I'm struggling to understand how to adjust the variable to avoid such perfect separation (if that's possible) – statnerd May 04 '21 at 14:42
  • Tabular examinations might help. Perhaps `table(y==0 , avgcontractvalue==0)` to start. And if that is not illuminating then look at other potential degeneracies. – IRTFM May 04 '21 at 20:24
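
A minimal sketch of the tabular check suggested in the last comment, run on made-up data. The data frame `toy` and all its values are hypothetical and deliberately constructed to be degenerate; nothing here is drawn from the real `df`. An empty cell in the cross-tabulation would hint that one group never shows one of the two outcomes, which is the pattern associated with separation, though it is not proof on its own.

```r
# Made-up data for illustration only; not the real df
set.seed(1)
n <- 1000
toy <- data.frame(
  avgcontractvalue = ifelse(runif(n) < 0.9, 0, runif(n, 0.01, 5))
)

# Deliberate degeneracy: every row with a nonzero avgcontractvalue gets y == 0
toy$y <- ifelse(toy$avgcontractvalue == 0, rpois(n, 0.5), 0)

# Cross-tabulate "y is zero" against "predictor is zero"
tab <- table(y_is_zero   = toy$y == 0,
             acv_is_zero = toy$avgcontractvalue == 0)
print(tab)

# TRUE if any combination of the two indicators never occurs
has_empty_cell <- any(tab == 0)
```

On the real data the equivalent call would be `table(df$y == 0, df$avgcontractvalue == 0)`, and the same pattern could be repeated for `bc` or within levels of the covid indicator.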

0 Answers