
I am trying to predict next-day stock price movement from technical indicators using various machine learning algorithms. However, something seems to be wrong with my data: I always run into some kind of error when I try to predict, tune hyperparameters, and so on. Can someone spot what is wrong with my data?

When I run the following code I get the error message:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

The code and its output:

> # 10-fold cross-validation, repeated 3 times
> control <- trainControl(method='repeatedcv', 
+                         number=10, 
+                         repeats=3)
> # Metric used to compare models: Accuracy
> metric <- "Accuracy"
> 
> # mtry: number of variables randomly sampled at each split
> mtry <- sqrt(ncol(train))
> tunegrid <- expand.grid(.mtry=mtry)
> rf_default <- caret::train(sign~.-Date -company, 
+                     data=data, 
+                     method='rf', 
+                     metric='Accuracy', 
+                     tuneGrid=tunegrid, 
+                     trControl=control)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels
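
As a side note on the `mtry` line in the transcript: it is computed from an object called `train`, while the model is fitted on `data`. A minimal sketch of deriving `mtry` from the predictors actually used is shown here, assuming the column names listed in the `str(data)` output in this question. The names `cols` and `predictors` are introduced only for illustration, and `floor(sqrt(p))` is the usual default for classification forests rather than anything the original code specifies.

```r
# Column names as listed in str(data)
cols <- c("Date", "company", "MOM5", "MOM10", "MOM14", "MA5", "MA14", "MA30",
          "EMA5", "EMA14", "EMA30", "sd7", "sd14", "sd21", "atr", "cci",
          "macd", "signal", "rsi", "volat", "sign")

# Predictors: everything except the identifier columns and the response
predictors <- setdiff(cols, c("Date", "company", "sign"))

# Common default for classification forests: floor(sqrt(p))
mtry <- floor(sqrt(length(predictors)))

# Tuning grid for caret's method = 'rf' has a single column, mtry
tunegrid <- expand.grid(mtry = mtry)
```

With the 18 predictors listed above this gives `mtry = 4`. It does not by itself address the contrasts error.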


Here's what my data looks like:

> str(data)
grouped_df [499,585 x 21] (S3: grouped_df/tbl_df/tbl/data.frame)
 $ Date   : Date[1:499585], format: "2002-05-14" "2002-05-14" "2002-05-14" "2002-05-14" ...
 $ company: chr [1:499585] "ATLAS COPCO" "HEXAGON AB" "ASSA ABLOY AB" "VOLVO AB" ...
 $ MOM5   : num [1:499585] 0.0302 0.0437 0.0338 0.0309 -0.0253 ...
 $ MOM10  : num [1:499585] 0.0565 0.019 0.0338 0.0179 -0.1085 ...
 $ MOM14  : num [1:499585] 0.124 0.0437 0.1269 -0.0171 -0.0335 ...
 $ MA5    : num [1:499585] 5.76 2.12 45.2 35.75 86.67 ...
 $ MA14   : num [1:499585] 5.6 2.08 44.08 35.39 88.86 ...
 $ MA30   : num [1:499585] 5.52 2 43.57 35.64 112.76 ...
 $ EMA5   : num [1:499585] 5.77 2.13 45.17 35.74 85.74 ...
 $ EMA14  : num [1:499585] 5.65 2.08 44.39 35.57 92.46 ...
 $ EMA30  : num [1:499585] 5.53 2 43.67 35.68 113.14 ...
 $ sd7    : num [1:499585] 0.008 0.0107 0.019 0.0134 0.0394 ...
 $ sd14   : num [1:499585] 0.014 0.0211 0.0212 0.0145 0.0428 ...
 $ sd21   : num [1:499585] 0.0172 0.0191 0.0251 0.0185 0.0679 ...
 $ atr    : num [1:499585] 1.15 1.15 1.15 1.17 1.3 ...
 $ cci    : num [1:499585] -148 -137 -127 -101 -122 ...
 $ macd   : num [1:499585] -3.65 -3.85 -3.96 -3.79 -4.2 ...
 $ signal : num [1:499585] -2.84 -3.04 -3.23 -3.34 -3.51 ...
 $ rsi    : num [1:499585] 25.4 23.8 23.8 34.9 24.8 ...
 $ volat  : num [1:499585] 0.4 0.414 0.401 0.413 0.499 ...
 $ sign   : Factor w/ 2 levels "0","1": 2 2 1 2 2 2 1 2 1 2 ...
 - attr(*, "groups")= tibble [97 x 2] (S3: tbl_df/tbl/data.frame)
  ..$ company: chr [1:97] "ACTIVE BIOTECH" "ADDNODE GROUP AB" "ADDTECH AB" "AFRY" ...
  ..$ .rows  : list<int> [1:97] 
  .. ..$ : int [1:5151] 56 153 250 347 444 541 638 735 832 929 ...
  .. ..$ : int [1:5153] 24 121 218 315 412 509 606 703 800 897 ...
  .. ..$ : int [1:5152] 25 122 219 316 413 510 607 704 801 898 ...
  .. ..$ : int [1:5151] 26 123 220 317 414 511 608 705 802 899 ...
  .. ..$ : int [1:5150] 57 154 251 348 445 542 639 736 833 930 ...
  .. ..$ : int [1:5153] 3 100 197 294 391 488 585 682 779 876 ...
  .. ..$ : int [1:5153] 1 98 195 292 389 486 583 680 777 874 ...
  .. ..$ : int [1:5150] 27 124 221 318 415 512 609 706 803 900 ...
  .. ..$ : int [1:5151] 28 125 222 319 416 513 610 707 804 901 ...
  .. ..$ : int [1:5150] 58 155 252 349 446 543 640 737 834 931 ...
  .. ..$ : int [1:5150] 14 111 208 305 402 499 596 693 790 887 ...
  .. ..$ : int [1:5152] 59 156 253 350 447 544 641 738 835 932 ...
  .. ..$ : int [1:5153] 29 126 223 320 417 514 611 708 805 902 ...
  .. ..$ : int [1:5152] 30 127 224 321 418 515 612 709 806 903 ...
  .. ..$ : int [1:5152] 31 128 225 322 419 516 613 710 807 904 ...
  .. ..$ : int [1:5153] 32 129 226 323 420 517 614 711 808 905 ...
  .. ..$ : int [1:5152] 60 157 254 351 448 545 642 739 836 933 ...
  .. ..$ : int [1:5155] 33 130 227 324 421 518 615 712 809 906 ...
  .. ..$ : int [1:5148] 15 112 209 306 403 500 597 694 791 888 ...
  .. ..$ : int [1:5154] 34 131 228 325 422 519 616 713 810 907 ...
  .. ..$ : int [1:5152] 61 158 255 352 449 546 643 740 837 934 ...
  .. ..$ : int [1:5152] 62 159 256 353 450 547 644 741 838 935 ...
  .. ..$ : int [1:5152] 63 160 257 354 451 548 645 742 839 936 ...
  .. ..$ : int [1:5150] 64 161 258 355 452 549 646 743 840 937 ...
  .. ..$ : int [1:5149] 65 162 259 356 453 550 647 744 841 938 ...
  .. ..$ : int [1:5150] 66 163 260 357 454 551 648 745 842 939 ...
  .. ..$ : int [1:5146] 10 107 204 301 398 495 592 689 786 883 ...
  .. ..$ : int [1:5146] 16 113 210 307 404 501 598 695 792 889 ...
  .. ..$ : int [1:5150] 67 164 261 358 455 552 649 746 843 940 ...
  .. ..$ : int [1:5147] 68 165 262 359 456 553 650 747 844 941 ...
  .. ..$ : int [1:5149] 17 114 211 308 405 502 599 696 793 890 ...
  .. ..$ : int [1:5152] 35 132 229 326 423 520 617 714 811 908 ...
  .. ..$ : int [1:5148] 69 166 263 360 457 554 651 748 845 942 ...
  .. ..$ : int [1:5149] 18 115 212 309 406 503 600 697 794 891 ...
  .. ..$ : int [1:5152] 36 133 230 327 424 521 618 715 812 909 ...
  .. ..$ : int [1:5150] 6 103 200 297 394 491 588 685 782 879 ...
  .. ..$ : int [1:5153] 2 99 196 293 390 487 584 681 778 875 ...
  .. ..$ : int [1:5152] 37 134 231 328 425 522 619 716 813 910 ...
  .. ..$ : int [1:5151] 38 135 232 329 426 523 620 717 814 911 ...
  .. ..$ : int [1:5153] 70 167 264 361 458 555 652 749 846 943 ...
  .. ..$ : int [1:5152] 71 168 265 362 459 556 653 750 847 944 ...
  .. ..$ : int [1:5148] 39 136 233 330 427 524 621 718 815 912 ...
  .. ..$ : int [1:5151] 72 169 266 363 460 557 654 751 848 945 ...
  .. ..$ : int [1:5151] 40 137 234 331 428 525 622 719 816 913 ...
  .. ..$ : int [1:5151] 41 138 235 332 429 526 623 720 817 914 ...
  .. ..$ : int [1:5150] 42 139 236 333 430 527 624 721 818 915 ...
  .. ..$ : int [1:5148] 11 108 205 302 399 496 593 690 787 884 ...
  .. ..$ : int [1:5151] 73 170 267 364 461 558 655 752 849 946 ...
  .. ..$ : int [1:5152] 74 171 268 365 462 559 656 753 850 947 ...
  .. ..$ : int [1:5151] 75 172 269 366 463 560 657 754 851 948 ...
  .. ..$ : int [1:5153] 76 173 270 367 464 561 658 755 852 949 ...
  .. ..$ : int [1:5151] 43 140 237 334 431 528 625 722 819 916 ...
  .. ..$ : int [1:5153] 77 174 271 368 465 562 659 756 853 950 ...
  .. ..$ : int [1:5150] 44 141 238 335 432 529 626 723 820 917 ...
  .. ..$ : int [1:5150] 19 116 213 310 407 504 601 698 795 892 ...
  .. ..$ : int [1:5152] 78 175 272 369 466 563 660 757 854 951 ...
  .. ..$ : int [1:5150] 45 142 239 336 433 530 627 724 821 918 ...
  .. ..$ : int [1:5146] 8 105 202 299 396 493 590 687 784 881 ...
  .. ..$ : int [1:5147] 46 143 240 337 434 531 628 725 822 919 ...
  .. ..$ : int [1:5152] 79 176 273 370 467 564 661 758 855 952 ...
  .. ..$ : int [1:5149] 47 144 241 338 435 532 629 726 823 920 ...
  .. ..$ : int [1:5153] 80 177 274 371 468 565 662 759 856 953 ...
  .. ..$ : int [1:5153] 81 178 275 372 469 566 663 760 857 954 ...
  .. ..$ : int [1:5151] 82 179 276 373 470 567 664 761 858 955 ...
  .. ..$ : int [1:5149] 83 180 277 374 471 568 665 762 859 956 ...
  .. ..$ : int [1:5148] 84 181 278 375 472 569 666 763 860 957 ...
  .. ..$ : int [1:5149] 85 182 279 376 473 570 667 764 861 958 ...
  .. ..$ : int [1:5149] 86 183 280 377 474 571 668 765 862 959 ...
  .. ..$ : int [1:5149] 87 184 281 378 475 572 669 766 863 960 ...
  .. ..$ : int [1:5148] 48 145 242 339 436 533 630 727 824 921 ...
  .. ..$ : int [1:5150] 88 185 282 379 476 573 670 767 864 961 ...
  .. ..$ : int [1:5148] 7 104 201 298 395 492 589 686 783 880 ...
  .. ..$ : int [1:5147] 49 146 243 340 437 534 631 728 825 922 ...
  .. ..$ : int [1:5148] 50 147 244 341 438 535 632 729 826 923 ...
  .. ..$ : int [1:5147] 51 148 245 342 439 536 633 730 827 924 ...
  .. ..$ : int [1:5150] 89 186 283 380 477 574 671 768 865 962 ...
  .. ..$ : int [1:5150] 90 187 284 381 478 575 672 769 866 963 ...
  .. ..$ : int [1:5151] 91 188 285 382 479 576 673 770 867 964 ...
  .. ..$ : int [1:5151] 20 117 214 311 408 505 602 699 796 893 ...
  .. ..$ : int [1:5146] 9 106 203 300 397 494 591 688 785 882 ...
  .. ..$ : int [1:5147] 52 149 246 343 440 537 634 731 828 925 ...
  .. ..$ : int [1:5151] 92 189 286 383 480 577 674 771 868 965 ...
  .. ..$ : int [1:5150] 93 190 287 384 481 578 675 772 869 966 ...
  .. ..$ : int [1:5149] 94 191 288 385 482 579 676 773 870 967 ...
  .. ..$ : int [1:5151] 95 192 289 386 483 580 677 774 871 968 ...
  .. ..$ : int [1:5149] 96 193 290 387 484 581 678 775 872 969 ...
  .. ..$ : int [1:5150] 21 118 215 312 409 506 603 700 797 894 ...
  .. ..$ : int [1:5147] 53 150 247 344 441 538 635 732 829 926 ...
  .. ..$ : int [1:5149] 12 109 206 303 400 497 594 691 788 885 ...
  .. ..$ : int [1:5152] 22 119 216 313 410 507 604 701 798 895 ...
  .. ..$ : int [1:5151] 5 102 199 296 393 490 587 684 781 878 ...
  .. ..$ : int [1:5150] 13 110 207 304 401 498 595 692 789 886 ...
  .. ..$ : int [1:5153] 23 120 217 314 411 508 605 702 799 896 ...
  .. ..$ : int [1:5152] 97 194 291 388 485 582 679 776 873 970 ...
  .. ..$ : int [1:5149] 54 151 248 345 442 539 636 733 830 927 ...
  .. ..$ : int [1:5153] 4 101 198 295 392 489 586 683 780 877 ...
  .. ..$ : int [1:5149] 55 152 249 346 443 540 637 734 831 928 ...
  .. ..@ ptype: int(0) 
  ..- attr(*, ".drop")= logi TRUE

Summary of my data:

> summary(data)
      Date              company               MOM5              MOM10               MOM14                MA5                MA14         
 Min.   :2002-05-14   Length:499585      Min.   :-0.82979   Min.   :-0.836874   Min.   :-0.882682   Min.   :    0.17   Min.   :    0.18  
 1st Qu.:2007-05-02   Class :character   1st Qu.:-0.02230   1st Qu.:-0.033333   1st Qu.:-0.040056   1st Qu.:   14.46   1st Qu.:   14.44  
 Median :2012-04-20   Mode  :character   Median : 0.00000   Median : 0.000000   Median : 0.002491   Median :   43.84   Median :   43.77  
 Mean   :2012-04-21                      Mean   : 0.00234   Mean   : 0.005153   Mean   : 0.007369   Mean   :  258.77   Mean   :  259.21  
 3rd Qu.:2017-04-13                      3rd Qu.: 0.02354   3rd Qu.: 0.038375   3rd Qu.: 0.047985   3rd Qu.:   97.10   3rd Qu.:   97.05  
 Max.   :2022-04-01                      Max.   : 7.49944   Max.   : 8.338513   Max.   : 8.155604   Max.   :62792.45   Max.   :61989.93  
      MA30               EMA5              EMA14              EMA30               sd7               sd14              sd21        
 Min.   :    0.18   Min.   :    0.17   Min.   :    0.18   Min.   :    0.18   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
 1st Qu.:   14.45   1st Qu.:   14.46   1st Qu.:   14.45   1st Qu.:   14.48   1st Qu.:0.01182   1st Qu.:0.01347   1st Qu.:0.01424  
 Median :   43.70   Median :   43.84   Median :   43.79   Median :   43.79   Median :0.01756   Median :0.01885   Median :0.01944  
 Mean   :  260.07   Mean   :  258.78   Mean   :  259.23   Mean   :  260.12   Mean   :0.02250   Mean   :0.02356   Mean   :0.02405  
 3rd Qu.:   96.93   3rd Qu.:   97.12   3rd Qu.:   97.11   3rd Qu.:   96.86   3rd Qu.:0.02655   3rd Qu.:0.02734   3rd Qu.:0.02771  
 Max.   :60515.93   Max.   :62704.20   Max.   :61772.42   Max.   :60137.18   Max.   :1.71659   Max.   :1.21983   Max.   :0.99744  
      atr                 cci                macd               signal               rsi             volat         sign      
 Min.   :   0.0062   Min.   :-666.667   Min.   :-50.48194   Min.   :-47.95839   Min.   :  4.73   Min.   : 0.0000   0:216745  
 1st Qu.:   0.4433   1st Qu.: -80.621   1st Qu.: -1.31229   1st Qu.: -1.25103   1st Qu.: 42.60   1st Qu.: 0.2071   1:282840  
 Median :   1.3016   Median :   7.176   Median :  0.15714   Median :  0.16021   Median : 50.83   Median : 0.2902             
 Mean   :  10.1604   Mean   :   5.896   Mean   :  0.02922   Mean   :  0.02718   Mean   : 51.14   Mean   : 0.3580             
 3rd Qu.:   2.7147   3rd Qu.:  89.816   3rd Qu.:  1.57875   3rd Qu.:  1.52442   3rd Qu.: 59.62   3rd Qu.: 0.4202             
 Max.   :2749.0385   Max.   : 666.667   Max.   : 53.66875   Max.   : 42.20629   Max.   :100.00   Max.   :16.4871
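
One quick way to check whether any column could trigger the contrasts error is to count the distinct values in every character or factor column. The following is only a minimal sketch: `toy` is a hypothetical stand-in for the real `data` with made-up values, reduced to a single company, and `is_cat` and `n_levels` are illustrative names.

```r
# Toy stand-in for the real data (hypothetical values), one company only
toy <- data.frame(
  company = rep("VOLVO AB", 6),
  rsi     = c(25.4, 23.8, 23.8, 34.9, 24.8, 51.0),
  sign    = factor(c(1, 1, 0, 1, 0, 1), levels = c(0, 1)),
  stringsAsFactors = FALSE
)

# Columns that model.matrix() would treat as factors
is_cat <- vapply(toy, function(x) is.character(x) || is.factor(x), logical(1))

# Number of distinct non-missing values in each of those columns
n_levels <- vapply(toy[is_cat],
                   function(x) length(unique(x[!is.na(x)])),
                   integer(1))

# Columns with fewer than 2 distinct values cannot be given contrasts
names(n_levels)[n_levels < 2]
```

On this toy frame the check flags `company`, while `sign` passes with two levels. Running the same three steps on the real (subsetted) data would show whether a similar single-valued column is present.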
  • I'm not familiar with the caret package, but you don't appear to have any factors in your model. You've dropped `Date` and `company`. All your other columns are numeric, as shown in the output from `summary()`. – Limey Aug 02 '22 at 15:14
  • As you can see from the output of `str(data)`, `sign` is a factor with two levels. The summary shows the counts of 0 (decrease in stock price) and 1 (increase). I even converted it explicitly with `as.factor`. – hellberg30 Aug 02 '22 at 15:18
  • So you do. My apologies. But my point may still hold. You may need at least two factors if contrasts are to be constructed. In any case, the answers to [this question](https://stackoverflow.com/questions/44200195/how-to-debug-contrasts-can-be-applied-only-to-factors-with-2-or-more-levels-er) appear to give far more concrete advice than I ever could. – Limey Aug 02 '22 at 15:42
  • Thank you! I have looked at the answers to the post you refer to. However, the main focus of the solution seems to be NaNs as a possible cause of the error message, and I have no missing values in my dataset. That said, I was subsetting my data to include only one company in order to speed up the tuning process (I am just testing it out at the moment). It now seems to work when I don't subset the data. The code is still running, so it remains to be seen whether that is the root of my problem. – hellberg30 Aug 02 '22 at 16:26
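
The behaviour described in the last comment (the error appears only when the data is subset to one company) is consistent with how R builds a model frame. A term subtracted in a formula, such as `- company`, is dropped from the model terms, but the variable still appears to be evaluated into the model frame. There, a character column is converted to a factor and assigned contrasts, and a one-company subset makes it a single-level factor. The sketch below uses base R's `lm()` on a hypothetical toy frame (`toy`, with made-up values and a numeric response `y`) to stay self-contained. It assumes, rather than confirms, that caret's formula interface goes through the same `model.frame()` / `model.matrix()` path.

```r
# Toy one-company subset (hypothetical values, not the real data)
toy <- data.frame(
  company = rep("VOLVO AB", 6),
  rsi     = c(25.4, 23.8, 23.8, 34.9, 24.8, 51.0),
  macd    = c(-3.65, -3.85, -3.96, -3.79, -4.20, 0.16),
  y       = c(0.03, 0.04, -0.02, 0.03, -0.01, 0.02),
  stringsAsFactors = FALSE
)

# Subtracting company in the formula: the column still reaches the model frame
res <- tryCatch(lm(y ~ . - company, data = toy), error = function(e) e)
inherits(res, "error")   # TRUE when the contrasts error is raised

# Dropping the column from the data before fitting avoids the problem
toy_num <- toy[, setdiff(names(toy), "company")]
fit <- lm(y ~ ., data = toy_num)
names(coef(fit))
```

For the data in the question, the analogous change would be to remove `Date` and `company` from the data frame itself and fit `sign ~ .` on what remains. Because `data` is a `grouped_df` grouped by `company`, it would likely need `dplyr::ungroup()` first, since `dplyr::select()` keeps grouping columns on a grouped tibble.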
