
I'm trying to pass caret some training data where y is an n x 1 matrix of continuous data. Calling typeof(dfm_y1_train) confirms that it is of type double.

This is the code I'm using:

ctrl <- trainControl(
  method = "repeatedcv",
  number = 20,
  repeats = 3,
  allowParallel = TRUE,
  search = "random",
  verbose = TRUE
)

rf_base <- train(
  x = dfm_X1_train,
  y = dfm_y1_train,
  method = "rf",
  # tuneGrid = tune_grid,
  tuneLength = 20,
  trControl = ctrl,
  ntree = 1000  # randomForest's argument name; num.trees is the ranger equivalent
)

How can I encourage / convince / force caret to apply regression using a Random Forest?

I also tried using method = "ranger", as suggested in the question "Random Forest Regression using Caret", but had the same issue.

[Edit] As requested, some more details and data.

dfm_X1_train:

Note: I anonymised the column names. The t_x columns are uni-grams generated from the "documents".

Document-feature matrix of: 90,264 documents, 2,144 features (99.74% sparse) and 3 docvars.
        features
docs      t_1 t_2    t_3 t_4  t_5     t_6   t_7     t_7 t_8 t_9
  112784    0   0      0   0    0       0     0       0   0   0
  312095    0   0      0   0    0       0     0       0   0   0
  217494    0   0      0   0    0       0     0       0   0   0
  225811    0   0      0   0    0       0     0       0   0   0
  342907    0   0      0   0    0       0     0       0   0   0
  359949    1   1      0   0    0       0     0       0   0   0
[ reached max_ndoc ... 90,258 more documents, reached max_nfeat ... 2,134 more features ]

dfm_y1_train

A matrix: 6 × 1 of type dbl
log_price
1.50851199
3.66356165
3.13331794
2.56494936
-0.01005034
2.99573227
azymandius
  • Is it possible that you share the first six rows of your data, x and y? We can help you better with some sample data. – Alexis Jun 18 '21 at 17:05
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input that can be used to test and verify possible solutions. – MrFlick Jun 18 '21 at 17:25
  • @Alexis, MrFlick: hope this helps? – azymandius Jun 19 '21 at 11:21
  • If you read the help page, https://www.rdocumentation.org/packages/caret/versions/4.47/topics/train, y should be a vector, so in your case `dfm_y1_train = as.numeric(dfm_y1_train)` should work. – StupidWolf Jun 19 '21 at 12:07
  • It should be *me* with the word stupid before my username... thanks heaps! – azymandius Jun 19 '21 at 14:37
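
For anyone landing here later, the fix suggested in the comments can be sketched in base R as follows. The object `y_mat` is a hypothetical stand-in for `dfm_y1_train`, built from the six values shown in the question; the point is only that `typeof()` reports the storage type, not the shape, so a one-column matrix still has to be flattened into a plain vector.

```r
# Hypothetical stand-in for the n x 1 response matrix from the question.
y_mat <- matrix(
  c(1.50851199, 3.66356165, 3.13331794,
    2.56494936, -0.01005034, 2.99573227),
  ncol = 1,
  dimnames = list(NULL, "log_price")
)

typeof(y_mat)     # storage type only; does not reveal the matrix shape
is.matrix(y_mat)  # TRUE: still a two-dimensional object

# as.numeric() strips the dim attribute, leaving a plain numeric vector.
y_vec <- as.numeric(y_mat)

is.vector(y_vec)  # TRUE
length(y_vec)     # 6
```

The flattened vector is then what gets passed as `y` in the `train()` call from the question, in place of the matrix.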
