I am trying to implement an example of the external-memory version of xgboost in R. Please see paragraph 3 of this post:

https://www.r-bloggers.com/2017/01/parallel-computation-with-r-and-xgboost/

I downloaded the data file agaricus.txt.train from the link provided there:

https://github.com/dmlc/xgboost/tree/master/demo/data

The example runs fine once I replace the filename with the path to my local copy of the file:

dtrain = xgb.DMatrix('C:/test/agaricus.txt.train.txt#train.cache')

Next, I would like to replace this data with my own data, which is stored in a data frame.

Do I understand correctly that I need to convert my own data frame to LIBSVM format first? If so, I could try a converter like this one:

R - convert a data frame to a data set formatted as featureName:featureValue
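
A minimal sketch of such a conversion in base R, for illustration only. It assumes every feature column is numeric, that zero values can be omitted (the usual sparse convention), and that each output line has the form `<label> <index>:<value> ...`. The function name `df_to_libsvm` and the `index_base` parameter are hypothetical and not part of xgboost; whether the reader on the other side expects 0-based or 1-based feature indices is an assumption to check against the xgboost version in use.

```r
# Hypothetical helper (not part of xgboost): write a numeric data frame
# to a LIBSVM-style text file, one row per line.
# Assumptions: all columns are numeric, zeros are skipped, and feature
# indices start at `index_base`.
df_to_libsvm <- function(df, label, file, index_base = 1L) {
  x <- as.matrix(df)
  stopifnot(is.numeric(x), length(label) == nrow(x))
  lines <- vapply(seq_len(nrow(x)), function(i) {
    nz <- which(x[i, ] != 0)                 # positions of non-zero features
    feats <- if (length(nz) > 0) {
      paste0(nz + index_base - 1L, ":", x[i, nz])
    } else {
      character(0)                           # row without non-zero features
    }
    paste(c(label[i], feats), collapse = " ")
  }, character(1))
  writeLines(lines, file)
  invisible(lines)
}

# Toy data, made up for illustration
toy <- data.frame(a = c(1, 0, 2.5), b = c(0, 0, 1), c = c(3, 0, 0))
y <- c(1, 0, 1)
out <- tempfile(fileext = ".libsvm")
libsvm_lines <- df_to_libsvm(toy, y, out)
```

Categorical columns would first need to be turned into numeric indicator columns (for example with `model.matrix`) before a helper like this can be applied.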

PS: As a test, it would be ideal if I could convert the data below (which is not in LIBSVM format) into LIBSVM format.

data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
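
Since `agaricus.train` is a list holding a sparse `dgCMatrix` in `$data` and a label vector in `$label`, a conversion could go through the matrix's triplet form. The following is a rough sketch under those assumptions, using only the `Matrix` package; `sparse_to_libsvm` and `index_base` are hypothetical names, and the index base expected by the reader is again an assumption to verify.

```r
library(Matrix)

# Hypothetical helper (not part of xgboost): write a sparse matrix plus
# labels as LIBSVM-style lines "<label> <index>:<value> ...".
sparse_to_libsvm <- function(m, label, file, index_base = 1L) {
  stopifnot(nrow(m) == length(label))
  trip <- summary(m)                 # triplet form: columns i, j, x
  keep <- trip$x != 0                # drop explicitly stored zeros
  i <- trip$i[keep]; j <- trip$j[keep]; x <- trip$x[keep]
  per_row <- character(nrow(m))      # "" for rows without entries
  if (length(i) > 0) {
    o <- order(i, j)                 # sort by row, then by feature index
    feats <- paste0(j[o] + index_base - 1L, ":", x[o])
    grouped <- tapply(feats, i[o], paste, collapse = " ")
    per_row[as.integer(names(grouped))] <- grouped
  }
  lines <- trimws(paste(label, per_row))
  writeLines(lines, file)
  invisible(lines)
}

# Toy sparse matrix, made up for illustration
m <- sparseMatrix(i = c(1, 1, 3, 3), j = c(1, 3, 1, 2),
                  x = c(1, 3, 2.5, 1), dims = c(3, 3))
out_sparse <- tempfile(fileext = ".libsvm")
sparse_lines <- sparse_to_libsvm(m, c(1, 0, 1), out_sparse)
```

With the data loaded above, a call such as `sparse_to_libsvm(train$data, train$label, "agaricus_train.libsvm")` would be the corresponding test; the output filename here is only an example.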
user2165379
  • The [`dgCMatrix-class`](https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/dgCMatrix-class.html-class) is a class for sparse matrices defined in base package `Matrix`. See also, for instance, [this R-bloggers post](https://www.r-bloggers.com/2020/03/what-is-a-dgcmatrix-object-made-of-sparse-matrix-format-in-r/). – Rui Barradas Sep 24 '22 at 12:06
  • @Rui Barradas Thank you. It seems my conclusion that the files from the download link are in dgCMatrix format was wrong. If I am correct, the files are in LIBSVM format. Can you please confirm this? In that case I should convert my own data from a data frame into LIBSVM rather than into a dgCMatrix. – user2165379 Sep 26 '22 at 12:09
  • Can [this post](https://stats.stackexchange.com/questions/6755/reading-in-svm-files-in-r-libsvm) help? You should read the libsvm file and create an R object of class dgCMatrix to work with in an R session. – Rui Barradas Sep 26 '22 at 15:34
  • @Rui Barradas Thank you for your help. This is indeed a good post. However, the overall setup becomes too complicated for me and leaves too much room for error, so I decided to buy a new GPU with more RAM. Thanks a lot! – user2165379 Sep 28 '22 at 11:07
