I have been stuck on this error for a long time. I am trying to use the mda function from the mda package in R. After fitting the model on my data, I get the following error when I try to get predictions on the training data itself (the same error occurs when predicting on test data):

> Error in mindist[l] <- ndist[l]: NAs are not allowed in subscripted assignments

The error traces back to the predict method of the function.

I have not been able to find the root cause of the error despite extensive debugging. Here are some details about my setup and data:

  1. I am calling the function through the rpy2 interface, since I am working in Python and wanted to use this particular R function.
  2. The dataset is not very large: around 200 rows and 500-600 columns. I know the number of features exceeds the number of samples, but the problem seems to be independent of that. The error occurs in the predict function, not while training the model. I also tested another dataset with more rows than features and got the same error. Most importantly, I am using dimensionality reduction anyway.
  3. The error occurs only in one configuration: when I do NOT scale my input data before training. When I scale the data, the error disappears.
  4. The error is raised by the second line below, the call to predict:

    mda_clf = r['mda'](formula="Class~.", data=Dataset, method=r['polyreg'])

    training_pred = r['predict'](mda_clf, X_train)

  5. I have checked multiple times that my data does not contain any NAs. I printed the data before passing it to the predict function to inspect it manually, and I also used functions to check for NAs.

I am having a hard time understanding what the error means. My data doesn't contain any NA values, and the same function works when I scale the data. The model trains successfully in both cases.

Any help would be appreciated.

Arthur Morris
pranay25
    Preparing a working example will help: https://stackoverflow.com/help/minimal-reproducible-example, https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Arthur Morris Oct 21 '20 at 07:03

1 Answer

I wanted to post an update: I found the cause of the error. At some point in my script, X_train, which should have been a matrix of floating-point numbers, was mistakenly converted to strings. For example, a row of X_train that should have been [0.123, 0.456, 0.542] had become ['0.123', '0.456', '0.542']. So, in the line where the error occurred,

    training_pred = r['predict'](mda_clf, X_train)

the model could not interpret any of the values, since none of them were floats, and that produced the error.
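For anyone hitting the same message, here is a minimal sketch of how the problem described above could be detected and undone before calling predict. It assumes the data is held as a plain list of rows; the helper names non_float_positions and to_float_matrix are hypothetical and not part of rpy2 or mda.

```python
def non_float_positions(rows):
    """List (row, column, type name) for every entry that is not a Python float."""
    return [
        (i, j, type(value).__name__)
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if not isinstance(value, float)
    ]


def to_float_matrix(rows):
    """Return a copy of `rows` with every entry converted to float.

    Raises ValueError naming the position of any entry that cannot be parsed.
    """
    converted = []
    for i, row in enumerate(rows):
        new_row = []
        for j, value in enumerate(row):
            try:
                new_row.append(float(value))
            except (TypeError, ValueError):
                raise ValueError(
                    f"non-numeric entry at row {i}, column {j}: {value!r}"
                )
        converted.append(new_row)
    return converted


# Hypothetical data in the accidental string form described above.
X_train = [["0.123", "0.456", "0.542"]]

print(non_float_positions(X_train))  # reports every string entry
X_train = to_float_matrix(X_train)   # numeric again; safe to hand on to predict
print(X_train)
```

If X_train is a NumPy array instead, X_train.astype(float) performs the same conversion. Checking the element type, and not only looking for NAs, is what would have caught this earlier, because printed strings such as '0.123' look almost identical to the numbers they stand for.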

pranay25