
I computed an array of shape (800, 1140). The ndarray was generated in a previous step, with its elements stacked using hstack. I need to pass it to scikit-learn for training, but I get the following error:

ValueError: Found array with dim 1140. Expected 800

I think my error might be similar to this but I do not know how to proceed.

Can someone give me pointers? Here is the code that causes the error; it is raised by the statement that starts with XTrain:

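# Imports inferred from the names used below (older scikit-learn API,
# where sklearn.cross_validation and Imputer still exist)
from sklearn import preprocessing, svm
from sklearn import cross_validation as cv
from sklearn.preprocessing import Imputer

X_scaled = preprocessing.scale(self.featureMatrix)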
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
X_scaled = imp.fit_transform(X_scaled)
classiFier = svm.SVC(C=10, cache_size=1500, class_weight=None, coef0=0.0, degree=3, gamma=0.0, kernel='rbf', max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001, verbose=False)
XTrain, XTest, yTrain, yTest = cv.train_test_split(X_scaled,
                                                       self.classID,
                                                       test_size=0.4,
                                                       random_state=0)

Here is the entire Traceback:

Traceback (most recent call last):
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.3\helpers\pydev\pydevd.py", line 2358, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.3\helpers\pydev\pydevd.py", line 1778, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/vaidvj/svn/idmt/core/wrappers/afp/test/data/Urban_Sound_DB/train&testSVM_MF.py", line 133, in <module>
    c.process()
  File "C:/Users/vaidvj/svn/idmt/core/wrappers/afp/test/data/Urban_Sound_DB/train&testSVM_MF.py", line 129, in process
    self.confMatcal( self.MfeatureMatrix, self.classID, self.uniqueClassLabels)
  File "C:/Users/vaidvj/svn/idmt/core/wrappers/afp/test/data/Urban_Sound_DB/train&testSVM_MF.py", line 49, in confMatcal
    random_state=0)
  File "C:\Anaconda3\lib\site-packages\sklearn\cross_validation.py", line 1556, in train_test_split
    arrays = check_arrays(*arrays, **options)
  File "C:\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 254, in check_arrays
    % (size, n_samples))
ValueError: Found array with dim 1140. Expected 800

Thank you.

Vyas
  • Can you post the code that generates the error? It is difficult for people to reverse-engineer code from errors. – EdChum Sep 17 '15 at 08:24
  • I will edit the main post. – Vyas Sep 17 '15 at 08:35
  • Can you post Xscaled and classID respective shapes ? (using Xscaled.shape, etc) – P. Camilleri Sep 17 '15 at 08:35
  • 1. I did transpose the X_scaled. 2. The shape of the array before I scaled was 800*1140, and now I see that scaling caused it to change shape to 800*1127. The classID is of length 1140. – Vyas Sep 17 '15 at 08:47
  • First of all, shapes are not `xx*yy` but `(xx, yy)`. Then I do not see how 1140 got changed to 1127. X_scaled should be of shape `(1140, 800)`, which you can obtain by transposing your original X_scaled as I suggested: `X_scaled = np.transpose(X_scaled)` – P. Camilleri Sep 17 '15 at 08:50
  • @M.Massias: I added some more code to give you some info. Yes, I know that shapes are (xx,yy). It was a typo, I am sorry. – Vyas Sep 17 '15 at 09:01
  • I'm sorry, I don't understand how you went from 1140 observations to 1127. – P. Camilleri Sep 17 '15 at 09:06
  • After imputation, it changes. I do not know why! – Vyas Sep 17 '15 at 09:10
  • But this is not related to your original question. – P. Camilleri Sep 17 '15 at 09:14
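
One possible explanation for the drop from 1140 to 1127 columns, offered as an assumption rather than something confirmed in this thread: scikit-learn's Imputer with axis=0 discards columns that contained only missing values when it was fitted, because no mean can be computed for them. The sketch below uses NumPy alone and a small made-up matrix to show how such columns can be located before imputation; the variable names are hypothetical.

```python
import numpy as np

# Made-up 3x3 matrix standing in for the feature matrix; column 1 has no observed value.
X = np.array([[1.0, np.nan, 3.0],
              [4.0, np.nan, 6.0],
              [np.nan, np.nan, 9.0]])

# True for every column in which all entries are NaN.
all_nan_cols = np.isnan(X).all(axis=0)

# Indices of the columns a mean imputer could not fill.
print(np.flatnonzero(all_nan_cols))

# Number of columns that would survive if those columns were dropped.
n_kept = int((~all_nan_cols).sum())
print(n_kept)
```

If the real matrix has 13 such columns, that would account for 1140 becoming 1127. Because the matrix in the question is laid out as (800, 1140), those columns would correspond to samples, so the labels in classID would need the same entries removed.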

1 Answer


You should probably just transpose your data, using numpy.transpose() or yourArray.T. scikit-learn expects an array of shape (n_samples, n_features), where n_samples is the number of observations and n_features is the dimension of the space they live in. The label array must then have length n_samples.

See the documentation of np.transpose() for examples.
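
As a minimal sketch of the shape requirement, assuming random stand-in data with the sizes from the question (the helper name as_samples_by_features and the variables feature_matrix and class_ids are hypothetical), the following checks which axis matches the number of labels and transposes only when needed:

```python
import numpy as np

# Random stand-ins for self.featureMatrix and self.classID (sizes taken from the question).
n_samples, n_features = 1140, 800
feature_matrix = np.random.rand(n_features, n_samples)  # shape (800, 1140)
class_ids = np.random.randint(0, 10, size=n_samples)    # shape (1140,)


def as_samples_by_features(X, y):
    """Return X laid out as (n_samples, n_features), transposing if necessary."""
    X = np.asarray(X)
    y = np.asarray(y)
    if X.ndim != 2:
        raise ValueError("X must be 2-D, got %d dimension(s)" % X.ndim)
    if X.shape[0] == len(y):
        return X        # rows already correspond to labels
    if X.shape[1] == len(y):
        return X.T      # columns correspond to labels, so transpose
    raise ValueError(
        "Neither axis of X %s matches the %d labels" % (X.shape, len(y))
    )


X = as_samples_by_features(feature_matrix, class_ids)
print(X.shape)  # (1140, 800)
```

The resulting X and class_ids have the same first dimension, which is what train_test_split checks before splitting. If neither axis matches, as with an (800, 1127) matrix and 1140 labels, the helper raises instead of guessing.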

P. Camilleri
  • Using the transpose throws an error: ValueError: Found array with dim 800. Expected 1140 – Vyas Sep 17 '15 at 08:29
  • @VyasrajVaidya just try feeding `train_test_split()` two arrays, the first one of shape `(1140, 800)`, the second of shape `(1140,)` – P. Camilleri Sep 17 '15 at 09:25