48

On a fresh installation of Anaconda under Ubuntu... I am preprocessing my data in various ways prior to a classification task using Scikit-Learn.

from sklearn import preprocessing

scaler = preprocessing.MinMaxScaler().fit(train)
train = scaler.transform(train)    
test = scaler.transform(test)

This all works fine, but when I have a new sample (temp below) that I want to classify, and thus want to preprocess in the same way, I run:

temp = [1,2,3,4,5,5,6,....................,7]
temp = scaler.transform(temp)

Then I get a deprecation warning...

DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 
and will raise ValueError in 0.19. Reshape your data either using 
X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1)
if it contains a single sample. 

So the question is how should I be rescaling a single sample like this?

I suppose an alternative (not a very good one) would be...

temp = [temp, temp]
temp = scaler.transform(temp)
temp = temp[0]

But I'm sure there are better ways.

Owen
Chris Arthur
  • 3
    Well... you just answered yourself. It's in the warning: `Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.` If your data is not a numpy array then use np.array(data) first. – pzelasko Jan 29 '16 at 10:28

8 Answers

51

Just listen to what the warning is telling you:

Reshape your data either using X.reshape(-1, 1) if your data has a single feature/column or X.reshape(1, -1) if it contains a single sample.

For your example (a single sample with more than one feature/column), convert the list to a NumPy array first, since a plain list has no reshape method:

import numpy as np
temp = np.array(temp).reshape(1, -1)

For one feature/column:

temp = np.array(temp).reshape(-1, 1)
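
A minimal sketch of the single-sample case from the question, assuming a MinMaxScaler fitted on a small made-up training matrix (the `train` values and the new sample are invented for illustration):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data: 3 samples, 4 features.
train = np.array([[0.0, 10.0, 100.0, 1.0],
                  [5.0, 20.0, 200.0, 2.0],
                  [10.0, 30.0, 300.0, 3.0]])
scaler = MinMaxScaler().fit(train)

# One new sample with the same 4 features, given as a flat list.
temp = [5.0, 10.0, 300.0, 2.0]

# reshape(1, -1): one row (sample), as many columns (features) as needed.
temp_2d = np.array(temp).reshape(1, -1)
scaled = scaler.transform(temp_2d)

# Flatten again if the rest of the code expects a 1D vector.
scaled_1d = scaled.ravel()
```

The scaler returns a 2D array of shape (1, 4), so `ravel()` is only needed when a flat vector is wanted afterwards.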
Prometheus
Mike
  • 2
    I don't understand what they mean by a single sample. X.shape returns (891, 158). Either of the 2 solutions they propose gives an error, but I still get that warning if I don't reshape it. – Claudiu Creanga May 12 '17 at 21:21
  • 4
    Don't reshape X! It's about y (target)! y obviously has only one column. I had the same problem some days ago. It cost me an hour trying to reshape X ;-) – Florian H Jul 12 '17 at 13:45
  • So... I wonder how many projects will break the day this finally becomes an error. – sudo Jul 19 '17 at 23:59
  • Except that doing as it says (or supplying any valid input whatsoever, as far as I can tell) does not make the warning go away. – hwrd Sep 01 '17 at 16:00
  • I appreciated seeing this response. I received the similar error, but being new to python and numpy, the solution wasn't obvious. – joe5 Nov 29 '18 at 01:54
  • 1
    It's kind of sad that the sklearn documentation doesn't give any explanation for this – Alex W Oct 31 '19 at 17:04
  • Predict a single sample with one feature: `X_new = np.array([13]).reshape(-1, 1)`, then `model.predict(X_new)` – Max Kleiner Dec 30 '19 at 22:12
34

Well, it actually looks like the warning is telling you what to do.

scikit-learn estimators and pipeline stages share a uniform interface, so as a rule of thumb:

  • when you see X, it should be an np.array with two dimensions

  • when you see y, it should be an np.array with a single dimension.

Here, therefore, you should consider the following:

import numpy as np

temp = [1,2,3,4,5,5,6,....................,7]
# A single sample with many features: make it a 2d array with one row
temp = np.array(temp).reshape(1, -1)
temp = scaler.transform(temp)
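
The rule of thumb above can be sketched as follows. This is an illustration only: the toy data and the choice of LogisticRegression are assumptions, not part of the question.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: 4 samples, 2 features.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])  # 2D, shape (4, 2)
y = np.array([0, 0, 1, 1])                                       # 1D, shape (4,)

clf = LogisticRegression().fit(X, y)

# A single new sample with several features must still be 2D: shape (1, 2), not (2,).
new_sample = np.array([3.0, 3.5]).reshape(1, -1)
pred = clf.predict(new_sample)  # 1D array holding one predicted label
```

The same shape convention applies to `transform`, `predict` and `fit` alike: rows are samples, columns are features.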
Ami Tavory
9

Wrapping the sample in an extra pair of brackets might help, since that makes it a 2D list with a single row:

temp = [[1,2,3,4,5,6,.....,7]]
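
As a rough check that the extra brackets have the same effect as reshaping, here is a small NumPy-only sketch (the four-element list is made up for illustration):

```python
import numpy as np

temp = [1, 2, 3, 4]      # flat list, converts to shape (4,)
wrapped = [temp]         # list holding one row, converts to shape (1, 4)

a = np.array(wrapped)
b = np.array(temp).reshape(1, -1)
c = np.atleast_2d(temp)  # NumPy helper that also yields a single row
```

All three arrays have shape (1, 4) and hold the same values, so any of them can be passed where a single sample is expected.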
Bharath M Shetty
3

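.values.reshape(-1,1) will be accepted without alerts/warnings (.values is the pandas attribute that returns the underlying NumPy array).

.reshape(-1,1) called directly on the pandas object will be accepted, but with a deprecation warning.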
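
A small sketch of the pandas case, assuming a made-up DataFrame with one column named `x_1`. It uses `to_numpy()`, which returns the same kind of array as `.values`:

```python
import pandas as pd

# Hypothetical data frame with a single feature column.
df = pd.DataFrame({"x_1": [1.0, 2.0, 3.0]})

col = df["x_1"]                            # Series: 1D, shape (3,)
as_column = col.to_numpy().reshape(-1, 1)  # ndarray: 2D, shape (3, 1)
frame = df[["x_1"]]                        # DataFrame: already 2D, shape (3, 1)
```

Selecting with double brackets keeps the data two-dimensional, so in that case no reshape is needed at all.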

Analytics
0

I faced the same issue and got the same deprecation warning. I was using a NumPy array of shape [23, 276] when I got the message. I tried reshaping it as the warning suggests and got nowhere. Then I selected each row from the array (as I was iterating over it anyway) and appended it to a list. It then worked without any warning.

array = []
array.append(temp[0])

Then you can use the Python list object (here 'array') as input to scikit-learn functions. A list holding one row is treated as a 2D input with a single sample. Not the most efficient solution, but it worked for me.
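
To illustrate why the list approach works, and one possible alternative when iterating over rows, here is a NumPy-only sketch on a made-up 3x4 array:

```python
import numpy as np

# Hypothetical array: 3 samples, 4 features.
data = np.arange(12.0).reshape(3, 4)

row_1d = data[0]      # integer index drops a dimension: shape (4,)
row_2d = data[0:1]    # slice keeps the row dimension: shape (1, 4)
row_list = [data[0]]  # the list approach: one row inside a list
```

Converting `row_list` with `np.asarray` gives shape (1, 4), the same as the slice, which is why it no longer triggers the 1D warning.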

shan89
0

You can always reshape like this:

import numpy as np
temp = np.array([1,2,3,4,5,5,6,7])

temp = temp.reshape(len(temp), 1)

The main issue is that temp.shape is (8,)

and you need (8,1)

vimuth
0

-1 stands for an unknown dimension, which NumPy infers from the length of the array and the remaining dimensions. Read more about the "newshape" parameter in the numpy.reshape documentation.

# X is a 1-d ndarray

# If we want a COLUMN vector (many/one/unknown samples, 1 feature)
X = X.reshape(-1, 1)

# If we want a ROW vector (one sample, many/one/unknown features)
X = X.reshape(1, -1)
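
A short sketch of how -1 is resolved, using a made-up six-element array:

```python
import numpy as np

X = np.arange(6)         # shape (6,)

col = X.reshape(-1, 1)   # -1 is inferred as 6, giving shape (6, 1)
row = X.reshape(1, -1)   # -1 is inferred as 6, giving shape (1, 6)
grid = X.reshape(2, -1)  # -1 is inferred as 3, giving shape (2, 3)
```

Only one dimension may be -1 in a single call, since NumPy needs the others to work out the missing size.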
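0

import pandas as pd
from sklearn.linear_model import LinearRegression

# df is assumed to be an existing DataFrame with columns 'x_1' and 'target'
X = df[['x_1']]                # DataFrame, already 2D
X_n = X.values.reshape(-1, 1)  # ndarray with shape (n_samples, 1)
y = df['target']
y_n = y.values                 # 1D ndarray with shape (n_samples,)
model = LinearRegression()
model.fit(X_n, y_n)

y_pred = pd.Series(model.predict(X_n), index=X.index)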
gregor256
  • 3
    When answering such an old question, with a highly upvoted accepted answer, please take the time to explain what your new answer is adding to the topic. Please edit your answer to explain how it is better/new/improved. – joanis Jul 11 '22 at 22:08