7

I know this ValueError question has been asked many times, but I am still struggling to find an answer because my code uses inverse_transform.

Say I have an array a

a.shape
> (100,20)

and another array b

b.shape
> (100,3)

I concatenated them with np.concatenate:

hat = np.concatenate((a, b), axis=1)

Now the shape of hat is

hat.shape    
(100,23)

After this, I tried to do this,

inversed_hat = scaler.inverse_transform(hat)

When I do this, I am getting an error:

ValueError: operands could not be broadcast together with shapes (100,23) (25,) (100,23)

Is this broadcast error coming from inverse_transform? Any suggestions would be helpful. Thanks in advance!

o-90

3 Answers

9

Although you didn't specify, I'm assuming you are using inverse_transform() from one of scikit-learn's scalers, such as StandardScaler or MinMaxScaler (the example below uses MinMaxScaler). You need to fit the scaler to the data first.

import numpy as np
from sklearn.preprocessing import MinMaxScaler


In [1]: arr_a = np.random.randn(5*3).reshape((5, 3))

In [2]: arr_b = np.random.randn(5*2).reshape((5, 2))

In [3]: arr = np.concatenate((arr_a, arr_b), axis=1)

In [4]: scaler = MinMaxScaler(feature_range=(0, 1)).fit(arr)

In [5]: scaler.inverse_transform(arr)
Out[5]:
array([[ 0.19981115,  0.34855509, -1.02999482, -1.61848816, -0.26005923],
       [-0.81813499,  0.09873672,  1.53824716, -0.61643731, -0.70210801],
       [-0.45077786,  0.31584348,  0.98219019, -1.51364126,  0.69791054],
       [ 0.43664741, -0.16763207, -0.26148908, -2.13395823,  0.48079204],
       [-0.37367434, -0.16067958, -3.20451107, -0.76465428,  1.09761543]])

In [6]: new_arr = scaler.inverse_transform(arr)

In [7]: new_arr.shape == arr.shape
Out[7]: True
o-90
  • thank you for your response, I know, I should have mentioned, I used `MinMaxScaler`. For example: `scaler = MinMaxScaler(feature_range=(0, 1))`. –  Aug 23 '17 at 19:01
  • I tried your answer. It works when I use `fit`, but when I use `fit_transform` it gives an error: `AttributeError: 'numpy.ndarray' object has no attribute 'inverse_transform'`. Do you know why this is happening? I am searching for more about this. –  Aug 23 '17 at 19:12
  • Yes, `fit_transform()` returns a dataset, while `fit()` returns an object on which you can call other methods. If you read the [docs](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html) you can see that `fit()` returns the fitted scaler itself, whereas `fit_transform()` returns a numpy array. – o-90 Aug 23 '17 at 19:14
  • thank you! That is helpful! I am having a similar issue like this https://datascience.stackexchange.com/questions/22488/value-error-operands-could-not-be-broadcast-together-with-shapes-lstm and this one https://tutel.me/c/programming/questions/42997228/lstmkeras+error+valueerror+nonbroadcastable+output+operand+with+shape+677041+doesn39t+match+the+broadcast+shape+6770412 –  Aug 23 '17 at 19:17
  • @Jesse Ya, in both of those questions you need to do something like `scaler = MinMaxScaler().fit(dataset)`, then scale your dataset with `scaled_data = scaler.transform(dataset)`, and at the end, when you want to do an inverse_transform, call `scaler.inverse_transform(inv_yhat)` – o-90 Aug 23 '17 at 19:23
  • Why would you use an array with the same shape as the input for the inverse transformation? It should have as many columns as the number of features selected. This does not make sense at all. Inverse transform should project back from PCA space to original space... – Soerendip Feb 25 '21 at 22:09
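The difference between `fit()` and `fit_transform()` discussed in the comments can be sketched as follows. This is a minimal illustration on made-up random data, not the asker's code; the variable names (`data`, `not_a_scaler`, and so on) are assumptions chosen for the example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up data for illustration only.
rng = np.random.default_rng(0)
data = rng.normal(size=(5, 3))

# fit() returns the fitted scaler itself, so the call can be chained.
scaler = MinMaxScaler(feature_range=(0, 1)).fit(data)
scaled = scaler.transform(data)

# fit_transform() returns the scaled ndarray, not the scaler, so the
# result has no inverse_transform method (hence the AttributeError).
not_a_scaler = MinMaxScaler(feature_range=(0, 1)).fit_transform(data)
has_inverse = hasattr(not_a_scaler, "inverse_transform")

# Round trip: inverse_transform on the scaled data recovers the input.
restored = scaler.inverse_transform(scaled)
round_trip_ok = np.allclose(restored, data)
```

Keeping a reference to the scaler object (rather than only the array returned by `fit_transform()`) is what makes the later `inverse_transform` call possible.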
0

The problem here is that the scaler holds the statistics of your 25-column df, but your df now has only 23 columns, so inverse_transform cannot be applied to it.

To fix the problem, fit the scaler on the original 23-column dataframe, and then call inverse_transform on your desired 23-column dataframe.

More info: the scaler object keeps track of the information needed to perform the inverse transformation. When you fit a scaler to a dataset using the fit() method, the scaler computes per-column statistics of the data (such as mean and variance for StandardScaler, or minimum and maximum for MinMaxScaler) and stores them in its internal state.
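A rough sketch of this, assuming a MinMaxScaler and made-up data: the stored statistics have one entry per fitted column, which is where the `(25,)` in the error message comes from. The arrays `full` and `hat`, and the choice of the first 23 columns, are hypothetical; in practice you would pick whichever 23 columns `hat` actually corresponds to.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
full = rng.normal(size=(100, 25))   # hypothetical 25-column training data
hat = rng.uniform(size=(100, 23))   # 23-column array, as in the question

# The scaler stores one minimum/maximum per fitted column.
scaler_25 = MinMaxScaler().fit(full)
stats_shape = scaler_25.data_min_.shape

# Calling inverse_transform with a different column count fails.
try:
    scaler_25.inverse_transform(hat)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# One possible fix: fit a scaler on the matching 23 columns only.
# Assumption: the first 23 columns are the ones `hat` corresponds to.
scaler_23 = MinMaxScaler().fit(full[:, :23])
inversed_hat = scaler_23.inverse_transform(hat)
```

Here `stats_shape` is `(25,)`, and the second scaler inverts the 23-column array without a shape error.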

Seb
-1

It seems you are using a pre-fitted scaler object from sklearn.preprocessing. If so, my guess is that the data used for fitting had shape (x, 25), whereas the data you are passing now has shape (x, 23), and that's why you are getting this error.
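One way to confirm this before calling inverse_transform is to compare the column count the scaler was fitted on with the array's column count. The helper below, `check_feature_count`, is hypothetical (it is not part of scikit-learn); it reads the length of the fitted `scale_` attribute, which scalers such as MinMaxScaler and StandardScaler expose after fitting.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def check_feature_count(scaler, X):
    """Hypothetical helper: raise a clearer error on a column mismatch."""
    expected = scaler.scale_.shape[0]   # columns seen during fit
    actual = np.asarray(X).shape[1]     # columns in the array to invert
    if expected != actual:
        raise ValueError(
            f"scaler was fitted on {expected} columns, "
            f"but the array has {actual}"
        )
    return expected


# Made-up data: a scaler fitted on 25 columns, an array with 23.
scaler = MinMaxScaler().fit(np.arange(100, dtype=float).reshape(4, 25))
hat = np.zeros((100, 23))

try:
    check_feature_count(scaler, hat)
    message = ""
except ValueError as err:
    message = str(err)
```

With the mismatch from the question, `message` names both column counts, which is easier to read than the broadcast error.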

vipin bansal