
I have a NumPy array of shape (5380, 1071).

I am trying to scale its values using:

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train =  scaler.fit_transform(X_train)

But I get this error:

ValueError: Input contains infinity or a value too large for dtype('float64').

I checked the maximum value using `print(X_train.argmax())` and the output is just 1027.

This array does not contain any NaN or infinity values.

This is not a duplicate of

sklearn error ValueError: Input contains NaN, infinity or a value too large for dtype('float64')

The two errors are different.
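
For reference, a minimal sketch of how such a check can be done (assuming `X_train` is a plain float64 NumPy array; `argmax` only returns an index, not a value):

import numpy as np

# argmax returns the flattened index of the largest element, not its value
print(X_train.argmax())

# these report the actual extremes and whether any entry is NaN or infinite
print(X_train.min(), X_train.max())
print(np.isnan(X_train).any(), np.isinf(X_train).any())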

  • I'd say run this `pd.isnull(X_train).sum() > 0` just to make sure no columns are showing some "hidden" nan or inf – chitown88 Dec 29 '18 at 10:56
  • @chitown88 False for all columns – John Doe Dec 29 '18 at 11:00
  • 1
    What output of print(X_train.max()) ? – Farseer Dec 29 '18 at 11:00
  • 1
    Note that `argmax` gives the INDEX of the maximum value, not the maximum value itself. Try `X_train.max()` to find the maximum value – Jondiedoop Dec 29 '18 at 11:01
  • ok, next I'd check to see what the data types are for all columns. `print (X_train.dtypes)` – chitown88 Dec 29 '18 at 11:02
  • aahh, the output of `X_train.max()` is `inf`. Feeling so embarrassed. is it also possible to find the column and row id of the `inf` value? – John Doe Dec 29 '18 at 11:04
  • I'm having the same issue. My max is 20434.95 using `df[cols].max().max()`. – Echochi Jun 17 '21 at 18:57
  • Max didn't show the infinite values, but a check with `np.isinf(df[cols]).any()` finally revealed the culprits. I'm using [this thread](https://stackoverflow.com/questions/17477979/dropping-infinite-values-from-dataframes-in-pandas) to find and replace them. – Echochi Jun 17 '21 at 19:08
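
Pulling the comment thread together: `X_train.max()` being `inf` means the array does contain infinite entries, and `np.isinf` can locate them. A minimal sketch of finding and handling those entries before scaling (the mean-imputation step at the end is only one possible choice, not something the thread prescribes):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# row/column positions of every infinite entry
bad_positions = np.argwhere(np.isinf(X_train))
print(bad_positions)
print(np.unique(bad_positions[:, 1]))   # the affected columns

# one possible fix: replace +/-inf with NaN, then impute with column means
X_clean = np.where(np.isinf(X_train), np.nan, X_train)
col_means = np.nanmean(X_clean, axis=0)
X_clean = np.where(np.isnan(X_clean), col_means, X_clean)

X_scaled = MinMaxScaler().fit_transform(X_clean)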
