I'm trying to scale features with the following function:

import numpy as np

def featureNormalize(X):
    '''
    This function takes the features as input and
    returns the normalized values, the mean, as well
    as the standard deviation for each feature.
    '''
    mu = np.mean(X)       # mean of the features
    sigma = np.std(X)     # standard deviation of the features
    X_norm = (X - mu) / sigma
    return X_norm, mu, sigma

and then call it:

X, mean, std = featureNormalize(X) ## We call the function over the features
[Two screenshots: Train Set, Test Set]

This works fine for my train set.

But when I call it on the test set, some columns turn entirely into NaN. Neither set contains any NaN or null values. I tried rewriting the function with nanmean and nanstd, but that didn't help:

def featureNormalize(X):
    X_norm = (X - np.nanmean(X, dtype = 'float32')) / np.nanstd(X, dtype = 'float32')
    mu = np.nanmean(X, dtype = 'float32')
    sigma = np.nanstd(X, dtype = 'float32')
    return X_norm, mu, sigma

What can cause this problem, and what should I do?
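One possible cause, sketched here as a guess because the screenshots are not available: if a test-set column holds the same value in every row, its standard deviation is 0 and the division becomes 0/0, which is NaN. `np.nanmean` and `np.nanstd` only skip NaNs already present in the input, so they would not change this. The helper name `feature_normalize_columns` and the toy data below are hypothetical, and the sketch assumes per-column statistics (`axis=0`).

```python
import numpy as np

def feature_normalize_columns(X):
    """Standardize each column separately (hypothetical illustration helper)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)       # per-column mean
    sigma = X.std(axis=0)     # per-column standard deviation
    # Silence the runtime warning so the 0/0 result can be inspected.
    with np.errstate(divide="ignore", invalid="ignore"):
        X_norm = (X - mu) / sigma
    return X_norm, mu, sigma

# Toy "test set": the second column is constant, so its std is exactly 0.
test = np.array([[1.0, 5.0],
                 [2.0, 5.0],
                 [3.0, 5.0]])

X_norm, mu, sigma = feature_normalize_columns(test)

constant_cols = np.flatnonzero(sigma == 0)               # columns with zero spread
nan_cols = np.flatnonzero(np.isnan(X_norm).all(axis=0))  # columns that became all NaN
print(constant_cols, nan_cols)
```

Checking `sigma == 0` on the real test set would show whether this is what happens there.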

Pawara Siriwardhane
Krutch Dd
  • Hm, it's not a good idea to normalize the train and test sets separately. The test set should be scaled using the train set's mean and std. – Quang Hoang Nov 03 '21 at 16:00
  • So instead of calling the function a second time, I used the variables from the first call, `test_x = (test_x - mean) / std`, but that raises another error: "unsupported operand type(s) for -: 'method' and 'float'" – Krutch Dd Nov 03 '21 at 16:14
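The suggestion in the first comment can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: `fit_scaler`, `apply_scaler`, and the toy arrays are hypothetical names and data, and replacing a zero standard deviation with 1 is an assumption made here to avoid dividing by zero.

```python
import numpy as np

def fit_scaler(X_train):
    """Compute per-column mean and std from the training data only."""
    X_train = np.asarray(X_train, dtype=float)
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    # Assumption: a constant training column is left unscaled (std replaced by 1).
    sigma_safe = np.where(sigma == 0, 1.0, sigma)
    return mu, sigma_safe

def apply_scaler(X, mu, sigma):
    """Scale any data set with statistics that were fitted elsewhere."""
    return (np.asarray(X, dtype=float) - mu) / sigma

train = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0]])
# Every test column is constant, which would give NaN if scaled by its own std.
test = np.array([[2.0, 20.0],
                 [2.0, 20.0]])

mu, sigma = fit_scaler(train)
train_scaled = apply_scaler(train, mu, sigma)
test_scaled = apply_scaler(test, mu, sigma)   # reuses the train statistics
```

As for the error in the second comment: Python reports the left operand's type first, so "'method' and 'float'" means `test_x` itself was a bound method when the subtraction ran. That can happen when the parentheses are left off a call (for example `df.to_numpy` instead of `df.to_numpy()`), though this is only a guess since the line that created `test_x` is not shown.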
