0

I am training a CNN in Python and have a question. I know that normalization is important for putting the features in my dataframe on a comparable scale. Say I apply z-score normalization to my dataframe vertically, i.e. each feature (column) is scaled using its own mean and standard deviation. After I deploy the model and use it in real-world scenarios, I only have one row of data at a time (with the same number of features). I can no longer compute the normalization from that data, because there is only one value per feature: the standard deviation is 0, and the z-score would require dividing by 0.

Do I still need to normalize the data in real-world use? If I skip it, will the results differ because the model was trained on normalized data?

lalala

1 Answer

2

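If you are using a StandardScaler from scikit-learn, you need to save the fitted scaler object and use it to transform new data after deployment. The scaler stores the per-feature mean and standard deviation learned from the training data, so new inputs are scaled with those training statistics rather than with statistics computed from the new data.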
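A minimal sketch of that workflow, assuming joblib is used for persistence (pickle would work as well). The training array, the single new row, and the file name `scaler.joblib` are made up for illustration:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training data: 5 samples, 3 features.
X_train = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
    [4.0, 40.0, 400.0],
    [5.0, 50.0, 500.0],
])

# Training time: fit the scaler on the training data only.
# It learns one mean and one standard deviation per feature (column).
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Save the fitted scaler alongside the model.
# The path is a placeholder; a temp directory keeps the sketch self-contained.
scaler_path = os.path.join(tempfile.mkdtemp(), "scaler.joblib")
joblib.dump(scaler, scaler_path)

# Deployment time: load the scaler and transform a single new row.
# Call transform(), not fit_transform(), so the training statistics are reused.
loaded_scaler = joblib.load(scaler_path)
new_row = np.array([[3.0, 20.0, 500.0]])  # shape (1, n_features)
new_row_scaled = loaded_scaler.transform(new_row)

print(new_row_scaled)
```

The single row has to be 2D with shape `(1, n_features)`, and the features must be in the same order as during training. The scaled row is then what gets passed to the model's predict call.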

Aneesh Das
  • Oh wow, I didn't know that the scaler can be saved. Does it work even if there is only one row of data? – lalala Sep 04 '21 at 09:34
  • Yes, it works for a single row of data with the same number of features the scaler was fitted on. Check this out: https://stackoverflow.com/questions/53152627/saving-standardscaler-model-for-use-on-new-datasets – Aneesh Das Sep 04 '21 at 09:36
  • Thanks for the reference and the solution! I can't wait to try it out. – lalala Sep 04 '21 at 09:47