
I had to one-hot encode 7 features, and the result was a sparse matrix. My questions are:

  1. Since I cannot see the actual data behind the sparse matrix, I had to scale the numeric columns first, because the column indexes shift after encoding. Is there a way to avoid getting a sparse matrix, so that I can still work with the indexes?
  2. Will an ML model learn from a sparse matrix just fine?
  3. How do I avoid ending up with a sparse matrix when one-hot encoding multiple features? (I checked: if I encode only 2 columns it does not create a sparse matrix, but for 7 it does.)

Below is my code:

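# Standard scaling of the numeric columns (done before encoding,
# because the column positions change afterwards)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
Xtrain[:, [3, 5]] = sc.fit_transform(Xtrain[:, [3, 5]])
Xtest[:, [3, 5]] = sc.transform(Xtest[:, [3, 5]])

# One-hot encoding of the categorical columns
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1, 2, 4, 6, 7])], remainder='passthrough')
Xtrain = ct.fit_transform(Xtrain)
# Only transform the test set; calling fit_transform here would refit
# the encoder on the test data and could yield different columns.
Xtest = ct.transform(Xtest)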
  • Are you looking for this: https://stackoverflow.com/questions/26576524/how-do-i-transform-a-scipy-sparse-matrix-to-a-numpy-matrix ? – Kirtiman Sinha Aug 24 '20 at 20:09
  • Not exactly, sir. What I was wondering is whether a sparse matrix will cause any issues in a machine learning model. Also, is there any way to check that the columns were transformed successfully (since the sparse matrix doesn't show anything)? – Austin Spark Aug 25 '20 at 06:41
  • Can you please provide a minimal, reproducible example of your code? (https://stackoverflow.com/help/minimal-reproducible-example) Your data with "Xtrain" and "Xtest" is undefined, so your current code is not executable. – Kim Tang Aug 27 '20 at 08:44

0 Answers