I have an Extension column (for example .exe, .py, .xml, .doc) in my DataFrame. When I run the script from the terminal on a large data set, I get the above error.
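from sys import getsizeof
import os
import numpy as np
from joblib import dump  # assumed: dump() is called with a file path, which joblib.dump accepts
from sklearn.preprocessing import OneHotEncoder

encoder = OneHotEncoder(handle_unknown='ignore')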
encoder.fit(features['Extension'].values.reshape(-1, 1))
temp = encoder.transform(features['Extension'].values.reshape(-1, 1)).toarray()  # the error is raised on this line
print("Size of array in bytes :-", getsizeof(temp))
print("Array :-", temp)
print("Shape :-", features.shape, temp.shape)
features.drop(columns=['Extension'], inplace=True)  # drop once; a second drop would raise KeyError
dump(encoder, os.path.join(os.getcwd(), 'model_dumps', 'encoder.pkl'))
features = featureScaling(features)
features = np.concatenate((features, temp), axis=1)
Output:
1) Size of array in bytes :- 8884558912
2) Array :-
[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[1. 0. 0. ... 0. 0. 0.]
[1. 0. 0. ... 0. 0. 0.]
[1. 0. 0. ... 0. 0. 0.]]
3) Shape :- (323310, 8) (323310, 3435)
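For reference, the reported size follows directly from the shape: 323310 rows × 3435 columns × 8 bytes per float64 value is about 8.9 GB, which matches the getsizeof figure once the small array header is added. The sketch below reproduces that arithmetic and shows one possible way to avoid the dense array, by leaving the encoder output sparse and stacking it with scipy.sparse.hstack. The toy `extensions` and `numeric` arrays are hypothetical stand-ins for features['Extension'] and the scaled features, and whether a sparse result is acceptable depends on the downstream model, so treat this as an assumption rather than a drop-in fix.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import OneHotEncoder

# Dense float64 cost of the reported one-hot shape: rows * columns * 8 bytes.
rows, cols = 323310, 3435
dense_bytes = rows * cols * np.dtype(np.float64).itemsize
print("Dense size in bytes:", dense_bytes)

# Hypothetical toy data standing in for features['Extension'] and the scaled features.
extensions = np.array(['.exe', '.py', '.xml', '.py', '.doc']).reshape(-1, 1)
numeric = np.arange(10, dtype=np.float64).reshape(5, 2)

encoder = OneHotEncoder(handle_unknown='ignore')
encoded = encoder.fit_transform(extensions)  # sparse by default; .toarray() is not called

# Stack the numeric columns and the sparse one-hot columns without densifying.
combined = sparse.hstack([sparse.csr_matrix(numeric), encoded], format='csr')
print("Combined shape:", combined.shape)
```

Only the non-zero entries are stored in the sparse result, so with one active category per row the one-hot part holds one value per row instead of 3435.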