I am trying to reduce the dimensionality of my features before using a Multinomial NB classifier. The problem is that Multinomial NB does not accept negative values in X_train. One of the suggestions I found online is to use MinMaxScaler to scale the SVD output to the range (0, 1), but I am not sure how feasible that is (see "Dealing with negative values in sklearn MultinomialNB").

How can I use the TruncatedSVD output as input to a Multinomial NB classifier? Thanks!

Edit: Sample code below.

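import pandas as pd
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Build the TF-IDF matrix from the text column
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(df[col])

# After applying Truncated SVD
transformer = TruncatedSVD()
X_red = pd.DataFrame(transformer.fit_transform(X))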

The dataset X_red has negative values, which I cannot use in a Multinomial NB later.
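
A minimal sketch of the MinMaxScaler suggestion mentioned above, assuming a scikit-learn Pipeline is acceptable. The toy corpus, the labels, and n_components=2 are hypothetical placeholders rather than the real data, and clip=True assumes scikit-learn 0.24 or later.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy corpus and labels, standing in for df[col] and the real targets.
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock prices fell sharply today",
    "the market rallied after the report",
]
labels = ["animals", "animals", "finance", "finance"]

pipeline = make_pipeline(
    TfidfVectorizer(),                             # sparse, non-negative TF-IDF matrix
    TruncatedSVD(n_components=2, random_state=0),  # dense output, may contain negatives
    MinMaxScaler(clip=True),                       # rescale each component to [0, 1]
    MultinomialNB(),                               # requires non-negative input at fit time
)
pipeline.fit(docs, labels)

# Features exactly as MultinomialNB receives them (all steps except the classifier).
X_scaled = pipeline[:-1].transform(docs)
print(X_scaled.min(), X_scaled.max())

predictions = pipeline.predict(["my cat chased the dog"])
print(predictions)
```

Keeping the scaler inside the pipeline means it is fitted on the training data only, and clip=True keeps unseen documents within the same range. This only makes the input valid for MultinomialNB; the scaled SVD components are not counts, so whether the model suits them is a separate question.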

1 Answer

First of all, what is your data? The Multinomial NB classifier and TruncatedSVD are both commonly used in Natural Language Processing (NLP). If your data comes from NLP, where do the negative values come from? And what kind of problem are you trying to solve?

You should also apply any data transformation or preprocessing, here MinMaxScaler, before applying dimensionality reduction.

Bill Chen
Thanks for replying, and apologies for the lack of context. You are right, the use case is NLP. I currently have a lot of columns in my bag-of-words (BoW) vector, which I am trying to reduce using TruncatedSVD. The negative values appear after I run TruncatedSVD on the BoW vector. – Vaishak N Chandran Nov 12 '19 at 18:05