Hi, I am studying AI in order to build a chatbot. I am currently testing text classification with sklearn, and I get good results with the following code:
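from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

def tuned_nominaldb():
    global Tuned_Pipeline
    # text_process (custom analyzer), cumle_train (sentences) and
    # tur_train (labels) are defined elsewhere in my code.
    pipeline = Pipeline([
        ('tfidf', TfidfVectorizer(analyzer=text_process)),
        ('clf', OneVsRestClassifier(MultinomialNB(
            fit_prior=True, class_prior=None))),
    ])
    # Hyperparameter grid searched with 2-fold cross-validation.
    parameters = {
        'tfidf__max_df': (0.25, 0.5, 0.75),
        'tfidf__ngram_range': [(1, 1), (1, 2), (1, 3)],
        'clf__estimator__alpha': (1e-2, 1e-3)
    }
    Tuned_Pipeline = GridSearchCV(pipeline, parameters, cv=2, n_jobs=2, verbose=10)
    Tuned_Pipeline.fit(cumle_train, tur_train)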
My labels are:
- Bad Language
- Politics
- Religious
- General
When I enter a sentence, I get the correct label most of the time. My problem is that I want to get multiple labels: if a sentence combines bad language and politics, the model predicts only Bad Language. How can I get a multi-label prediction such as Bad Language + Politics?
I tried adding the following code, but I got an error saying that a string was not expected by the fit method:
from sklearn.multioutput import MultiOutputClassifier

multiout = MultiOutputClassifier(Tuned_Pipeline, n_jobs=-1)
multiout.fit(cumle_train, tur_train)
print(multiout.predict(cumle_test))
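For comparison, one possible direction is sketched below under stated assumptions. It assumes each training sentence carries a list of labels rather than a single string, and encodes those lists as a binary indicator matrix with `MultiLabelBinarizer` before fitting. The toy sentences and the names `train_sentences`, `train_labels`, and `mlb` are hypothetical, and the default `TfidfVectorizer` analyzer stands in for `text_process`; this is not the code from the question, only an illustration of the target format.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: every sentence has a LIST of labels (one or more).
train_sentences = [
    "this damn thing is stupid",
    "the parliament votes on the new tax law",
    "the church holds a prayer every sunday",
    "the weather is nice today",
    "these damn politicians ruined the election",
    "the election campaign starts next week",
    "what a stupid idiot",
    "the mosque and the church are open to visitors",
]
train_labels = [
    ["Bad Language"],
    ["Politics"],
    ["Religious"],
    ["General"],
    ["Bad Language", "Politics"],
    ["Politics"],
    ["Bad Language"],
    ["Religious"],
]

# Turn the label lists into a 0/1 matrix of shape (n_samples, n_labels).
mlb = MultiLabelBinarizer()
Y_train = mlb.fit_transform(train_labels)

# OneVsRestClassifier fits one binary classifier per label column.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),  # assumption: default analyzer, not text_process
    ("clf", OneVsRestClassifier(MultinomialNB(alpha=1e-2))),
])
pipeline.fit(train_sentences, Y_train)

# Predict an indicator row, then map it back to label names.
predicted = pipeline.predict(["these damn politicians and their stupid tax law"])
predicted_labels = mlb.inverse_transform(predicted)
print(predicted_labels)
```

`mlb.inverse_transform` maps each predicted indicator row back to a tuple of label names, so a sentence can come back with zero, one, or several labels. With toy data this small, the actual prediction is not meaningful.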
Thanks a lot for your help.