I have the following pipelines, and I want to get the feature weights with respect to each class. I have three classes ('Fiction','None-fiction','None'). The classifier I use is SVC.
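    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.pipeline import FeatureUnion, Pipeline
    from sklearn.svm import SVC

    # ItemSelector is not part of scikit-learn; it is assumed to be a custom
    # transformer defined elsewhere that picks one field of the input by key.
    # Note: the step named 'tfidf' actually holds a binary CountVectorizer.
    Book_contents = Pipeline([
        ('selector', ItemSelector(key='Book')),
        ('tfidf', CountVectorizer(analyzer='word', binary=True, ngram_range=(1, 1))),
    ])
    Author_description = Pipeline([
        ('selector', ItemSelector(key='Description')),
        ('tfidf', CountVectorizer(analyzer='word', binary=True, ngram_range=(1, 1))),
    ])
    ppl = Pipeline([
        ('feats', FeatureUnion([('Contents', Book_contents),
                                ('Desc', Author_description)])),
        ('clf', SVC(kernel='linear', class_weight='balanced')),
    ])
    model = ppl.fit(training_data, Y_train)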
I tried eli5, but I got an error about a mismatch between the feature names and the classifier.
    f1 = model.named_steps['feats'].transformer_list[0][1].named_steps['tfidf'].get_feature_names()
    f2 = model.named_steps['feats'].transformer_list[1][1].named_steps['tfidf'].get_feature_names()
    list_features = f1
    list_features.append(f2)
    explain_weights.explain_linear_classifier_weights(model.named_steps['clf'],
                                                      vec=None, top=20,
                                                      target_names=ppl.classes_,
                                                      feature_names=list_features)
I got this error:
feature_names has a wrong length: expected=47783, got=10528
How can I rank the feature weights with respect to each class? Is there a way to do that without eli5?
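
One possible reading of the error, offered as a guess rather than a confirmed diagnosis: `list_features.append(f2)` adds the whole second vocabulary as a single nested element, so the list ends up with len(f1) + 1 entries instead of len(f1) + len(f2). That would be consistent with got=10528 against expected=47783. The sketch below uses small made-up vocabularies and made-up weight rows (every name and number in it is hypothetical) to show the difference between `append` and concatenation, and how a per-row ranking can be computed in plain Python, without eli5, once the names and weights line up.

```python
# Hypothetical stand-ins for the two vocabularies returned by get_feature_names().
f1 = ["castle", "dragon", "sword"]
f2 = ["journalist", "professor"]

# append() adds f2 as ONE nested element, so the length is len(f1) + 1.
appended = list(f1)
appended.append(f2)

# Concatenation keeps one name per column, in the same order as the FeatureUnion.
combined = list(f1) + list(f2)


def top_features(weights, names, k=2):
    """Return the k (name, weight) pairs with the largest weights."""
    if len(weights) != len(names):
        raise ValueError(f"expected {len(weights)} names, got {len(names)}")
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return [(names[i], weights[i]) for i in order[:k]]


# Made-up weight rows, one per class, standing in for rows of a coefficient matrix.
weights_by_class = {
    "Fiction": [0.4, 0.9, 0.7, -0.5, -0.2],
    "None-fiction": [-0.3, -0.6, -0.1, 0.8, 0.5],
    "None": [0.0, -0.1, 0.1, 0.2, -0.3],
}

ranking = {label: top_features(row, combined) for label, row in weights_by_class.items()}
for label, pairs in ranking.items():
    print(label, pairs)
```

With the fitted pipeline, the weight rows would presumably come from `model.named_steps['clf'].coef_`, which may be a sparse matrix when the model was fit on sparse input and would then need to be converted to a dense array first. Note that a multiclass `SVC` is trained one-vs-one, so with three classes `coef_` has one row per pair of classes rather than one per class. `LinearSVC` is one-vs-rest and gives one row per class, if that is the ranking that is wanted.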