I have written code to find the importance of each feature in the entire dataset for multiclass classification. Now I want to find feature importance for each class in multiclass classification, i.e. I want to find the list of features (for each class) that are more important to classify that individual classes.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt
model = DecisionTreeClassifier()
model.fit(x3, y3)
importance = model.feature_importances_
for i,v in enumerate(importance):
print('Feature[%0d]:%s, Score: %.6f' % (i,df.columns[i],v))
plt.subplots(figsize=(15,7))
plt.bar([x for x in range(len(importance))], importance)
plt.xlabel('Feature index')
plt.ylabel('Feature importance score')
plt.xticks(rotation=90)
plt.xticks(np.arange(0,len(df.columns)-2, 2.0))
plt.show()
EDIT (28-04-2022):
I read a paper titled Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization; quoting:
On the evaluate section, we fist extract the 80 traffic features from the dataset and clarify the best short feature set to detect each attack family using RandomForestRegressor algorithm. Afterwards, we examine the performance and accuracy of the selected features with seven common machine learning algorithms.
Can anyone explain how this is done?click for picture from that paper