I'm currently working on a use case using RandomForestRegressor
. To get training and test data separately based on one column, let's say Home, the dataframe was split into dictionary. Almost done with the modelling, but stuck in getting the feature importance for each of the key in dictionary (number of keys = 21). Please have a look at the codes below:
hp = pd.get_dummies(hp)
hp = {i: g for i, g in hp.set_index(["Home"]).groupby(level = [0])}
feature = {}; feature_train = {}; feature_test = {}
target = {}; target_train = {}; target_test = {}; target_pred = {}
importances = {}
for k, v in hp.items():
target[k] = np.array(v["HP"])
feature[k] = v.drop(["HP", "Corr"], axis = 1)
feature_list = list(feature[1].columns)
for k, v in zip(feature, target):
feature[k] = np.array(feature[v])
for k, v in zip(feature_train, target_train):
feature_train[k], feature_test[k], target_train[k], target_test[k] = train_test_split(
feature[v], target[v], test_size = 0.25, random_state = 42)
What I've tried after a help from Random Forest Feature Importance Chart using Python
for name, importance in zip(feature_list, list(rf.feature_importances_)):
print(name, "=", importance)
but this prints importance for only one of the dictionary (and I don't know which). What I want is to get it printed for all the keys in dictionary "importances". Thanks in advance!