I've looked through several related posts, but I still couldn't find what I need.
These are the transformations I'm doing:
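from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures
# TargetEncoder is imported separately (its source library is not shown here)

# Categorical columns: target encoding only
cat_transformer = Pipeline(steps=[("encoder", TargetEncoder())])
# Numeric columns: scale to [0, 1], then add pairwise interaction terms
num_transformer = Pipeline(
    steps=[
        ("scaler", MinMaxScaler()),
        ("poly", PolynomialFeatures(2, interaction_only=True)),
    ]
)
transformer = ColumnTransformer(
    transformers=[
        ("cat", cat_transformer, cat_features),
        ("num", num_transformer, num_features),
    ],
    verbose_feature_names_out=False,
)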
logit = LogisticRegression()
model = Pipeline(
    steps=[
        ("preprocessor", transformer),
        ("feature_selection", SelectKBest(k=20)),
        ("logit", logit),
    ]
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
Now I want to get the names of the 20 selected features.
I almost got there after doing:
model["feature_selection"].get_feature_names_out()
However, I got weird names like "x1", "x2", "x15" and so on.
I also tried:
model['preprocessor'].get_feature_names_out()
But that didn't work. Then I tried:
model['feature_selection'].get_support()
And got an array full of booleans (which I assume marks the selected features, but I don't know which feature corresponds to each position). I also tried things like transformer['num'], but that didn't work (since it's a ColumnTransformer and can't be indexed that way).
How can I find out which features were selected for my model?