Suppose I have some DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame(
    {
        'a': list('abcde'),
        'b': list('aaabb'),
    }
)
And I want to use a sklearn.compose.ColumnTransformer to transform it:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
transformer = ColumnTransformer(
    [
        ('a', OneHotEncoder(), ['a']),
        ('b', OneHotEncoder(), ['b']),
    ]
)
transformer.fit(df)
I can get the feature names from this transformer like so:
transformer.get_feature_names()  # removed in scikit-learn 1.2; newer releases provide get_feature_names_out()
# ['a__x0_a', 'a__x0_b', 'a__x0_c', 'a__x0_d', 'a__x0_e', 'b__x0_a', 'b__x0_b']
But how can I get a mapping from each original "parent" feature (here, the columns 'a' and 'b') to the "child" features it produces in the transformed output?
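To make the goal concrete, here is a minimal sketch of the kind of mapping I mean. It is not a built-in scikit-learn feature: parent_to_child_indices is a hypothetical helper, and it assumes every step in the ColumnTransformer is a OneHotEncoder with the default drop=None, selected by a list of column names. It relies only on the fitted transformers_ attribute and on each encoder's categories_.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'a': list('abcde'), 'b': list('aaabb')})

transformer = ColumnTransformer(
    [
        ('a', OneHotEncoder(), ['a']),
        ('b', OneHotEncoder(), ['b']),
    ]
)
transformer.fit(df)


def parent_to_child_indices(ct):
    """Map each original column name to the output column positions it produced.

    Hypothetical helper. Assumes every fitted step is a OneHotEncoder
    (default drop=None) whose columns were given as a list of names.
    """
    mapping = {}
    offset = 0  # position of the next output column
    for name, encoder, columns in ct.transformers_:
        # A dropped remainder contributes no output columns.
        if isinstance(encoder, str) and encoder == 'drop':
            continue
        if not isinstance(encoder, OneHotEncoder):
            raise TypeError(f"step {name!r} is not a OneHotEncoder")
        # categories_ holds one array of categories per input column,
        # and each category becomes one output column.
        for column, categories in zip(columns, encoder.categories_):
            width = len(categories)
            mapping[column] = list(range(offset, offset + width))
            offset += width
    return mapping


mapping = parent_to_child_indices(transformer)
print(mapping)
# {'a': [0, 1, 2, 3, 4], 'b': [5, 6]}
```

Indexing the list of feature names from the transformer with these positions would then give the child names for each parent. The sketch would need extending for passthrough columns or for transformers that do not expose categories_.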