I have credit loan data, but the original df has many loan IDs that can belong to a single customer, so I need to group by client ID in order to build a client profile.
The original df:
contract_id  product_id  client_id  bal   age  gender  pledge_amount  branche_region
RZ13/25      000345      98023432   2300  32   M       4500           west
clients = df.groupby(by=['client_id']).median().reset_index()
This line completely removes important categorical columns such as gender and branch region. It groups by client_id and calculates the median for the NUMERIC columns only; all the categorical columns are gone.
How can I group by unique customer but also keep the categorical columns?
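One direction that might work, as a minimal sketch rather than a definitive answer: pass a per-column mapping to `groupby(...).agg(...)`, taking the median of the numeric columns and the most frequent value of the categorical ones. The sample rows below are made up for illustration (only the first row comes from the data shown above), the helper `most_frequent` is a hypothetical name, and choosing the mode for categoricals is an assumption; `"first"` or `"last"` may suit the data better if a client's gender and region never change across contracts.

```python
import pandas as pd

# Illustrative data: the first row is from the question, the rest is invented.
df = pd.DataFrame({
    "contract_id": ["RZ13/25", "RZ13/26", "RZ13/27", "RZ14/01", "RZ14/02"],
    "product_id": ["000345", "000346", "000345", "000350", "000351"],
    "client_id": [98023432, 98023432, 98023432, 98027777, 98027777],
    "bal": [2300, 1700, 3000, 1000, 500],
    "age": [32, 32, 32, 45, 45],
    "gender": ["M", "M", "M", "F", "F"],
    "pledge_amount": [4500, 4000, 5000, 2000, 1000],
    "branche_region": ["west", "west", "east", "north", "north"],
})

numeric_cols = ["bal", "age", "pledge_amount"]
categorical_cols = ["gender", "branche_region"]


def most_frequent(series):
    """Return the most common value in the group (hypothetical helper).

    Series.mode() can return several values on a tie; the first one
    (in sorted order) is taken here, and None if the group is all NaN.
    """
    modes = series.mode()
    return modes.iloc[0] if not modes.empty else None


# Median for numeric columns, most frequent value for categorical columns.
agg_spec = {col: "median" for col in numeric_cols}
agg_spec.update({col: most_frequent for col in categorical_cols})

clients = df.groupby("client_id").agg(agg_spec).reset_index()
print(clients)
```

Columns left out of the mapping (here `contract_id` and `product_id`) are simply not included in the result, so every column to keep needs an explicit aggregation.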