I want to build a new dataframe in which each value of the students
column appears only once, and in which the program column is spread
into separate columns, one per category.
To illustrate the problem, here is my df:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'students': ['Salim', 'Salim', 'khaled', 'Raoues', 'Raoues', 'Rafik'],
    'program': ['MBA', 'MS', 'PHD', 'MS', 'PHD', 'MS'],
    'count': [2, 3, 4, 2, 1, 1],
    'price': [68, 59, 45, 39, 10, 63],
    'Teacher': ['Pr Yaici', 'Pr Yaici', 'Dr Zeggagh', 'Dr Zeggagh', 'Dr Zeggagh', 'Pr Yaici'],
})
df
So my dataframe has the following form:
   students  program  count  price     Teacher
0     Salim      MBA      2     68    Pr Yaici
1     Salim       MS      3     59    Pr Yaici
2    khaled      PHD      4     45  Dr Zeggagh
3    Raoues       MS      2     39  Dr Zeggagh
4    Raoues      PHD      1     10  Dr Zeggagh
5     Rafik       MS      1     63    Pr Yaici
Goal:
The new_df I want to create from the above df is:
   students  programMBA  programMS  programPHD  countMBA  countMS  countPHD  priceMBA  priceMS  pricePHD     Teacher
0     Salim         MBA         MS         NaN       2.0      3.0       NaN      68.0     59.0       NaN    Pr Yaici
1    khaled         NaN        NaN         PHD       NaN      NaN       4.0       NaN      NaN      45.0  Dr Zeggagh
2    Raoues         NaN         MS         PHD       NaN      2.0       1.0       NaN     39.0      10.0  Dr Zeggagh
3     Rafik         NaN         MS         NaN       NaN      1.0       NaN       NaN     63.0       NaN    Pr Yaici
As you can see, each category in the program column gets its own column,
and the count and price columns are split by category in the same way,
while the Teacher column is left unchanged.
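For reference, here is a minimal sketch of the reshaping described above. It is only one possible approach (set_index followed by unstack), not a confirmed solution. It assumes that every student has exactly one Teacher and that each (students, program) pair occurs only once; the helper column name category is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'students': ['Salim', 'Salim', 'khaled', 'Raoues', 'Raoues', 'Rafik'],
    'program': ['MBA', 'MS', 'PHD', 'MS', 'PHD', 'MS'],
    'count': [2, 3, 4, 2, 1, 1],
    'price': [68, 59, 45, 39, 10, 63],
    'Teacher': ['Pr Yaici', 'Pr Yaici', 'Dr Zeggagh', 'Dr Zeggagh', 'Dr Zeggagh', 'Pr Yaici'],
})

# Copy program into a helper column (hypothetical name 'category') so the
# original program values can also be spread out as cell values.
tmp = df.assign(category=df['program'])

# One row per (students, Teacher); one column per (value column, category).
# Assumption: (students, Teacher, category) is unique, otherwise unstack raises.
wide = tmp.set_index(['students', 'Teacher', 'category']).unstack('category')

# Flatten the two-level column labels, e.g. ('count', 'MS') -> 'countMS'.
wide.columns = [f'{col}{cat}' for col, cat in wide.columns]

# unstack sorts the index, so restore the order of first appearance.
order = df['students'].drop_duplicates()
new_df = wide.reset_index().set_index('students').loc[order].reset_index()

# Move Teacher to the end to match the desired layout.
new_df = new_df[[c for c in new_df.columns if c != 'Teacher'] + ['Teacher']]
print(new_df)
```

If a student could appear with more than one Teacher, this sketch would produce several rows for that student, so the uniqueness assumption matters.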
Tried methods:
First I tried some encoding methods, but they don't keep the categorical values as they are. A method like get_dummies
is useful for creating new columns, but it doesn't apply in my case.
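To show what I mean, here is a small sketch of what get_dummies gives on the same df. It creates one indicator column per category, but the cells hold true/false (or 0/1) flags instead of the category itself, and the students values stay duplicated:

```python
import pandas as pd

df = pd.DataFrame({
    'students': ['Salim', 'Salim', 'khaled', 'Raoues', 'Raoues', 'Rafik'],
    'program': ['MBA', 'MS', 'PHD', 'MS', 'PHD', 'MS'],
    'count': [2, 3, 4, 2, 1, 1],
    'price': [68, 59, 45, 39, 10, 63],
    'Teacher': ['Pr Yaici', 'Pr Yaici', 'Dr Zeggagh', 'Dr Zeggagh', 'Dr Zeggagh', 'Pr Yaici'],
})

# Encode only the program column; the other columns are passed through.
dummies = pd.get_dummies(df, columns=['program'])

# The result still has one row per original row, and the new
# program_MBA / program_MS / program_PHD columns are indicators,
# not the original 'MBA' / 'MS' / 'PHD' strings.
print(dummies)
```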
Your suggestions will be helpful.